dariushghasemi / kidneyInCHRIS

Genetic determinants of Kidney function in the CHRIS study

Regional association plots #10

Closed dariushghasemi closed 1 year ago

dariushghasemi commented 1 year ago

To draw a regional association plot for the lead variant of each replicated locus, we need to update the LocusZoom SQLite database so that the plot shows the recombination rate and its hotspots for SNPs whose positions are in build 38 (GRCh38). If the SNP positions are in build 37 or earlier, this additional step can be skipped.

  1. Download the lifted map files (from build 37 to build 38) containing the recombination rates: http://csg.sph.umich.edu/locuszoom/download/recomb-hg38.tar.gz

  2. Prepare the merged recombination rate file to be added to the LZ SQLite database for build 38.

    # Change the delimiter to a tab ("\t")
    # Drop "chr" from the beginning of each chromosome name
    # Change the column headers for consistency: 'chrom' to 'chr'; 'recomb_rate' to 'recomb'; 'pos_cm' to 'cm_pos'
    # Sort the data rows by chromosome number, keeping the header on top
    awk 'BEGIN {OFS="\t"} {$1=$1; print}' genetic_map_GRCh38_merged.tab.downloaded |   \
     sed '2,$ s/^chr//' |   \
     sed -e '1 s/chrom/chr/' -e '1 s/recomb_rate/recomb/' -e '1 s/pos_cm/cm_pos/' |   \
     { IFS= read -r header; printf '%s\n' "$header"; sort -s -k1,1g; } > genetic_map_GRCh38_merged.tab

  3. Then run the Python script to create the table and add it to the SQLite database:

    #!/usr/bin/env sh

    # LocusZoom directory on Eurac server
    app=/usr/local/stow/locuszoom-1.4

    # Downloaded lifted-over merged map file
    map=~/projects/gwas/pairways_LD/recomb-hg38

    python2 "${app}/bin/dbmeister.py" \
      --db "${app}/data/database/locuszoom_hg38.db" \
      --recomb_rate "${map}/genetic_map_GRCh38_merged.tab"


  4. Check whether anything actually got inserted into the table:

```bash
echo 'SELECT * FROM recomb_rate LIMIT 10;' | sqlite3 data/database/locuszoom_hg38.db

# Or simply check the table schema
sqlite3 data/database/locuszoom_hg38.db '.schema recomb_rate'

# The same checks from inside the interactive sqlite3 shell:
#   sqlite> .schema recomb_rate
#   sqlite> SELECT * FROM recomb_rate LIMIT 10;
```

  5. Once the recomb_rate table has been added to the database, the regional association plots can be drawn with the recombination rate track.

Dariush | 06-Feb-23

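Before loading the file in step 3, it can help to confirm that step 2 left it in the expected shape. The following is only a sketch: `check_map_file` is a hypothetical helper, not part of LocusZoom, and it checks just the three things step 2 is meant to guarantee (tab-delimited columns, a header whose first column is `chr`, and no leftover `chr` prefix in the data rows).

```shell
# Hypothetical helper (not part of LocusZoom): sanity-check the prepared map file.
# Prints one message per problem found and returns non-zero if there is any.
check_map_file() {
    awk -F '\t' '
        NR == 1 {
            ncol = NF
            if (NF < 2)      { print "header is not tab-delimited"; bad = 1 }
            if ($1 != "chr") { print "header: first column is not chr"; bad = 1 }
            next
        }
        NF != ncol  { print "line " NR ": expected " ncol " fields, got " NF; bad = 1 }
        $1 ~ /^chr/ { print "line " NR ": chromosome still has the chr prefix"; bad = 1 }
        END { exit bad + 0 }
    ' "$1"
}
```

For example, `check_map_file genetic_map_GRCh38_merged.tab && echo "map file looks OK"` would print the confirmation only when none of the checks fail.
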
dariushghasemi commented 1 year ago

In addition to the check in step 4, it is worth inspecting the output log of the dbmeister.py run to confirm that the recomb_rate table was created and added to the LZ SQLite database. The log should look like this:

    Loading recombination rates table from file ~/recomb-hg38/genetic_map_GRCh38_merged.tab.headered..
    Dropping table recomb_rate from database /usr/local/stow/locuszoom-1.4/data/database/locuszoom_hg38.db..
    Creating table recomb_rate..
    Loading ~/recomb-hg38/genetic_map_GRCh38_merged.tab.headered into table recomb_rate..
    Creating index for table recomb_rate on columns chr,pos..
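
If the output of the dbmeister.py run was redirected to a file, the log check can be scripted. This is a minimal sketch under that assumption: `check_recomb_log` is a hypothetical helper, and the phrases it searches for are copied from the log quoted above, so a different LocusZoom version may word them differently.

```shell
# Hypothetical helper: confirm that a saved dbmeister.py log contains the
# key messages quoted above. Returns non-zero at the first missing message.
check_recomb_log() {
    log_file=$1
    for msg in "Loading recombination rates table from file" \
               "Creating table recomb_rate" \
               "Creating index for table recomb_rate on columns chr,pos"; do
        # -F: match the phrase literally; -q: no output, exit status only
        if ! grep -qF "$msg" "$log_file"; then
            echo "missing in log: $msg" >&2
            return 1
        fi
    done
    echo "recomb_rate log looks complete"
}
```

It would be called as `check_recomb_log dbmeister.log`, where `dbmeister.log` is a placeholder for wherever the log was saved.
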

dariushghasemi commented 1 year ago

The LocusZoom plots created for the replicated loci were converted to .png files in the shell. For instance, this command converts the LZ plot for the top SNP at the PDILT locus from pdf to png:

    pdftoppm -r 300 -f 1 -png ~/projects/gwas/LZ_plots/07-Feb-23_PDILT_230207_16_20381010.pdf 09-Mar-23_PDILT
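
To convert all plots in one go, the same command can be wrapped in a loop. The sketch below is an illustration rather than the procedure actually used here: `convert_lz_plots` is a hypothetical helper, and reusing each PDF's base name as the output root is an assumption (the example above renames the output by hand).

```shell
# Hypothetical helper: convert every PDF in a directory to PNG with the same
# pdftoppm options as the single-file example above.
convert_lz_plots() {
    in_dir=$1
    out_dir=$2
    mkdir -p "$out_dir"
    for pdf in "$in_dir"/*.pdf; do
        [ -e "$pdf" ] || continue        # the glob matched nothing
        root=$(basename "$pdf" .pdf)     # assumption: keep the PDF base name
        # -r 300: 300 dpi; -f 1: start at page 1; -png: write PNG output
        pdftoppm -r 300 -f 1 -png "$pdf" "$out_dir/$root"
    done
}
```

For example, `convert_lz_plots ~/projects/gwas/LZ_plots ~/projects/gwas/LZ_plots/png` would write the PNG files into a `png` subdirectory, which is itself a placeholder path.
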