statgen / locuszoom-standalone

Create regional association plots from GWAS or meta-analysis
http://locuszoom.org/

Need recombination rates for GRCh38 #1

Open welchr opened 7 years ago

welchr commented 7 years ago

So far I've been unable to find recombination rates for GRCh38. We may need to lift over the previous GRCh37 rates. The GRCh37 rates currently in LocusZoom were themselves generated by lifting over from build 35:

This folder contains a build GRCh37 genetic map.

The map was generated by lifting the HapMap Phase II genetic map from build 35 to GRCh37. The original map was generated using LDhat as described in the 2007 HapMap paper (Nature, 18th Sept 2007). The conversion from b35 to GRCh37 was achieved using the UCSC liftOver tool.

Once the liftOver was completed, the map was inspected for regions in which the genome assembly had been rearranged. Such rearrangements result in a dip in the lifted genetic map (i.e. a negative recombination rate). These regions were removed by setting the recombination rate to zero within, and for 50 SNPs either side of, such regions. These regions are fairly rare, and this method removed a total of 2013 SNPs from the following chromosomes:

chr1: 224 SNPs
chr2: 209 SNPs
chr3: 104 SNPs
chr4: 103 SNPs
chr7: 547 SNPs
chr9: 292 SNPs
chr10: 101 SNPs
chr13: 208 SNPs
chr15: 101 SNPs
chrX: 124 SNPs

All other chromosomes did not have any SNPs removed by this method.

Adam Auton 08/12/2010
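The clean-up step described in the note above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the script that was used to build the map: the function name `zero_rearranged_regions` and the `flank` parameter are hypothetical, and the map is assumed to be available as a simple list of per-SNP rates.

```python
def zero_rearranged_regions(rates, flank=50):
    """Return a copy of `rates` in which every negative entry, plus `flank`
    entries on either side of it, is set to zero.

    Hypothetical helper: `rates` is assumed to be a list of per-SNP
    recombination rates obtained after the liftover.
    """
    cleaned = list(rates)
    n = len(rates)
    for i, rate in enumerate(rates):
        if rate < 0:
            lo = max(0, i - flank)
            hi = min(n, i + flank + 1)
            for j in range(lo, hi):
                cleaned[j] = 0.0
    return cleaned


# Tiny example with a flank of 1 instead of 50.
example = [1.0, 2.0, -0.5, 3.0, 4.0, 5.0]
print(zero_rearranged_regions(example, flank=1))
```

Decisions are made against the original values rather than the partly cleaned copy, so neighbouring dips cannot mask each other.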

pjvandehaar commented 6 years ago

UCSC Track Search has some deCODE-based recombination rates, built in 2010 for NCBI36/hg18, that were apparently lifted to GRCh37. Nothing on GRCh38. (paper, related webpage)

welchr commented 6 years ago

Beagle appears to have created an hg38 recombination map:

http://bochet.gcc.biostat.washington.edu/beagle/genetic_maps/

Unfortunately, I can't find any documentation on how this was produced (is it a liftover of the previous maps?). We would likely need to ask the Beagle authors about this before including it.

andhamel commented 5 years ago

After downloading the .map files for GRCh38, I can confirm that the README.txt file does indeed say that the files were lifted over from GRCh37.

andhamel commented 5 years ago

However, the files from Beagle do not provide the recombination rate, only the cm_pos, chr, and position. Has anyone else found anything?

janetw commented 5 years ago

I too am looking for a genetic map from GRCh38.

abought commented 5 years ago

Although the python version of LocusZoom does not support build 38, we have been working on adding this to our browser-based version (makes plots from your own tabixed GWAS data).

If your goal is just to explore your own data, perhaps this would be of use? Feedback is welcome. Here's a link to the beta version: https://abought.github.io/locuszoom-tabix/

naumanjaved commented 5 years ago

However, the files from Beagle do not provide the recombination rate, only the cm_pos, chr, and position. Has anyone else found anything?

You can take the difference in cM and Mb between adjacent rows to get the rate, and then reformat the file into shapeit format.
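A minimal sketch of that calculation, assuming the positions (bp) and cumulative cM values for one chromosome have already been read into two parallel lists. The function name `rates_from_cm` is hypothetical, and assigning each rate to the first marker of its interval (with 0 for the last marker) is an assumption, not something the Beagle files specify.

```python
def rates_from_cm(positions, cm_positions):
    """Compute a recombination rate in cM/Mb for each marker.

    The rate at marker i covers the interval from marker i to marker i + 1.
    The last marker has no following interval and gets 0.0.
    """
    rates = []
    for i in range(len(positions) - 1):
        delta_mb = (positions[i + 1] - positions[i]) / 1e6
        delta_cm = cm_positions[i + 1] - cm_positions[i]
        # Guard against duplicate or unsorted positions.
        rates.append(delta_cm / delta_mb if delta_mb > 0 else 0.0)
    if positions:
        rates.append(0.0)
    return rates


# Values taken from the first chr10 rows quoted later in this thread.
positions = [48232, 48486, 50009]
cm_positions = [0.002664, 0.002705, 0.002947]
print(rates_from_cm(positions, cm_positions))
```

With this convention the first two results come out close to the 0.1614 and 0.1589 listed for those rows in the merged hg38 file quoted later in the thread.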

welchr commented 4 years ago

These are the recombination rates we are using for hg38:

http://csg.sph.umich.edu/locuszoom/download/recomb-hg38.tar.gz

You will need to update your database file; see README.md for information on how to do that.

We may publish a bugfix release at some point, in which case these rates will be included by default.

mengyuankan commented 4 years ago

Hey, thanks for providing the link. I downloaded the archive and found the file genetic_map_GRCh38_merged.tab, which contains a recombination rate column (recomb_rate). I noticed 9 positions with a recombination rate >100:

chrom  pos        recomb_rate  pos_cm
chr11  7695615    156.9        16.155251
chr1   13376798   1058         27.590843
chr2   3026265    126.9        5.086911
chrX   13445766   107.8        20.97507
chrX   13446336   107.8        21.03653
chrX   116345108  105.4        117.142462
chrX   116346064  105.3        117.243244
chrX   146506432  105.9        166.209401
chrX   146507816  102.8        166.355973

I haven't looked into the code used to generate this column, but I was wondering whether you are aware of this problem. Thanks!
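For anyone who wants to repeat this check, here is a rough sketch. The helper name `find_high_rates` and the default threshold are hypothetical. It assumes a whitespace-delimited file whose header contains a `recomb_rate` column, as in the rows listed above.

```python
def find_high_rates(lines, threshold=100.0):
    """Return the rows (as lists of strings) whose recomb_rate exceeds
    `threshold`. `lines` is any iterable of text lines, header first."""
    rows = iter(lines)
    header = next(rows).split()
    rate_col = header.index("recomb_rate")
    hits = []
    for line in rows:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if float(fields[rate_col]) > threshold:
            hits.append(fields)
    return hits


# Small in-memory sample; a real run would pass an open file handle instead.
sample = [
    "chrom\tpos\trecomb_rate\tpos_cm",
    "chr10\t48232\t0.1614\t0.002664",
    "chr1\t13376798\t1058\t27.590843",
]
for row in find_high_rates(sample):
    print("\t".join(row))
```

To scan the real file, something like `with open("recomb-hg38/genetic_map_GRCh38_merged.tab") as handle: find_high_rates(handle)` should work, provided the header matches.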

hyanwong commented 3 years ago

There seem to be two problems with the genetic_map_GRCh38_merged.tab file from the recomb-hg38.tar.gz link above, at lines 3388137-8 and 3389520-1 (the cM positions jump back to 0). I wonder if some of these are actually values for chrY?


% grep '155699751' recomb-hg38/genetic_map_GRCh38_merged.tab  -A 2 -B 2 -n 
3388135-chrX    155687184   0.001304    180.837739
3388136-chrX    155697920   0.001092    180.837753
3388137:chrX    155699751   0   180.837755
3388138-chrX    233451  1.135   0.0
3388139-chrX    238008  1.141   0.00517

% grep '2777299' recomb-hg38/genetic_map_GRCh38_merged.tab  -A 2 -B 2 -n
3389518-chrX    2774556 7.335   20.820702
3389519-chrX    2776199 4.58    20.832754
3389520:chrX    2777299 0   20.837792
3389521-chrX    155739376   0.674   0.0
3389522-chrX    155739965   0.6585  0.000397

welchr commented 3 years ago

Looks like where they switch back to 0 are the starting positions of the two PAR regions on chrX. In the original files, they are plink.chrX_par1.GRCh38.map and plink.chrX_par2.GRCh38.map.

hyanwong commented 3 years ago

Ah! That would explain it. Is there any sensible solution? Should these be listed as separate "chromosomes" somehow? How is it done in the original (non-lifted over) version?

welchr commented 3 years ago

Unfortunately I think LocusZoom requires chrX or chr23, it probably wouldn't work with differently named versions of chrX (like chrX-PAR1, etc.)

If you're not using the file with LocusZoom, though, you could split the chrX positions back out into chrX, PAR1, PAR2 files like was done originally. Or modify the script and write each chromosome as a separate file. The ZIP file plink.GRCh38.map.zip inside the tarball has the original files from the Beagle group.
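One way to split the merged chrX rows back into separate blocks is to start a new block wherever the cumulative cM position resets. The sketch below assumes rows have already been parsed into `(chrom, pos, rate, cm)` tuples in file order. `split_on_cm_reset` is a hypothetical helper, not part of LocusZoom or Beagle.

```python
def split_on_cm_reset(rows):
    """Group rows into segments, starting a new segment whenever the
    chromosome changes or the cumulative cM position decreases."""
    segments = []
    prev_chrom, prev_cm = None, None
    for chrom, pos, rate, cm in rows:
        if chrom != prev_chrom or cm < prev_cm:
            segments.append([])
        segments[-1].append((chrom, pos, rate, cm))
        prev_chrom, prev_cm = chrom, cm
    return segments


# Rows taken from the grep output earlier in this thread.
rows = [
    ("chrX", 155697920, 0.001092, 180.837753),
    ("chrX", 155699751, 0.0, 180.837755),
    ("chrX", 233451, 1.135, 0.0),
    ("chrX", 238008, 1.141, 0.00517),
    ("chrX", 2777299, 0.0, 20.837792),
    ("chrX", 155739376, 0.674, 0.0),
    ("chrX", 155739965, 0.6585, 0.000397),
]
for segment in split_on_cm_reset(rows):
    print(segment[0][1], "to", segment[-1][1], "rows:", len(segment))
```

Which segment corresponds to which original file (chrX, PAR1, PAR2) would still need to be confirmed against the files in plink.GRCh38.map.zip.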

raonyguimaraes commented 3 years ago

Hey @hyanwong and everyone, I'm also looking for recombination rates for chrY. If anyone here could point me towards how I could generate them, I would really appreciate it.

Regards,

MarchOnion commented 2 years ago

I downloaded the data and loaded the file into the database with the command below:

python2 bin/dbmeister.py --db data/database/locuszoom_hg38.db --recomb_rate genetic_map_GRCh38_merged.tab

Based on the resulting error message, I changed the header to:

chr pos recomb cm_pos

That appeared to work. The log messages were:

Creating table recomb_rate..
Loading recomb-hg38/genetic_map_GRCh38_merged.tab into table recomb_rate..
Creating index for table recomb_rate on columns chr,pos..

But when I try to generate a plot under hg38, there's still no recombination rate, even with "showRecomb=TRUE"

Does anyone have any suggestions?

Thanks, Congcong

welchr commented 2 years ago

You could check and see if anything actually got inserted into the table:

echo 'select * from recomb_rate limit 10' | sqlite3 data/database/locuszoom_hg38.db

MarchOnion commented 2 years ago

Thanks for your reply. I ran a quick check:

sqlite> .schema recomb_rate
CREATE TABLE recomb_rate ( chr INTEGER, pos INTEGER, recomb FLOAT, cm_pos FLOAT );
CREATE INDEX ind_recomb_rate_chrpos ON recomb_rate (chr,pos);

sqlite> SELECT * FROM recomb_rate limit 10;
chr10|48232|0.1614|0.002664
chr10|48486|0.1589|0.002705
chr10|50009|0.159|0.002947
chr10|52147|0.1574|0.003287
chr10|52541|0.1592|0.003349
chr10|64718|0.1611|0.005287
chr10|66015|0.1631|0.005496
chr10|67284|0.1648|0.005703
chr10|67994|0.1658|0.00582
chr10|68368|0.1677|0.005882
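One plausible reading of this output is that the chr column is declared INTEGER while the stored values are text such as "chr10". The sketch below reproduces that situation in an in-memory SQLite database. Whether LocusZoom actually looks up chromosomes by integer value is an assumption here, not something confirmed in this thread.

```python
import sqlite3

# In-memory database with the same table definition as in the .schema output.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recomb_rate (chr INTEGER, pos INTEGER, recomb FLOAT, cm_pos FLOAT)"
)
# A value with the "chr" prefix cannot be converted and is stored as text.
conn.execute("INSERT INTO recomb_rate VALUES ('chr10', 48232, 0.1614, 0.002664)")
# A purely numeric string is converted to an integer by the column affinity.
conn.execute("INSERT INTO recomb_rate VALUES ('10', 48486, 0.1589, 0.002705)")

storage = conn.execute(
    "SELECT chr, typeof(chr) FROM recomb_rate ORDER BY pos"
).fetchall()
matches = conn.execute("SELECT pos FROM recomb_rate WHERE chr = 10").fetchall()

print(storage)  # storage class of each chr value
print(matches)  # rows found by an integer lookup on chr
conn.close()
```

Running `SELECT typeof(chr), count(*) FROM recomb_rate GROUP BY 1` against the real database would show whether the loaded values ended up as text or integers.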

MarchOnion commented 2 years ago

I removed the "chr" prefix from the chromosome column and it works!! Thanks a lot!
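For reference, stripping the prefix could look roughly like the sketch below. It assumes a tab-delimited file with a header line. `strip_chr_prefix` is a hypothetical helper. How chrX should be encoded for the database is not settled in this thread, so non-numeric names such as "X" are simply left without the prefix.

```python
def strip_chr_prefix(lines):
    """Yield lines with a leading "chr" removed from the first column.

    The first line is treated as the header and passed through unchanged
    (its first field may itself be the column name "chr").
    """
    for i, line in enumerate(lines):
        if i == 0:
            yield line
            continue
        chrom, sep, rest = line.partition("\t")
        if chrom.startswith("chr"):
            chrom = chrom[3:]
        yield chrom + sep + rest


sample = [
    "chr\tpos\trecomb\tcm_pos\n",
    "chr10\t48232\t0.1614\t0.002664\n",
]
for out_line in strip_chr_prefix(sample):
    print(out_line, end="")
```

The cleaned lines can be written to a new file, which is then passed to dbmeister.py with --recomb_rate as before.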

MarchOnion commented 1 year ago

Hi,

I used a command like the one below:

python2 bin/dbmeister.py --db data/database/locuszoom_hg38.db --recomb_rate recomb-hg38/genetic_map_GRCh38_merged.tab

On Sun, Sep 25, 2022 at 4:57 AM uki-uiu @.***> wrote:

How did you replace the recombination rates table in the database after removing "chr" from the genetic_map_GRCh38_merged.tab file?


koujiaodahan commented 2 weeks ago

Thanks for sharing!