mskilab-org / fragCounter

GC and mappability corrected fragment coverage for paired end whole genome sequencing
MIT License

Do you have hg38 versions of the mappability and gc rds files? #1

Closed: pwaltman closed this issue 3 years ago

pwaltman commented 5 years ago

If I understand correctly, it appears that you are using a mappability index from the ENCODE project (I could be wrong; it's been a while since I worked on this), but ENCODE only has one for hg19/GRCh37, if memory serves.

For that reason, I followed the instructions at this link (https://wiki.bits.vib.be/index.php/Create_a_mappability_track) to create my own mappability track. I also used the instructions on the same site (https://wiki.bits.vib.be/index.php/Create_a_GC_content_track) to create my own GC content rds file. (I'm attaching the source of the script I used to generate my versions.)

However, I'm getting mixed results with JaBbA when I use these (too many recurrent CNV deletions to be believable, e.g. 4 out of 4 samples having 2 or more same/similar deletions in multiple locations per chromosome). Since I'm not sure what might cause this, I'm trying to eliminate potential sources of systematic error, and wanted to see if you had hg38 versions of the gc and mappability rds files that fragCounter requires.

The script is mostly complete, but the hardest part was finding the right version of the GEM utilities. While there's a newer version available on GitHub, my script (and the page that it was based on) used the version available from: https://sourceforge.net/projects/gemlibrary/files/gem-library/Binary%20pre-release%203/.

gen_gc_mappability_bw.txt
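
For readers who want to see what a GC content track of this kind contains, here is a minimal sketch in Python. It is not the attached script and not fragCounter's own method; the function name `gc_bedgraph`, the toy sequence, and the choice to exclude `N` bases from the denominator are all assumptions made for illustration. The 200 bp default mirrors the bin width in the hg38 file name posted later in this thread.

```python
from typing import Iterator, Tuple


def gc_bedgraph(
    chrom: str, seq: str, bin_width: int = 200
) -> Iterator[Tuple[str, int, int, float]]:
    """Yield (chrom, start, end, gc_fraction) for consecutive fixed-width bins.

    Coordinates are 0-based and half-open, as in bedGraph.
    Assumption: N (and any other non-ACGT) bases are left out of the
    denominator, and bins with no A/C/G/T at all are skipped.
    """
    seq = seq.upper()
    for start in range(0, len(seq), bin_width):
        window = seq[start:start + bin_width]
        gc = window.count("G") + window.count("C")
        at = window.count("A") + window.count("T")
        if gc + at == 0:
            continue  # nothing callable in this bin (e.g. an all-N gap)
        yield chrom, start, start + len(window), gc / (gc + at)


if __name__ == "__main__":
    # Hypothetical toy sequence; a real run would stream each chromosome
    # of the hg38 FASTA instead.
    toy = "GGCC" + "ATAT" + "NNNN" + "GCAT"
    for chrom, start, end, score in gc_bedgraph("chrToy", toy, bin_width=4):
        print(f"{chrom}\t{start}\t{end}\t{score:.3f}")
```

Whether gap bins should be skipped or reported with a missing value depends on what the downstream tool expects, so treat that choice as a placeholder.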

jinghu23 commented 3 years ago

Hi, I followed your script and obtained the final bw file. Are these two files the rds files needed for fragCounter? Thanks.

pwaltman commented 3 years ago

Honestly, it's been over 18 months since I worked on this. Reviewing my code, it appears that it just generates the .bedgraph file, which you will need to convert to the rds file. Good luck.
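
As a rough illustration of the intermediate step mentioned above, the sketch below aggregates bedGraph intervals into fixed-width bins using a length-weighted mean score. It is written in Python under stated assumptions: the function name `bin_bedgraph`, the 200 bp default, and averaging over covered bases only are illustrative choices, not something confirmed by the fragCounter authors. The resulting table would still have to be loaded into R and written out with `saveRDS` in whatever object layout fragCounter expects, which this sketch does not attempt.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def bin_bedgraph(
    lines: Iterable[str], bin_width: int = 200
) -> List[Tuple[str, int, int, float]]:
    """Return (chrom, bin_start, bin_end, mean_score) per fixed-width bin.

    The mean is weighted by how many bases of each bedGraph interval fall
    inside the bin. Assumption: bases not covered by any interval are
    ignored rather than counted as zero.
    """
    weighted: Dict[Tuple[str, int], float] = defaultdict(float)
    covered: Dict[Tuple[str, int], int] = defaultdict(int)
    for line in lines:
        fields = line.split()
        # Skip blank lines, comments, and track/browser header lines.
        if (
            len(fields) < 4
            or fields[0] in ("track", "browser")
            or line.startswith("#")
        ):
            continue
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        score = float(fields[3])
        pos = start
        while pos < end:
            bin_start = (pos // bin_width) * bin_width
            seg_end = min(end, bin_start + bin_width)
            weighted[(chrom, bin_start)] += score * (seg_end - pos)
            covered[(chrom, bin_start)] += seg_end - pos
            pos = seg_end
    # Note: chromosomes sort lexicographically here (chr10 before chr2).
    return [
        (chrom, b, b + bin_width, weighted[(chrom, b)] / covered[(chrom, b)])
        for chrom, b in sorted(covered)
    ]
```

For example, `bin_bedgraph(open("mappability.bedgraph"))` (a hypothetical file name) would return one row per 200 bp bin that has any coverage.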

ShaiberAlon commented 3 years ago

Hi,

I know it's been a while, but I wanted to leave a note for anyone who might look for this going forward: we now have GC and mappability files available here: https://github.com/ShaiberAlon/fragCounter/raw/master/inst/extdata/gcmap.hg38.200bp.tar.gz

soymintc commented 1 year ago

Thanks @ShaiberAlon really appreciate it!