hartwigmedical / hmftools

Various algorithms for analysing genomics data
GNU General Public License v3.0

Advice needed in regards to the `gridss_blacklist` within `hmf_dna_pipeline_resources.38_v5.34.tar.gz` #573

Closed Teezi closed 2 days ago

Teezi commented 2 days ago

Hi,

GRIDSS recommends that I use the ENCFF356LFX.bed file for hg38 (UCSC chr notation) as the blacklist (exclude_list).

Upon opening hmf_dna_pipeline_resources.38_v5.34.tar.gz (downloaded from HMFTools-Resources in the Hartwig Medical Foundation's GCP bucket), I noticed two other blacklists: gridss_blacklist.38.bed.gz and sv_prep_blacklist.38.bed. Should I be using one of these instead of ENCFF356LFX.bed, and if so, which of the two?

charlesshale commented 2 days ago

For GRIDSS, use the gridss_blacklist BED file. If you're running SvPrep first (as we recommend), use the sv_prep_blacklist BED file for that step.

charlesshale commented 7 hours ago

I recall they are the same file, but if not, use the gridss_blacklist one.
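
Whether the two bundled blacklists really are the same can be checked locally. The following is a minimal sketch in Python, assuming both files are standard tab- or space-delimited BED files whose first three columns are chromosome, start, and end. The helper names `read_bed_intervals` and `compare_beds` are hypothetical and are not part of hmftools or GRIDSS.

```python
import gzip


def read_bed_intervals(path):
    """Return the set of (chrom, start, end) tuples in a BED file (plain or .gz)."""
    # Assumption: a ".gz" suffix means the file is gzip-compressed.
    opener = gzip.open if str(path).endswith(".gz") else open
    intervals = set()
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines and common BED header/comment lines.
            if not line or line.startswith(("#", "track", "browser")):
                continue
            chrom, start, end = line.split()[:3]
            intervals.add((chrom, int(start), int(end)))
    return intervals


def compare_beds(path_a, path_b):
    """Compare two BED files by their intervals, ignoring any extra columns."""
    a = read_bed_intervals(path_a)
    b = read_bed_intervals(path_b)
    return {"only_in_a": a - b, "only_in_b": b - a, "shared": a & b}
```

Calling `compare_beds("gridss_blacklist.38.bed.gz", "sv_prep_blacklist.38.bed")` from the directory where the archive was unpacked returns the intervals unique to each file; if both `only_in_a` and `only_in_b` are empty, the two files list the same regions. The comparison looks only at exact interval coordinates, so overlapping but differently split regions would be reported as differences.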