bcbio / bcbio-nextgen-vm

Run bcbio-nextgen genomic sequencing analyses using isolated containers and virtual machines
MIT License

GRCh37 default build for bcbio: does it include decoys? #169

Open RobertWBaldwin opened 6 years ago

RobertWBaldwin commented 6 years ago

According to the website, it comes from the GATK resource bundle, which comes from 1000GP. It doesn't look to me like it includes the decoy fasta. Please let me know. Thanks!

RobertWBaldwin commented 6 years ago

Yeah, I don't see hs37d5 in the sequence dictionary, so I guess there are no decoys in the default build 37.
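
For anyone who wants to repeat that check, here is a minimal sketch in Python. It assumes the sequence dictionary is a SAM-style header text file with one tab-separated `@SQ` line per contig and the contig name in the `SN:` field. The function names and the sample text are made up for illustration and are not part of bcbio.

```python
def contig_names(dict_text):
    """Return the contig names (SN: fields) found on @SQ header lines."""
    names = []
    for line in dict_text.splitlines():
        fields = line.split("\t")
        if fields[0] != "@SQ":
            continue  # skip @HD and any other header lines
        for field in fields[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
                break
    return names


def has_decoy(dict_text, decoy_name="hs37d5"):
    """True if a contig with the given decoy name is listed in the dictionary."""
    return decoy_name in contig_names(dict_text)


# Illustrative fragment only, not the real GRCh37 dictionary.
SAMPLE = (
    "@HD\tVN:1.0\tSO:unsorted\n"
    "@SQ\tSN:1\tLN:249250621\n"
    "@SQ\tSN:MT\tLN:16569\n"
)

if __name__ == "__main__":
    print(has_decoy(SAMPLE))
```

To run it against a real build, read the `.dict` file that sits next to the reference fasta and pass its contents to `has_decoy`. Where that file lives depends on your installation.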

RobertWBaldwin commented 6 years ago

Does anyone know how to use the build that includes the decoy? Do we need to download and pass the build to bcbio in the config file, or is it already there and we can just tell bcbio to use it? What's the best way to do this? Thanks - Robert

OK, I found the "adding custom genomes" section here: https://bcbio-nextgen.readthedocs.io/en/latest/contents/configuration.html#adding-custom-genomes

Never mind, I guess, unless there's something else.

chapmanb commented 6 years ago

Robert; Thanks for the question. This is all exactly right. GRCh37 and hg19 don't have the decoy fasta. If you need the decoys, adding a custom genome is the right way to go.

Practically, validation results were mixed on whether decoys help or hurt in build 37, so in the end we decided not to add them and are instead encouraging a move to build 38. It provides decoys and alt contigs and is an unambiguous improvement over build 37. Are you in a position to move to hg38 instead of adding a custom build 37 genome?

Hope this helps.