chapmanb / bcbio.variation

Toolkit to analyze genomic variation data, built on the GATK with Clojure

bcbio variation files #10

Closed davidhoover closed 11 years ago

davidhoover commented 11 years ago

I ran the first example pipeline, and it failed because I hadn't migrated the variation database into my Galaxy installation. I can do that, but it brings up a problem. The variation files (dbsnp_137.vcf, hapmap_3.3.vcf, etc.) are for GRCh37. I can't find anything in bcbio_system.yaml that ties a set of variation files to an individual genome/build. They are all lumped under one key, bcbio_variation.

Does bcbio have the capacity to select variation files based on genome/build? If so, how should I configure the bcbio_system.yaml file?

chapmanb commented 11 years ago

David; There isn't yet support for other organisms and builds. I plan to look at this soon-ish, since GRCh38 is due out shortly and we also want to support GRCm38 mouse calling here. What species are you interested in calling variants for?

davidhoover commented 11 years ago

hg18, hg19, mm8, mm9, mm10. Well, whatever our users ask for. Most likely hg19 and mm10. There should be a way of directing where to look, rather than hard-wiring a single file.

chapmanb commented 11 years ago

David; Thanks again for these ideas. The latest version of the development code implements this with genome-specific resource files. You now specify dbSNP and associated genome files per organism and build, instead of globally, which enables support for multiple organisms and builds:

https://bcbio-nextgen.readthedocs.org/en/latest/contents/configuration.html#genome-configuration-files

We'll work on adding in automated installation for non-human resources to enable variant calling there as well, but the architecture is now in place. Thanks again.
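
For readers following along, here is a minimal Python sketch of the idea discussed in this thread: looking up variation files per genome build rather than from one global key. It is not bcbio's actual code or configuration schema. The `GENOME_RESOURCES` table, the `variation_file` helper, the directory layout, and the mm10 file name are all hypothetical and chosen only for illustration. The GRCh37 file names are the ones mentioned in the original question.

```python
from pathlib import Path

# Hypothetical per-build resource table. The keys, layout, and the mm10
# file name are illustrative assumptions, not bcbio's real schema.
GENOME_RESOURCES = {
    "GRCh37": {
        "dbsnp": "variation/dbsnp_137.vcf",
        "hapmap": "variation/hapmap_3.3.vcf",
    },
    "mm10": {
        "dbsnp": "variation/mm10-dbsnp.vcf",  # made-up name for the example
    },
}


def variation_file(genome_dir, build, resource):
    """Return the path to one variation resource for one genome build.

    Raises ValueError if the build or the resource is not configured,
    instead of silently falling back to files from another build.
    """
    try:
        per_build = GENOME_RESOURCES[build]
    except KeyError:
        raise ValueError(f"no resources configured for build {build!r}") from None
    try:
        relative = per_build[resource]
    except KeyError:
        raise ValueError(f"build {build!r} has no {resource!r} resource") from None
    return Path(genome_dir) / build / relative


if __name__ == "__main__":
    print(variation_file("/genomes", "GRCh37", "dbsnp"))
```

Failing loudly on an unconfigured build (hg18 in this sketch) is a deliberate choice here: using GRCh37 dbSNP coordinates against a different build would give misleading results without any visible error.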