bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License

0.7.7 provenance reports incorrect version of tools #332

Closed mjafin closed 10 years ago

mjafin commented 10 years ago

We've got GATK and MuTect in non-standard directories, set up correctly in bcbio_system.yaml. The correct tools get called, but then weird things start happening, such as bcbio thinking GATK is 2.3.9 and trying to use DepthOfCoverage instead of Coverage. Looking at provenance/programs.txt I can see:

bcbio-nextgen,0.7.7
bamtools,2.3.0
bcbio_variation,0.1.3
bcftools,0.2.0-rc6
bedtools,2.19.0
biobambam,0.0.116
bowtie2,2.1.0
bwa,0.7.7
cufflinks,2.1.1
cutadapt,1.3
fastqc,0.10.1
freebayes,0.9.13-2
gatk,2.3-9-gdcdccbb
gemini,0.6.1
htseq,0.6.1
mutect,1.1.5
novoalign,3.02.02
novosort,V3.00.02
picard,1.96
platypus-variant,0.5.2
qualimap,0.7.1
rnaseqc,1.1.7
sambamba,0.4.4
samtools,0.1.19
snpeff,3_4
tophat,2.0.9
varscan,2.3.6
vcflib,2014-02-19
vt,2014-02-21
cn.mops,
oncofuse,

Any idea why it's reporting GATK 2.3-9?
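For anyone who wants to check this programmatically rather than by eye, here is a minimal sketch that parses the `name,version` lines shown above into a dictionary. The function name `parse_programs` is hypothetical and not part of bcbio; the sample text is a subset of the listing above.

```python
def parse_programs(text):
    """Parse 'name,version' lines (as in provenance/programs.txt) into a dict.

    Entries with no recorded version map to an empty string.
    """
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first comma only; the version may be empty.
        name, _, version = line.partition(",")
        versions[name] = version
    return versions


# A few lines copied from the listing above.
sample = "bcbio-nextgen,0.7.7\ngatk,2.3-9-gdcdccbb\nmutect,1.1.5\ncn.mops,\n"
versions = parse_programs(sample)
print(versions["gatk"])
```

Comparing the reported value against the jar actually configured in bcbio_system.yaml makes a mismatch like this one easy to spot.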

mjafin commented 10 years ago

I have a feeling these are pulled from manifest/custom-packages.yaml and not updated? I commented out the gatk section in the file and now the provenance/programs.txt in my work folder looks as follows:

bcbio-nextgen,0.7.8a-1bf1a8f
alientrimmer,0.3.2
bamtools,2.3.0
bcbio_variation,0.1.3
bcftools,0.2.0-rc6
bedtools,2.19.0
biobambam,0.0.116
bowtie2,2.2.0
bwa,0.7.7
cn.mops,1.8.6
cufflinks,2.1.1
cutadapt,1.3
fastqc,0.10.1
featurecounts,1.4.3-p1
freebayes,0.9.13-2
gemini,0.6.1
htseq,0.5.4p5
mutect,1.1.5
novoalign,3.02.02
novosort,V3.00.02
picard,1.96
platypus-variant,0.5.2
qualimap,0.7.1
rnaseqc,1.1.7
sambamba,0.4.4
samtools,0.1.19
snpeff,3_4
tophat,2.0.9
varscan,2.3.6
vcflib,2014-02-19
vt,2014-02-21
gatk,
oncofuse,

Fingers crossed the run finishes, this seems like an easy fix :)

chapmanb commented 10 years ago

Miika; You're exactly right with the diagnosis. It now pulls from the manifest instead of re-calculating at the start of each run. The idea is to keep the manifest up to date as representative of the system so there is less need to do the custom changing of software. This is part of the general move to Docker where the software will be isolated instead of needing all the bcbio_system.yaml manual configuration.

We need a dedicated way to update GATK and MuTect with manually downloaded academic or Appistry versions. This would handle updating the manifest and avoid any need for custom installs and edits. I'll work on this as a long-term fix for the problem so we can have it in place for 0.7.8. Thanks for the heads up.

mjafin commented 10 years ago

Thanks Brad! As an immediate fix, can I just replace 2.3-9-gdcdccbb under gatk with anything > 2.3-9?

chapmanb commented 10 years ago

Miika; That's exactly right -- just replace it with the actual version you're using and you'll be good to go. Thanks for all the patience figuring out how best to deal with these manual downloads.

chapmanb commented 10 years ago

Miika; I provided a new way to add GATK and muTect jars to bcbio so this will hopefully no longer require any manual copying and editing. You should just have to point an upgrade at the jar and it will take care of the rest:

https://bcbio-nextgen.readthedocs.org/en/latest/contents/installation.html#gatk-and-mutect

Let me know if this needs any tweaking at all for the Appistry downloads or if you run into any problems. I'm hoping to get a new release out with all these changes to make this process easier for everyone. Thanks again for all of the helpful feedback.

mjafin commented 10 years ago

Wonderful - I first did an upgrade without the new options (not recognised by the previous updater) and then with the options. However, the manifest file still lists 2.3-9 under gatk and 1.1.5 under mutect (instead of 1.1.6-appistry; that's how I'm calling the jar). Will these be ignored when I'm actually starting a run with either?

Thanks, Miika

chapmanb commented 10 years ago

Miika; There should be a new manifest file (manifest/toolplus-packages.yaml) that has the updated version information. We read that one first, so the updated versions should be recognized and reflected in provenance/programs.txt. The upgrade should also update bcbio_system.yaml to point to the new versions so they will be used instead of the default installs. Let me know if anything looks off or you have issues.
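The "read that one first" behaviour can be sketched roughly as below. This is an illustration only, not bcbio's actual code: the `resolve_version` helper is hypothetical, and the nested `{"program": {"version": ...}}` layout of the parsed manifests is an assumption about the YAML structure. The version strings are the ones mentioned in this thread.

```python
def resolve_version(program, manifests):
    """Return the first non-empty version for `program`.

    `manifests` is a list of dicts checked in order, so earlier entries
    (e.g. toolplus) take precedence over later ones.
    """
    for manifest in manifests:
        entry = manifest.get(program) or {}
        version = entry.get("version")
        if version:
            return version
    return ""


# Stand-ins for the parsed manifest files (layout is an assumption).
toolplus = {"mutect": {"version": "1.1.6-appistry"}}
custom = {
    "gatk": {"version": "2.3-9-gdcdccbb"},
    "mutect": {"version": "1.1.5"},
}

# toolplus is listed first, so its mutect entry wins.
print(resolve_version("mutect", [toolplus, custom]))
# gatk is absent from this toolplus stand-in, so the lookup falls through.
print(resolve_version("gatk", [toolplus, custom]))
```

With this ordering, a stale entry in the older manifest does no harm as long as the toolplus manifest carries the updated version.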

mjafin commented 10 years ago

Ah, excellent, I can see that now! I had to copy SomaticAnalysisTK.jar to muTect-1.1.6-appistry.jar - a symlink wouldn't do. I'll test that it works and then maybe change the documentation on how to name the Appistry MuTect jar so it plays nicely with bcbio.

mjafin commented 10 years ago

OK, something funny is going on here. My Appistry folder contains both GenomeAnalysisTK.jar and muTect-1.1.6-appistry.jar and I pointed the installer to each separately, but the mutect folder (toolplus/mutect/1.1.6-appistry) actually contains a copy of GenomeAnalysisTK.jar, named muTect-1.1.6-appistry.jar. Any reason why this might be? I can place the jars in separate folders of course, but this isn't likely expected behaviour...?

Edit: Arrghh... My bad - I'd mixed up the jar files myself!
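One way to catch a mix-up like this before running the installer is to compare checksums of the two jars. A minimal sketch using only the Python standard library follows; the helper names `sha256sum` and `same_contents` are hypothetical, and nothing here is part of bcbio.

```python
import hashlib


def sha256sum(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(path_a, path_b):
    """True if both files have identical contents (by SHA-256)."""
    return sha256sum(path_a) == sha256sum(path_b)
```

If `same_contents` returns True for GenomeAnalysisTK.jar and the file named muTect-1.1.6-appistry.jar, the MuTect jar is really a renamed copy of GATK.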