Closed tanglingfung closed 11 years ago
Paul; I try to avoid blocks of version specific code like this, so there isn't a specific place to look for 2.7 tweaks. The approach is to check based on version and apply parameters as needed. Here's an example in the filtering code:
https://github.com/chapmanb/bcbio-nextgen/blob/master/bcbio/variation/genotype.py#L269
For adding in the new reference confidence calls, you'd want to add a check here:
https://github.com/chapmanb/bcbio-nextgen/blob/master/bcbio/variation/genotype.py#L97
(version >= 2.7 and len(align_bams) == 1)
Thanks for looking at this.
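For illustration only, the version-based check described above could be sketched as follows. The helper names `parse_version` and `ref_confidence_params` are hypothetical and are not part of bcbio-nextgen; the value passed to `--emitRefConfidence` is left as a parameter because the accepted values depend on the GATK release.

```python
def parse_version(text):
    """Turn a version string such as '2.7-2' into a comparable tuple, e.g. (2, 7)."""
    numeric = text.split("-")[0]
    return tuple(int(part) for part in numeric.split(".") if part.isdigit())


def ref_confidence_params(gatk_version, align_bams, mode):
    """Return extra HaplotypeCaller arguments, or [] when the check fails.

    Hypothetical helper: `mode` is whatever value the installed GATK
    accepts for --emitRefConfidence.
    """
    # Same condition as in the comment above: new enough GATK, single BAM.
    if parse_version(gatk_version) >= (2, 7) and len(align_bams) == 1:
        return ["--emitRefConfidence", mode]
    return []
```

Comparing tuples rather than floats avoids treating a release such as 2.10 as older than 2.7.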
I tried to link the gatk2.7 with bcbio-nextgen by changing the soft link under
/storage/sam/anaconda/lib/python2.7/site-packages/bcbio/provenance/programs.pyc in get_version(name='picard', dirs=None, config={'algorithm': {'aligner': 'bwa', 'coverage_depth': 'high', 'coverage_interval': 'genome', 'mark_duplicates': 'picard', 'max_errors': 2, 'memory_adjust': {'direction': 'decrease', 'magnitude': 2}, 'num_cores': 1, 'platform': 'illumina', 'quality_format': 'Standard', 'realign': 'gatk', ...}, 'custom_algorithms': {'Minimal': {'aligner': ''}, 'RNA-seq': {'aligner': 'tophat', 'transcript_assemble': True}, 'variant2': {'aligner': 'bwa', 'coverage_depth': 'high', 'coverage_interval': 'exome', 'recalibrate': 'gatk', 'variantcaller': 'gatk'}}, 'log_dir': '/storage/sam/Data/log', 'resources': {'bcbio_variation': {'dir': '/storage/sam/share/java/bcbio_variation', 'jvm_opts': ['-Xms750m', '-Xmx2500m']}, 'bwa': {'cmd': 'bwa', 'cores': 16}, 'freebayes': {'memory': '2g'}, 'gatk': {'dir': '/storage/sam/share/java/gatk', 'jvm_opts': ['-Xms750m', '-Xmx2500m']}, 'gatk-haplotype': {'jvm_opts': ['-Xms2g', '-Xmx5500m']}, 'gatk-vqsr': {'jvm_opts': ['-Xms2g', '-Xmx4000m']}, 'gemini': {'cores': 16}, 'log': {'dir': 'log'}, 'novoalign': {'cores': 16, 'memory': '2G'}, 'picard': {'dir': '/storage/sam/share/java/picard'}, ...}})
    137         p = _get_program_file(dirs)
    138     else:
    139         p = config["resources"]["program_versions"]
    140     with open(p) as in_handle:
    141         for line in in_handle:
--> 142             prog, version = line.rstrip().split(",")
    143             if prog == name and version:
    144                 return version
    145     raise KeyError("Version information not found for %s in %s" % (name, p))
    146
ValueError: need more than 1 value to unpack
Did I do something wrong with my GATK installation, or does bcbio not support the latest GATK yet? (I have also checked that java -version reports 1.7, as GATK requires.) Thank you for your help.
Thanks much for the report. The symlink approach you described should work, but it seems like something is wrong with your program version file. Could you post a Gist (https://gist.github.com/) of your provenance/programs.txt in the working directory?
Also, running bcbio_nextgen.py with a single core (-n 1) during testing will produce less verbose error output and might make the issue easier to spot.
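As a rough sketch of how to spot the problem in the version file: the traceback shows each line being split on a comma into exactly two values, so any line that does not split that way triggers the ValueError. The function name `find_malformed_lines` is hypothetical and is not part of bcbio-nextgen.

```python
def find_malformed_lines(path):
    """Return (line_number, text) for lines that are not 'program,version'."""
    bad = []
    with open(path) as in_handle:
        for number, line in enumerate(in_handle, start=1):
            # Mirrors the split in the traceback: two fields are expected.
            parts = line.rstrip().split(",")
            if len(parts) != 2:
                bad.append((number, line.rstrip()))
    return bad
```

Calling it on provenance/programs.txt in the working directory would list any lines, such as stray error output captured in place of a version, that cannot be unpacked into a program name and a version.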
Hi, I have made the Gist:
https://gist.github.com/choishingwan/6886050
(Sorry about the format, I am new to Gist...)
When inspecting the content, I vaguely remember seeing a similar error when using GATK Queue: the "Could not find the main class: org.broadinstitute.sting.gatk.CommandLineGATK" error. However, when using GATK 2.6, I do not observe this problem. I can even finish a full Queue run based on the GATK best practices.
Thank you for your help
The error comes from using a pre-1.7 Java version running GATK 2.7 (or 2.6):
http://stackoverflow.com/questions/10382929/unsupported-major-minor-version-51-0
I know you mention installing 1.7, but it looks like the pipeline is picking up 1.6 instead, which causes the issue. I added a check to bcbio-nextgen which should make the origin of the java used more clear in case it is a PATH issue. You can upgrade with bcbio_nextgen.py upgrade -u development or double check the current version on your PATH. Hope this helps.
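To double check which Java is found on the PATH, something along these lines could be used. This is only a sketch and not the check that was added to bcbio-nextgen; the function names are hypothetical, and the parser assumes the 1.x style banner (for example `java version "1.6.0_45"`), which `java -version` prints on stderr.

```python
import re
import subprocess


def parse_java_version(banner):
    """Extract (major, minor) from `java -version` output, or None."""
    match = re.search(r'version "(\d+)\.(\d+)', banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def java_on_path_version(java_cmd="java"):
    """Run `java -version` for the first java on PATH and parse its banner."""
    # The banner goes to stderr, so merge it into stdout before reading.
    proc = subprocess.Popen([java_cmd, "-version"],
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    output = proc.communicate()[0].decode("utf-8", "replace")
    return parse_java_version(output)
```

A result below `(1, 7)` would indicate that an older Java is being picked up ahead of the 1.7 installation.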
I see. I have now solved the problem with some googling.
What happened on our server is that the original Java (1.6) was installed at /usr/bin/java, whereas our Java 1.7 was installed at /software/java-7/jre1.7.0_25/bin/java. To use bcbio-nextgen with the latest GATK, we need to set JAVA_HOME and PATH to point at the new Java location; with that, the pipeline runs without the error message.
Thank you.
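For anyone launching the pipeline from a script, the same fix can be sketched in Python by building an environment with JAVA_HOME set and its bin directory placed first on PATH. The helper `env_with_java` is hypothetical; the Java location is the one from the comment above and will differ on other systems.

```python
import os


def env_with_java(java_home, base_env=None):
    """Return a copy of the environment that prefers the given Java install."""
    env = dict(os.environ if base_env is None else base_env)
    env["JAVA_HOME"] = java_home
    java_bin = os.path.join(java_home, "bin")
    # Put the new bin directory first so it wins over /usr/bin/java.
    parts = [java_bin]
    if env.get("PATH"):
        parts.append(env["PATH"])
    env["PATH"] = os.pathsep.join(parts)
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess` calls; in an interactive shell, exporting JAVA_HOME and PATH achieves the same result.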
Sorry, I cannot find the specific code for supporting GATK 2.7. Can you tell me where it is? I want to add support for --emitRefConfidence to HaplotypeCaller. Please advise.