Closed mortunco closed 8 years ago
Tunc; Sorry about the problems. This is a different problem than you saw previously and the informative error is:
[E::hts_open_format] fail to open file '../variation/dbsnp_138.vcf.gz'
For some reason /path/to/bcbio/genomes/Hsapiens/GRCh37/dbsnp_138.vcf.gz
is missing from your install. I'm not sure if the install failed or this file was deleted. You could try re-running the data install:
bcbio_nextgen.py upgrade --data
to download and install it. Hope this helps.
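Before re-running the data install, a small existence check can confirm which files are absent. This is only a minimal sketch: the helper name `missing_files` is hypothetical, the genome directory is a placeholder, and the `variation/` subdirectory is an assumption taken from the relative path in the error message above.

```python
import os


def missing_files(base_dir, relative_paths):
    """Return the entries of relative_paths that do not exist under base_dir."""
    return [rel for rel in relative_paths
            if not os.path.isfile(os.path.join(base_dir, rel))]


if __name__ == "__main__":
    # Placeholder location; substitute the genome directory of your install.
    genome_dir = "/path/to/bcbio/genomes/Hsapiens/GRCh37"
    # Assumed layout: the error message refers to ../variation/dbsnp_138.vcf.gz
    for rel in missing_files(genome_dir, ["variation/dbsnp_138.vcf.gz"]):
        print("missing:", rel)
```

An empty result only means the listed files exist; it does not validate their contents.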
Brad;
Thank you for the help. You were right, my dbsnp_138 file was missing. In the Docker version, the installation finished successfully most of the time. How should I check that I installed bcbio correctly? Is there any method other than the "run tests" step in the documentation? Is that method valid?
Dear Brad;
I am getting this error about the mutect directory (based on the most recent one). I believe I understand my problem this time, but I may need your help with the solution. I started the download by specifying a location for GATK and MuTect 1.1.7, but I also specified their locations with bcbio_nextgen.py upgrade --tools --toolplus mutect=/path/to/mutectANDgatk/jars
and I think it created a collision. But even though I commented out the resources: lines in the configuration, it does not work and gives the same error that I got before.
Also, the upgrade option of bcbio might have a problem, because it responds well to the GATK jar but not to the mutect-1.1.7 jar. Is this expected behavior? Should the mutect.jar location be updated when I run the bcbio_nextgen.py upgrade command?
To anticipate possible answers: I have run bcbio_nextgen.py upgrade for tools and data many times.
My log files don't contain this error. This is the stdout of the bcbio_nextgen.py process. Sorry, I had to share it as an attachment; otherwise I got an error about exceeding the maximum character limit. bcbio-nextgen.log.txt
This is the configuration that I edited. The GATK path is right, but mutect stays unchanged.
[ec2-user@ip-172-31-55-174 ~]$ cat /usr/local/share/bcbio/galaxy/bcbio_system.yaml
galaxy_config: universe_wsgi.ini
resources:
  bwa:
    cmd: bwa
    cores: 16
  cufflinks:
    cores: 16
    memory: 3g
  default:
    cores: 16
    jvm_opts:
    - -Xms750m
    - -Xmx2000m
    memory: 2G
  dexseq:
    memory: 10g
  express:
    memory: 8g
  gatk:
    dir: /usr/local/share/bcbio/toolplus/gatk/3.5-0-g36282e4
    jvm_opts:
    - -Xms500m
    - -Xmx3500m
  hisat2:
    cores: 16
    memory: 2G
  macs2:
    cores: 1
    memory: 8g
  miraligner:
    jvm_opts:
    - -Xms750m
    - -Xmx4500m
  oncofuse:
    jvm_opts:
    - -Xms750m
    - -Xmx2000m
  picard:
    jvm_opts:
    - -Xms750m
    - -Xmx3500m
  qualimap:
    memory: 4g
  sailfish:
    cores: 16
    memory: 1g
  samtools:
    cores: 16
    memory: 2G
  seqcluster:
    memory: 8g
  snap:
    cores: 16
    memory: 4G
  snpeff:
    jvm_opts:
    - -Xms750m
    - -Xmx6g
  star:
    cores: 16
    memory: 2g
  stringtie:
    cores: 16
    memory: 1g
  vardict:
    jvm_opts:
    - -Xms750m
    - -Xmx3000m
  wham:
    memory: 3500m
Tunc; For the install testing, you'll need to run a real pipeline to evaluate the install. The tests use a minimal genome directory because running against a full genome is too intensive. We rely on identifying errors during install as the best way to identify if everything worked correctly.
Regarding the MuTect problem, it doesn't look like the install command worked correctly, as I don't see a mutect section in your configuration file. MuTect is a separate jar from GATK. What command exactly did you run to install it? From your example above, you want to point at the jar files, not at directories containing jar files:
http://bcbio-nextgen.readthedocs.org/en/latest/contents/installation.html#gatk-and-mutect-mutect2
Hope this helps.
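To see quickly whether a mutect block made it into bcbio_system.yaml, the section names under resources: can be listed. This is a dependency-free sketch that relies on indentation only; the function name `resource_sections` is hypothetical and a real YAML parser would be more robust.

```python
def resource_sections(yaml_text):
    """Collect the names nested directly under the top-level `resources:` key."""
    sections = []
    in_resources = False
    child_indent = None
    for line in yaml_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        indent = len(line) - len(line.lstrip())
        if indent == 0:
            # Any top-level key either opens or closes the resources block.
            in_resources = stripped == "resources:"
            child_indent = None
            continue
        if not in_resources:
            continue
        if child_indent is None:
            child_indent = indent  # first nested line fixes the section indent
        if indent == child_indent and stripped.endswith(":"):
            sections.append(stripped[:-1])
    return sections


# Shortened excerpt of the configuration posted above, for illustration only.
example = """\
galaxy_config: universe_wsgi.ini
resources:
  gatk:
    dir: /usr/local/share/bcbio/toolplus/gatk/3.5-0-g36282e4
  default:
    cores: 16
"""
print("mutect" in resource_sections(example))  # False for this excerpt
```

On a real install, the text would be read from the bcbio_system.yaml file shown above rather than from an inline string.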
Brad;
I initiated a cancer-variant example to check if the system is ok.
[ec2-user@ip-172-31-55-174 ~]$ ls
bcbio_nextgen_install.py GATK puppy tmp
[ec2-user@ip-172-31-55-174 ~]$ ls GATK/
GenomeAnalysisTK.jar mutect-1.1.7.jar
I used the following command as stated in the documentation.
bcbio_nextgen.py upgrade --tools --toolplus mutect=/home/ec2-user/GATK/mutect-1.1.7.jar
bcbio_nextgen.py upgrade --tools --toolplus gatk=/home/ec2-user/GATK/GenomeAnalysisTK.jar
Thank you,
T.
Tunc;
Thanks much for the details, this helps a lot. Apologies, this was a bug in installing these custom jars: if the mutect block was not already present in the original configuration, it would fail to add the newly installed directory. If you update to the latest development version and retry, it should now work correctly:
bcbio_nextgen.py upgrade -u development
bcbio_nextgen.py upgrade --tools --toolplus mutect=/home/ec2-user/GATK/mutect-1.1.7.jar
Thanks much for the report and hope this gets your analysis running.
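For readers curious about this class of bug, the sketch below shows the general pattern of updating a nested configuration so that a missing block is created instead of skipped. It is an illustration only, not the actual bcbio code; the function name `set_tool_dir` and the paths are hypothetical.

```python
def set_tool_dir(config, tool, install_dir):
    """Record install_dir for tool, creating any missing blocks on the way."""
    resources = config.setdefault("resources", {})
    # setdefault creates the tool block when it is absent, so a first-time
    # install (for example mutect) is not silently skipped.
    resources.setdefault(tool, {})["dir"] = install_dir
    return config


# Hypothetical starting point: only a gatk block exists.
config = {"resources": {"gatk": {"dir": "/hypothetical/toolplus/gatk"}}}
set_tool_dir(config, "mutect", "/hypothetical/toolplus/mutect/1.1.7")
print(sorted(config["resources"]))  # ['gatk', 'mutect']
```

Settings already present in an existing block, such as jvm_opts, are left untouched because only the dir key is assigned.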
Brad;
Thank you very much for your patience. That solved my problem. You made me the happiest man on earth.
Little question:
Since you released the fix as development, will I have to request this specific version when installing on the HPC at my university, or can we go with this method?
wget https://raw.github.com/chapmanb/bcbio-nextgen/master/scripts/bcbio_nextgen_install.py
python bcbio_nextgen_install.py /usr/local/share/bcbio --tooldir=/usr/local \
  --genomes GRCh37 --aligners bwa --aligners bowtie2
Thank you very, very much!
Best, T.
Tunc;
Glad to help. You will need to add -u development to installs or updates to get this currently. We'll plan to have a new release with these fixes soon. Hope this helps.
Hi,
I am really confused by the fact that my run fails. I ran the same configuration file in two setups (the only change was the file paths), one in bcbio-nextgen and one in bcbio-vm, and the bcbio-nextgen run was interrupted. In bcbio-nextgen I got the same error that I got before (https://github.com/AstraZeneca-NGS/VarDictJava/issues/34) ( @mjafin ). But I don't understand why it causes an error, since bcbio-vm showed that the run can finish successfully without realigning, cleaning and sorting.
I am willing to do ANYTHING to solve this issue and have reproducible runs.
Thank you for your time and patience (again).
Best,
Tunc.
bcbio_nextgen.py configuration file
bcbio_vm.py. This configuration finalised successfully.
The error that I got from the last run.
Also, I set all the options related to re-adjustment to false (realign, bam_clean, bam_sort, mark_duplicates, etc.), but I still see the bammarkduplicates command in bcbio-nextgen-commands.log. Am I supposed to see that, or is it required as part of variant calling?