bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License

problem with run_tests.sh devel #397

Closed bosmont closed 10 years ago

bosmont commented 10 years ago

Hi, I am new to bcbio-nextgen. Please bear with me if my question sounds silly.

I installed bcbio-nextgen on my laptop following the instructions, and it went quite smoothly. run_tests.sh rnaseq was successful (after I installed pandas), but I got the following error with run_tests.sh devel:

[Sun Apr 13 20:27:56 EDT 2014] net.sf.picard.sam.BamIndexStats INPUT=/home/mango/work/ngs/bcbio-nextgen-master/tests/test_automated_output/align/Test1/7_100326_FC6107FAAXX-sort.bam VALIDATION_STRINGENCY=SILENT VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
[Sun Apr 13 20:27:56 EDT 2014] Executing as mango@mango-N56JR on Linux 3.13.0-19-generic amd64; OpenJDK 64-Bit Server VM 1.7.0_51-b31; Picard version: 1.96(1510)
[Sun Apr 13 20:27:56 EDT 2014] net.sf.picard.sam.BamIndexStats done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=753926144
Traceback (most recent call last):
  File "/usr/local/bin/bcbio_nextgen.py", line 5, in <module>
    pkg_resources.run_script('bcbio-nextgen==0.7.9a', 'bcbio_nextgen.py')
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 528, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 1394, in run_script
    execfile(script_filename, namespace, namespace)
  File "/usr/local/lib/python2.7/dist-packages/bcbio_nextgen-0.7.9a-py2.7.egg/EGG-INFO/scripts/bcbio_nextgen.py", line 62, in <module>
    main(kwargs)
  File "/usr/local/lib/python2.7/dist-packages/bcbio_nextgen-0.7.9a-py2.7.egg/EGG-INFO/scripts/bcbio_nextgen.py", line 40, in main
    run_main(kwargs)
  File "/usr/local/lib/python2.7/dist-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/pipeline/main.py", line 40, in run_main
    fc_dir, run_info_yaml)
  File "/usr/local/lib/python2.7/dist-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/pipeline/main.py", line 83, in _run_toplevel
    for xs in pipeline.run(config, config_file, parallel, dirs, pipeline_items):
  File "/usr/local/lib/python2.7/dist-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/pipeline/main.py", line 312, in run
    samples = run_parallel("postprocess_alignment", samples)
  File "/usr/local/lib/python2.7/dist-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/usr/local/lib/python2.7/dist-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/distributed/multi.py", line 82, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"])(joblib.delayed(fn)(x) for x in items):
  File "/usr/local/lib/python2.7/dist-packages/joblib-0.8.0a3-py2.7.egg/joblib/parallel.py", line 644, in __call__
    self.dispatch(function, args, kwargs)
  File "/usr/local/lib/python2.7/dist-packages/joblib-0.8.0a3-py2.7.egg/joblib/parallel.py", line 391, in dispatch
    job = ImmediateApply(func, args, kwargs)
  File "/usr/local/lib/python2.7/dist-packages/joblib-0.8.0a3-py2.7.egg/joblib/parallel.py", line 129, in __init__
    self.results = func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/utils.py", line 47, in wrapper
    return apply(f, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/distributed/multitasks.py", line 25, in postprocess_alignment
    return lane.postprocess_alignment(*args)
  File "/usr/local/lib/python2.7/dist-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/pipeline/lane.py", line 118, in postprocess_alignment
    data = _recal_no_markduplicates(data)
  File "/usr/local/lib/python2.7/dist-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/pipeline/lane.py", line 124, in _recal_no_markduplicates
    data = recalibrate.prep_recal(data)[0][0]
  File "/usr/local/lib/python2.7/dist-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/variation/recalibrate.py", line 61, in prep_recal
    platform, dbsnp_file, intervals)
  File "/usr/local/lib/python2.7/dist-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/variation/recalibrate.py", line 100, in _gatk_base_recalibrator
    if broad_runner.gatk_type() == "lite":
  File "/usr/local/lib/python2.7/dist-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/broad/__init__.py", line 272, in gatk_type
    if LooseVersion(self.gatk_major_version()) > LooseVersion("2.3"):
  File "/usr/local/lib/python2.7/dist-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/broad/__init__.py", line 283, in gatk_major_version
    full_version = self.get_gatk_version()
  File "/usr/local/lib/python2.7/dist-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/broad/__init__.py", line 251, in get_gatk_version
    self._set_default_versions(self._config)
  File "/usr/local/lib/python2.7/dist-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/broad/__init__.py", line 107, in _set_default_versions
    v = programs.get_version(name, config=config)
  File "/usr/local/lib/python2.7/dist-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/provenance/programs.py", line 264, in get_version
    prog, version = line.rstrip().split(",")
ValueError: need more than 1 value to unpack
ERROR

ERROR: Allow BAM files as input to pipeline.

Traceback (most recent call last):
  File "/home/mango/work/ngs/bcbio-nextgen-master/tests/test_automated_analysis.py", line 250, in test_5_bam
    subprocess.check_call(cl)
  File "/usr/lib/python2.7/subprocess.py", line 540, in check_call
    raise CalledProcessError(retcode, cmd)
CalledProcessError: Command '['bcbio_nextgen.py', '/home/mango/work/ngs/bcbio-nextgen-master/tests/data/automated/post_process-sample.yaml', '/home/mango/work/ngs/bcbio-nextgen-master/tests/data/automated/../100326_FC6107FAAXX', '/home/mango/work/ngs/bcbio-nextgen-master/tests/data/automated/run_info-bam.yaml']' returned non-zero exit status 1


Ran 1 test in 14.640s

FAILED (errors=1)


I get a similar error with run_tests.sh speed=2.

Is there anything wrong with my PICARD, or anything else?

Thanks

chapmanb commented 10 years ago

Thanks much for the error report and sorry about the issues. It looks like something is problematic with your GATK installation. Within the test directory, what does test_automated_output/provenance/programs.txt look like? It appears as if something is wrong with one of the lines there. It should be a pretty straightforward CSV file of program_name,version.
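A quick way to spot the offending line is to apply the same program_name,version split to every line of the file. The sketch below is not bcbio code: find_malformed_lines and the sample contents are hypothetical, it is written for Python 3, and it assumes every valid line contains exactly one comma.

```python
import io


def find_malformed_lines(handle):
    """Return (line_number, text) pairs for lines that do not split
    into exactly two comma-separated fields."""
    bad = []
    for number, line in enumerate(handle, start=1):
        text = line.rstrip("\n")
        if not text.strip():
            continue  # ignore blank lines
        if len(text.split(",")) != 2:
            bad.append((number, text))
    return bad


# Hypothetical sample: a warning that leaked into the version field,
# pushing the real version onto a line of its own (assumed layout).
sample = io.StringIO(
    "bwa,0.7.8\n"
    "cutadapt,/usr/lib/python2.7/dist-packages/pkg_resources.py:1031: "
    "UserWarning: /home/mango/.python-eggs is writable by group/others\n"
    "  warnings.warn(msg, UserWarning)\n"
    "1.4.2\n"
    "fastqc,0.10.1\n"
)

for number, text in find_malformed_lines(sample):
    print("line %d: %r" % (number, text))
```

On this sample only the bare 1.4.2 line is reported, because the two warning lines each happen to contain one comma; the real culprit is whatever sits just above the reported line. To check an actual run, pass open("test_automated_output/provenance/programs.txt") instead of the sample.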

More generally, how did you install bcbio? From needing to install pandas manually, it sounds like you might not have used the automated installer (https://bcbio-nextgen.readthedocs.org/en/latest/contents/installation.html#isolated-installations). That should be the cleanest approach to ensure you have the libraries and third-party tools installed correctly. If you ran into install issues there, I'm happy to help with that.

Hope one of these gets you going.

bosmont commented 10 years ago

Thanks for the quick response.

Here is my programs.txt:

bcbio-nextgen,0.7.9a
htseq,0.6.0
bamtofastq,0.0.125
bamtools,2.3.0
bcftools,0.2.0-rc6
bedtools,2.19.1
bowtie2,2.1.0
bwa,0.7.8
cufflinks,v2.1.1
cutadapt,/usr/lib/python2.7/dist-packages/pkg_resources.py:1031: UserWarning: /home/mango/.python-eggs is writable by group/others and vulnerable to attack when used with get_resource_filename. Consider a more secure location (set with .set_extraction_path or the PYTHON_EGG_CACHE environment variable).
  warnings.warn(msg, UserWarning)
1.4.2
fastqc,0.10.1
freebayes,0.9.13-2
gemini,0.6.5
novosort,V1.02.02
novoalign,3.02.02
samtools,0.1.19
sambamba,0.4.6-beta
qualimap,0.7.1
tophat,v2.0.9
vcflib,2014-02-19
featurecounts,
bcbio_variation,0.1.5
gatk,2.3-9-gdcdccbb
mutect,1.1.5
picard,1.96
rnaseqc,_v1.1.7
snpeff,3.4i
varscan,v2.3.6
oncofuse,
alientrimmer,

I installed using the automated installer, and it completed without error:

python bcbio_nextgen_install.py /usr/local/share/bcbio-nextgen --tooldir=/usr/local

But somehow it complained about a problem with pandas when I ran the rnaseq test. The rnaseq test was successful after I manually installed pandas. Are you suggesting using the isolated installation instead, e.g. specifying my own paths for data and tools?

Thanks,


chapmanb commented 10 years ago

Thanks for the additional information. The cause of your error is the big warning message in the middle of the programs.txt file. I pushed a fix that avoids this, so if you update with bcbio_nextgen.py upgrade -u development it should hopefully work cleanly now.
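For anyone hitting a similar problem, one general way to keep warnings out of a recorded version string is to read the version from stdout only, so that anything printed on stderr never reaches the file. The sketch below illustrates that idea under assumptions and is not necessarily what the pushed fix does: read_version and fake_tool are hypothetical names, the code is written for Python 3, and it assumes the tool prints its version on stdout while the warning goes to stderr.

```python
import subprocess
import sys


def read_version(cmd):
    """Run cmd and return the last non-empty line of stdout.

    stderr is captured separately and discarded, so warnings printed
    there cannot end up in the returned version string.
    """
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    lines = [line.strip() for line in result.stdout.splitlines() if line.strip()]
    return lines[-1] if lines else ""


# Hypothetical stand-in for a tool that emits a warning on stderr
# before printing its version on stdout.
fake_tool = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('UserWarning: noisy\\n'); print('1.4.2')",
]

print("cutadapt,%s" % read_version(fake_tool))
```

Taking only the last non-empty stdout line is a design choice that also tolerates tools that print a banner before the version; tools that report their version on stderr would need the two streams swapped.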

However, there do appear to be a couple of clues that not everything went smoothly with the install. The way you did it looks totally fine, but missing pandas is an indication something is not right since that should be installed by conda. You ideally should see something like:

$ /usr/local/share/bcbio-nextgen/anaconda/bin/conda list | grep pandas
pandas                    0.13.1               np18py27_0 

The second clue is triggering the program checks at all. This should ideally be pulled in via the manifest. Do you have a manifest directory?

$ ls -lh /usr/local/share/bcbio-nextgen/manifest/
total 536K
-rw-rw-r-- 1 chapmanb chapmanb 1.7K Mar 22 12:02 brew-packages.yaml
-rw-rw-r-- 1 chapmanb chapmanb  14K Mar 22 12:02 custom-packages.yaml
-rw-rw-r-- 1 chapmanb chapmanb 375K Mar 22 12:02 debian-base-packages.yaml
-rw-rw-r-- 1 chapmanb chapmanb 119K Mar 22 12:02 debian-packages.yaml
-rw-rw-r-- 1 chapmanb chapmanb 4.3K Mar 22 12:02 python-packages.yaml
-rw-rw-r-- 1 chapmanb chapmanb 7.5K Mar 22 12:02 r-packages.yaml
-rw-rw-r-- 1 chapmanb chapmanb   45 Mar 22 12:02 toolplus-packages.yaml

If things work after the fix, no worries; these are just some additional things to look at if you still run into problems. Thanks for the patience getting this running.

bosmont commented 10 years ago

All tests pass after installing your fix. Thanks so much.