bcbio / bcbio-nextgen-vm

Run bcbio-nextgen genomic sequencing analyses using isolated containers and virtual machines
MIT License

valueerror samtools GRCh37 #151

Open firatuyulur opened 8 years ago

firatuyulur commented 8 years ago

Hi, I installed bcbio and it's working. I ran the template command as follows:

bcbio_nextgen.py -w template gatk-variant /home/firat/Documents/myfile1.csv /home/firat/Downloads/resources/exampleBAM.bam

This worked, and I was told:

Configuration file created at: /home/firat/myfile1/config/myfile1.yaml
Edit to finalize and run with:
cd /home/firat/myfile1/work
bcbio_nextgen.py ../config/myfile1.yaml

And so I did:

firat@firat-X550CL:~$ cd /home/firat/myfile1/work
firat@firat-X550CL:~/myfile1/work$ bcbio_nextgen.py ../config/myfile1.yaml

And here is what happened:

[2016-07-04T16:36Z] System YAML configuration: /usr/local/share/bcbio/galaxy/bcbio_system.yaml
[2016-07-04T16:36Z] Resource requests: bwa, sambamba, samtools; memory: 2.00, 2.00; cores: 16, 16, 16
[2016-07-04T16:36Z] Configuring 1 jobs to run, using 1 cores each with 2.00g of memory reserved for each job
[2016-07-04T16:36Z] Timing: organize samples
[2016-07-04T16:36Z] multiprocessing: organize_samples
[2016-07-04T16:36Z] Using input YAML configuration: /home/firat/myfile1/config/myfile1.yaml
[2016-07-04T16:36Z] Checking sample YAML configuration: /home/firat/myfile1/config/myfile1.yaml
[2016-07-04T16:36Z] Downloading GRCh37 samtools from AWS
Traceback (most recent call last):
  File "/usr/local/bin/bcbio_nextgen.py", line 4, in <module>
    __import__('pkg_resources').run_script('bcbio-nextgen==0.9.8', 'bcbio_nextgen.py')
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/setuptools-20.3-py2.7.egg/pkg_resources/__init__.py", line 726, in run_script
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/setuptools-20.3-py2.7.egg/pkg_resources/__init__.py", line 1484, in run_script
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio_nextgen-0.9.8-py2.7.egg-info/scripts/bcbio_nextgen.py", line 226, in <module>
    main(**kwargs)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio_nextgen-0.9.8-py2.7.egg-info/scripts/bcbio_nextgen.py", line 43, in main
    run_main(**kwargs)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 43, in run_main
    fc_dir, run_info_yaml)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 87, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 120, in variant2pipeline
    [x[0]["description"] for x in samples]]])
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"])(joblib.delayed(fn)(x) for x in items):
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 800, in __call__
    while self.dispatch_one_batch(iterator):
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 658, in dispatch_one_batch
    self._dispatch(tasks)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 566, in _dispatch
    job = ImmediateComputeBatch(batch)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 180, in __init__
    self.results = batch()
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 72, in __call__
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/utils.py", line 51, in wrapper
    return apply(f, *args, **kwargs)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/distributed/multitasks.py", line 287, in organize_samples
    return run_info.organize(*args)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/run_info.py", line 67, in organize
    item = add_reference_resources(item, remote_retriever)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/run_info.py", line 158, in add_reference_resources
    data["reference"] = genome.get_refs(data["genome_build"], aligner, data["dirs"]["galaxy"], data)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/genome.py", line 207, in get_refs
    galaxy_config, data)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/genome.py", line 159, in _get_ref_from_galaxy_loc
    cur_ref = download_prepped_genome(genome_build, data, name, need_remap)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/genome.py", line 278, in download_prepped_genome
    raise ValueError("Could not find reference genome file %s %s" % (genome_build, name))
ValueError: Could not find reference genome file GRCh37 samtools

What is causing this error?

Thanks in advance.

chapmanb commented 8 years ago

Thanks for trying out bcbio and sorry about the problem. How did you install bcbio? It looks like it's having a problem finding the reference data for GRCh37. Did you include this genome during your install?

http://bcbio-nextgen.readthedocs.io/en/latest/contents/installation.html#automated

Happy to provide more specific advice if it is present in your installation and it's not finding it for some reason. Hope this helps.
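
One quick way to check whether GRCh37 is present is to look for its directory under the bcbio data directory. The following is a minimal sketch, not part of bcbio: the helper name `find_genome_dirs` is hypothetical, and the `genomes/<species>/<build>` layout is an assumption about how the installer arranges reference data. `/usr/local/share/bcbio` is the data directory that appears in the traceback above.

```python
import glob
import os


def find_genome_dirs(data_dir, build):
    """Return directories matching data_dir/genomes/<species>/<build>.

    The genomes/<species>/<build> layout is an assumption, not something
    confirmed in this thread; adjust the pattern if your install differs.
    """
    pattern = os.path.join(data_dir, "genomes", "*", build)
    return sorted(p for p in glob.glob(pattern) if os.path.isdir(p))


if __name__ == "__main__":
    # Data directory taken from the paths in the traceback above.
    hits = find_genome_dirs("/usr/local/share/bcbio", "GRCh37")
    if hits:
        print("Found GRCh37 under:", ", ".join(hits))
    else:
        print("GRCh37 not found; it may not have been included at install time")
```

An empty result would be consistent with the genome not having been requested during installation.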

firatuyulur commented 8 years ago

Hi again, and thanks for the response. I installed it according to this guide: https://github.com/chapmanb/bcbio-nextgen-vm

Installation

Install bcbio-vm using conda with an isolated Miniconda Python and link to a location on your PATH:

wget http://repo.continuum.io/miniconda/Miniconda-latest-Linux-x86_64.sh
bash Miniconda-latest-Linux-x86_64.sh -b -p ~/install/bcbio-vm/anaconda
~/install/bcbio-vm/anaconda/bin/conda install --yes -c bioconda bcbio-nextgen-vm
ln -s ~/install/bcbio-vm/anaconda/bin/bcbio_vm.py /usr/local/bin/bcbio_vm.py
ln -s ~/install/bcbio-vm/anaconda/bin/arvados-cwl-runner /usr/local/bin/arvados-cwl-runner
ln -s ~/install/bcbio-vm/anaconda/bin/cwltool /usr/local/bin/cwltool
ln -s ~/install/bcbio-vm/anaconda/bin/conda /usr/local/bin/bcbiovm_conda

Would adding "--tooldir=/usr/local --genomes GRCh37 --aligners bwa --aligners bowtie2" be enough to bring my bcbio installation up to this level?

chapmanb commented 8 years ago

Thanks much for the details. You will need to install the biological data locally if you want to run analyses on your current machine. bcbio-vm is a bit more complex since it can also drive remote runs. In this case, you'd need to have Docker installed locally and also install biological data by following the rest of the installation instructions in the readme.

Practically, the bcbio-vm Docker setup is still under development, so if you want the easiest install and run path, your best bet is to use standard bcbio with the automated installer (and include the genomes and aligners you want, as above):

http://bcbio-nextgen.readthedocs.io/en/latest/contents/installation.html#automated

Hope one of these paths gets things figured out for you.

firatuyulur commented 8 years ago

Hi again,

I used the alternative installation route because the automated installer kept failing. When I follow the instructions at http://bcbio-nextgen.readthedocs.io/en/latest/contents/installation.html#automated, this is the error I get:

$ python bcbio_nextgen_install.py /usr/local/share/bcbio --tooldir=/usr/local \
--genomes GRCh37 --aligners bwa --aligners bowtie2
Checking required dependencies
Installing isolated base python installation
Installing bcbio-nextgen
--2016-07-05 22:30:06-- https://raw.githubusercontent.com/chapmanb/bcbio-nextgen/master/requirements-conda.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.12.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.12.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 20 [text/plain]
requirements-conda.txt: Permission denied

Cannot write to ‘requirements-conda.txt’ (Success).
Traceback (most recent call last):
  File "bcbio_nextgen_install.py", line 245, in <module>
    main(parser.parse_args(), sys.argv[1:])
  File "bcbio_nextgen_install.py", line 38, in main
    bcbio = install_conda_pkgs(anaconda)
  File "bcbio_nextgen_install.py", line 70, in install_conda_pkgs
    subprocess.check_call(["wget", "--no-check-certificate", REMOTES["requirements"]])
  File "/usr/lib/python2.7/subprocess.py", line 541, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['wget', '--no-check-certificate', 'https://raw.githubusercontent.com/chapmanb/bcbio-nextgen/master/requirements-conda.txt']' returned non-zero exit status 3

The documentation recommends not using sudo. What should I do?

Thanks.

roryk commented 8 years ago

It looks like you don't have write permission to /usr/local/, so you can't install it there. If you can set tooldir to something you have write permission to you should be all set.
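
As a rough illustration of that advice, the sketch below checks whether the current user can write to the chosen install locations and then prints an installer command using the same flags shown earlier in this thread. It is only a sketch under stated assumptions: the helper names (`nearest_existing`, `can_install_into`, `installer_command`) and the `~/bcbio/data` and `~/bcbio/tools` paths are hypothetical, chosen purely for illustration.

```python
import os


def nearest_existing(path):
    """Walk up from path to the closest ancestor that already exists."""
    path = os.path.abspath(os.path.expanduser(path))
    while not os.path.exists(path):
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            break
        path = parent
    return path


def can_install_into(path):
    """True if the current user may create files at or below path."""
    return os.access(nearest_existing(path), os.W_OK | os.X_OK)


def installer_command(data_dir, tool_dir, genomes, aligners):
    """Build the installer argv from the flags used earlier in this thread."""
    cmd = ["python", "bcbio_nextgen_install.py", data_dir, "--tooldir=" + tool_dir]
    for genome in genomes:
        cmd += ["--genomes", genome]
    for aligner in aligners:
        cmd += ["--aligners", aligner]
    return cmd


if __name__ == "__main__":
    # Hypothetical home-directory locations; substitute your own.
    data_dir = os.path.expanduser("~/bcbio/data")
    tool_dir = os.path.expanduser("~/bcbio/tools")
    for d in (data_dir, tool_dir):
        print(d, "is writable" if can_install_into(d) else "is NOT writable")
    print(" ".join(installer_command(data_dir, tool_dir, ["GRCh37"], ["bwa", "bowtie2"])))
```

If either directory is reported as not writable, picking a location under your home directory avoids the need for sudo.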