Closed gis-nlsim closed 3 years ago
Hi @gis-nlsim !
Yes, you may just symlink
/oldbcbio/genomes -> /newbcbio/genomes
/oldbcbio/galaxy/tool-data -> /newbcbio/galaxy/tool-data
instead of reinstalling data.
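For illustration, the linking could be scripted roughly like this. It is only a sketch: it reads the arrows above as "make the already-downloaded data visible inside the new installation", `/oldbcbio` and `/newbcbio` are placeholders for the two top-level install directories, and `link_bcbio_data` is a made-up helper name, not something bcbio ships.

```bash
#!/bin/bash
set -eu -o pipefail

# link_bcbio_data OLD NEW
# Point NEW's data directories at the ones already downloaded under OLD.
# OLD and NEW are top-level bcbio install directories (placeholders).
link_bcbio_data() {
    local old=$1 new=$2
    local sub
    # Check everything first so we never leave a half-linked install.
    for sub in genomes galaxy/tool-data; do
        if [[ ! -d $old/$sub ]]; then
            echo "missing source directory: $old/$sub" >&2
            return 1
        fi
        if [[ -e $new/$sub || -L $new/$sub ]]; then
            echo "refusing to overwrite existing $new/$sub" >&2
            return 1
        fi
    done
    for sub in genomes galaxy/tool-data; do
        mkdir -p "$(dirname "$new/$sub")"
        ln -s "$old/$sub" "$new/$sub"
    done
}

# Example: link_bcbio_data /oldbcbio /newbcbio
```

The function refuses to touch anything that already exists under the new installation, so an existing directory there would have to be moved aside by hand first.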
Is it the https://github.com/chapmanb/cloudbiolinux/blob/master/ggd-recipes/hg38/dbsnp.yaml
recipe that fails?
If so, we need to fix it.
Can you show the script in /mnt/projects/XXX/wgs/tools/bcbio/1.2.3/genomes/Hsapiens/hg38/txtmp/ggd-run.sh?
Sergey
Thank you for your quick reply. As requested, the ggd-run.sh script is as follows:
[XXX@n111 txtmp]$ cat ggd-run.sh
#!/bin/bash
set -eu -o pipefail
export PATH=/mnt/projects/XXX/wgs/tools/bcbio/1.2.3/tools/bin:$PATH
build=153
version=GCF_000001405.38
url=http://ftp.ncbi.nih.gov/snp/archive/b$build/VCF/$version.gz
remap_url=https://gist.githubusercontent.com/matthdsm/f833aedd2d67e28013ff1d171c70f4ee/raw/442a45ed3ddc6e85c66c5e58e0fa78e16a0821c8/refseq2ucsc.tsv
ref=../seq/hg38.fa
mkdir -p variation
wget -c -O variation/dbsnp-$build-orig.vcf.gz $url
wget -c -O variation/dbsnp-$build-orig.vcf.gz.tbi $url.tbi
[[ -f variation/dbsnp-$build.vcf.gz ]] || bcftools annotate -Ou --rename-chrs $remap_url variation/dbsnp-$build-orig.vcf.gz |\
bcftools sort -m 1G -Oz -T . -o variation/dbsnp-$build.vcf.gz && \
tabix -f -p vcf -C variation/dbsnp-$build.vcf.gz
tabix -f -p vcf variation/dbsnp-$build.vcf.gz
Regards, Ngak Leng
Hi @gis-nlsim !
Sorry about the delay. The files look available:
Maybe it was a temporary Github glitch?
Were you able to run the recipe? What happens if you run ggd-run.sh by hand?
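When rerunning by hand, it may help to rule out the chromosome mapping download first, since the log reports `Could not read:` for the refseq2ucsc.tsv URL. A rough sketch, assuming the mapping has been saved to a local file first and that it is a plain table of "old_name new_name" pairs separated by whitespace, which is the format `bcftools annotate --rename-chrs` expects. `check_rename_map` is a hypothetical helper, not part of bcbio or the recipe.

```bash
#!/bin/bash
set -eu -o pipefail

# check_rename_map FILE
# Succeeds if FILE is non-empty and every non-blank line has exactly two
# whitespace-separated fields (old_name new_name).
check_rename_map() {
    local file=$1
    if [[ ! -s $file ]]; then
        echo "mapping file is missing or empty: $file" >&2
        return 1
    fi
    awk 'NF && NF != 2 {
             printf("line %d has %d fields\n", NR, NF) > "/dev/stderr"
             bad = 1
         }
         END { exit bad }' "$file"
}

# Example (remap_url as defined in ggd-run.sh):
#   wget -O refseq2ucsc.tsv "$remap_url"
#   check_rename_map refseq2ucsc.tsv && echo "mapping looks sane"
```

If the check passes, pointing `--rename-chrs` at the local copy instead of the URL would be one thing to try; whether that resolves the failure here is untested.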
Sergey
install.error.txt
Sorry, but I have yet to successfully install bcbio. Attached is the entire output from the bcbio installation process. I would appreciate your assistance with this. I have tried installing on 2 different servers (from 2 different institutions), both without success, so I don't think it's an issue with the servers on my side. Thank you.
Hi @gis-nlsim!
Sorry about the continuing issues! I just did two fresh installations of bcbio development instances successfully.
From your log:
Traceback (most recent call last):
File "bcbio_nextgen_install.py", line 290, in <module>
main(parser.parse_args(), sys.argv[1:])
File "bcbio_nextgen_install.py", line 51, in main
subprocess.check_call([bcbio, "upgrade"] + _clean_args(sys_argv, args))
File "/home/projects/13001264/tools/**bcbio/v1.1.9**/anaconda/lib/python3.6/subprocess.py", line 291, in check_call
raise CalledProcessError(retcode, cmd)
Is it possible that you are mixing two bcbio installations? The old bcbio should not be in the PATH when installing the new one!
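A quick way to get a clean PATH for the install shell, as a sketch: `path_without` is a made-up helper, and the pattern `bcbio/v1.1.9` is simply taken from the traceback above, so adjust it to wherever the old installation lives.

```bash
#!/bin/bash
set -eu -o pipefail

# path_without PATTERN [PATHSTRING]
# Print PATHSTRING (default: $PATH) with every entry containing PATTERN removed.
path_without() {
    local pattern=$1 path=${2:-$PATH}
    local entry out=""
    local -a entries
    IFS=: read -ra entries <<< "$path"
    for entry in "${entries[@]}"; do
        if [[ $entry == *"$pattern"* ]]; then
            continue  # drop entries that belong to the old installation
        fi
        out=${out:+$out:}$entry
    done
    printf '%s\n' "$out"
}

# Example, in the shell used for installing:
#   export PATH=$(path_without bcbio/v1.1.9)
#   command -v bcbio_nextgen.py   # should no longer find the old install
```

This only affects the current shell, so the production setup stays untouched.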
Sergey
Yes I have an older version that we’re using for production. I’ll remove that from the path and try installing again. Thanks!
Some users use environment modules to maintain several bcbio installations. This also helps with reproducibility: you can keep modules for bcbio 1.2.0, bcbio 1.2.1, bcbio 1.2.2, etc. and just link the data installation into each of them. https://www.admin-magazine.com/HPC/Articles/Environment-Modules
bcbio.error.report.txt
Sorry, I've removed the old bcbio paths, but I'm now encountering this issue (please see attached file). Thanks for your help.
I think it hit a network timeout when accessing anaconda.org:
Traceback (most recent call last):
File "/home/users/astar/gis/simngl/tools/anaconda3/lib/python3.7/site-packages/conda/exceptions.py", line 1079, in __call__
return func(*args, **kwargs)
File "/home/projects/13001702/tools/bcbio/v1.2.4/anaconda/lib/python3.7/site-packages/mamba/mamba.py", line 900, in exception_converter
raise e
File "/home/projects/13001702/tools/bcbio/v1.2.4/anaconda/lib/python3.7/site-packages/mamba/mamba.py", line 894, in exception_converter
exit_code = _wrapped_main(*args, **kwargs)
File "/home/projects/13001702/tools/bcbio/v1.2.4/anaconda/lib/python3.7/site-packages/mamba/mamba.py", line 853, in _wrapped_main
result = do_call(args, p)
File "/home/projects/13001702/tools/bcbio/v1.2.4/anaconda/lib/python3.7/site-packages/mamba/mamba.py", line 741, in do_call
exit_code = create(args, parser)
File "/home/projects/13001702/tools/bcbio/v1.2.4/anaconda/lib/python3.7/site-packages/mamba/mamba.py", line 620, in create
return install(args, parser, "create")
File "/home/projects/13001702/tools/bcbio/v1.2.4/anaconda/lib/python3.7/site-packages/mamba/mamba.py", line 570, in install
downloaded = transaction.prompt(PackageCacheData.first_writable().pkgs_dir, repos)
RuntimeError: Download error (28) Timeout was reached [https://repo.anaconda.com/pkgs/main/linux-64/python-3.6.12-hcff3b4d_2.conda]
Could you try from a less busy server, or outside peak usage hours? It may also be worth asking your network admins for advice.
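If the timeouts are sporadic, simply rerunning the command a few times with a pause in between may be enough. A generic sketch of that: `retry` is a made-up wrapper, not a bcbio or conda feature, and it assumes that rerunning the installer after a failed download is safe, which I have not verified.

```bash
#!/bin/bash
set -eu -o pipefail

# retry MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND until it succeeds or MAX_TRIES attempts have failed,
# sleeping between attempts and doubling the delay each time.
retry() {
    local max=$1 delay=$2
    shift 2
    local attempt=1
    until "$@"; do
        if (( attempt >= max )); then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$(( attempt + 1 ))
        delay=$(( delay * 2 ))
    done
}

# Example: up to 5 attempts, starting with a 60 second pause
#   retry 5 60 bcbio_nextgen.py upgrade -u skip --genomes hg38
```

This does not fix an unreliable connection, it only saves restarting by hand after each timeout.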
bcbio.error.install.nopath.txt Hi, I've removed any references to the old bcbio installation, but I'm still getting the same issue (please see attached file). Thanks.
My installation command: python bcbio_nextgen_install.py /home/projects/13001702/tools/bcbio/v1.2.4 --tooldir=/home/projects/13001702/tools/bcbio/v1.2.4 --nodata
Hi!
It is another network timeout this time, so it seems you have sporadic connection issues:
RuntimeError: Download error (28) Timeout was reached [https://conda.anaconda.org/bioconda/linux-64/bioconductor-bubbletree-2.6.0-0.tar.bz2]
Try talking to your network's sysadmins. SN
Closing for now! Let us know if you are still having installation issues!
Greetings,
I've attempted to install bcbio quite a few times, but it always ends up failing when trying to install the genome files. Below is the error I keep getting.
I'm wondering if bcbio will work if I take an existing genomes directory containing the genomes, and modify the .loc files in the galaxy sub-directory? Thank you.
The basic installation works: python ./bcbio_nextgen_install.py /mnt/projects/XXX/wgs/tools/bcbio/1.2.3 --tooldir=/mnt/projects/XXX/wgs/tools/bcbio/1.2.3/tools --aligners bwa
But the error appears when I try to install the genomes: bcbio_nextgen.py upgrade -u skip --genomes GRCh37 --genomes hg38 --genomes mm10
2020-08-18 08:25:19 (246 KB/s) - ‘variation/dbsnp-153-orig.vcf.gz.tbi’ saved [2998587/2998587]
Writing to .
Could not read: https://gist.githubusercontent.com/matthdsm/f833aedd2d67e28013ff1d171c70f4ee/raw/442a45ed3ddc6e85c66c5e58e0fa78e16a0821c8/refseq2ucsc.tsv
[E::bcf_hdr_read] Input is not detected as bcf or vcf format
Could not read VCF/BCF headers from -
Cleaning
tbx_index_build failed: variation/dbsnp-153.vcf.gz
Traceback (most recent call last):
File "/mnt/projects/XXX/wgs/tools/bcbio/1.2.3/anaconda/bin/bcbio_nextgen.py", line 228, in
install.upgrade_bcbio(kwargs["args"])
File "/mnt/projects/XXX/wgs/tools/bcbio/1.2.3/anaconda/lib/python3.6/site-packages/bcbio/install.py", line 107, in upgrade_bcbio
upgrade_bcbio_data(args, REMOTES)
File "/mnt/projects/XXX/wgs/tools/bcbio/1.2.3/anaconda/lib/python3.6/site-packages/bcbio/install.py", line 377, in upgrade_bcbio_data
args.cores, ["ggd", "s3", "raw"])
File "/home/XXX/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/genomes.py", line 354, in install_data_local
_prep_genomes(env, genomes, genome_indexes, ready_approaches, data_filedir)
File "/home/XXX/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/genomes.py", line 480, in _prep_genomes
retrieve_fn(env, manager, gid, idx)
File "/home/XXX/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/genomes.py", line 875, in _install_with_ggd
ggd.install_recipe(os.getcwd(), env.system_install, recipe_file, gid)
File "/home/XXX/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/ggd.py", line 30, in install_recipe
recipe["recipe"]["full"]["recipe_type"], system_install)
File "/home/XXX/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/ggd.py", line 62, in _run_recipe
subprocess.check_output(["bash", run_file])
File "/mnt/projects/XXX/wgs/tools/bcbio/1.2.3/anaconda/lib/python3.6/subprocess.py", line 356, in check_output
**kwargs).stdout
File "/mnt/projects/XXX/wgs/tools/bcbio/1.2.3/anaconda/lib/python3.6/subprocess.py", line 438, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['bash', '/mnt/projects/XXX/wgs/tools/bcbio/1.2.3/genomes/Hsapiens/hg38/txtmp/ggd-run.sh']' returned non-zero exit status 1.
[XXX@n111 ~]$ bcbio_nextgen.py upgrade -u skip --genomes GRCh37 --genomes hg38 --genomes mm10