bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License

Installing latest stable version fails with --nodata #3223

Closed amizeranschi closed 4 years ago

amizeranschi commented 4 years ago

I'm trying to install the latest stable version, but it fails with an error that seems related to badly formatted YAML. Here's what I'm running:

bcbio_path=/export/home/ncit/external/a.mizeranschi/bcbio_nextgen
mkdir ${bcbio_path}
cd ${bcbio_path}
wget https://raw.github.com/chapmanb/bcbio-nextgen/master/scripts/bcbio_nextgen_install.py
python bcbio_nextgen_install.py ${bcbio_path} -u stable --tooldir=${bcbio_path}/tools --nodata --isolate

and this is the result:

Downloading and Extracting Packages
python-3.6.10        | 34.1 MB   | ########## | 100%
certifi-2020.4.5.1   | 151 KB    | ########## | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
Installing data and third party dependencies
Upgrading bcbio
Traceback (most recent call last):
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/bin/bcbio_nextgen.py", line 228, in <module>
    install.upgrade_bcbio(kwargs["args"])
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/bcbio/install.py", line 70, in upgrade_bcbio
    _update_conda_packages()
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/bcbio/install.py", line 285, in _update_conda_packages
    channels = _get_conda_channels(conda_bin)
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/bcbio/install.py", line 268, in _get_conda_channels
    config = yaml.safe_load(subprocess.check_output([conda_bin, "config", "--show"]))
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/yaml/__init__.py", line 162, in safe_load
    return load(stream, SafeLoader)
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/yaml/__init__.py", line 114, in load
    return loader.get_single_data()
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/yaml/constructor.py", line 49, in get_single_data
    node = self.get_single_node()
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/yaml/composer.py", line 36, in get_single_node
    document = self.compose_document()
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/yaml/composer.py", line 58, in compose_document
    self.get_event()
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/yaml/parser.py", line 118, in get_event
    self.current_event = self.state()
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/yaml/parser.py", line 193, in parse_document_end
    token = self.peek_token()
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/yaml/scanner.py", line 129, in peek_token
    self.fetch_more_tokens()
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/yaml/scanner.py", line 223, in fetch_more_tokens
    return self.fetch_value()
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/yaml/scanner.py", line 579, in fetch_value
    self.get_mark())
yaml.scanner.ScannerError: mapping values are not allowed here
  in "<byte string>", line 19, column 15:
            GitHub:  https://github.com/QuantStack ... 
                  ^
Traceback (most recent call last):
  File "bcbio_nextgen_install.py", line 290, in <module>
    main(parser.parse_args(), sys.argv[1:])
  File "bcbio_nextgen_install.py", line 52, in main
    subprocess.check_call([bcbio, "upgrade"] + _clean_args(sys_argv, args))
  File "/usr/lib64/python3.6/subprocess.py", line 311, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/bin/bcbio_nextgen.py', 'upgrade', '-u', 'stable', '--tooldir=/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/tools', '--isolate']' returned non-zero exit status 1.
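
The ScannerError above reports a position (line 19, column 15) inside the text that `conda config --show` printed, not inside a file on disk. As a minimal sketch for looking at that position in captured output: the helper below prints a given line with a caret under a given column. The function name and the sample text are hypothetical, and both numbers are assumed to be 1-based, as they appear in the error message.

```python
def mark_position(text, line, column):
    """Return the reported line followed by a caret under the reported column.

    Both numbers are treated as 1-based (an assumption about how the
    error message counts them).
    """
    lines = text.splitlines()
    if not 1 <= line <= len(lines):
        raise ValueError("text has %d lines, no line %d" % (len(lines), line))
    return lines[line - 1] + "\n" + " " * (column - 1) + "^"


# Hypothetical stand-in for captured command output: YAML-looking lines
# followed by a banner-style line. The URL is a placeholder, not the real one.
sample = "channels:\n  - conda-forge\n        GitHub:  https://example.invalid\n"
print(mark_position(sample, 3, 15))
```

With real output, the text could come from `subprocess.check_output([conda_bin, "config", "--show"]).decode()`, which is the same call shown in the traceback, and the line and column would be the ones from the error message.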

naumenko-sa commented 4 years ago

Hi @amizeranschi !

Sorry about the issue. It seems we fixed it recently in the development version: https://github.com/bcbio/bcbio-nextgen/issues/3180

Could you please try with

python3 bcbio_nextgen_install.py \
/bcbio -u development \
--tooldir=/bcbio/tools \
--nodata \
--isolate

This command

python bcbio_nextgen_install.py /bcbio --tooldir=/bcbio/tools --nodata --isolate

also works and installs the 1.2.3 stable release.

Sergey

amizeranschi commented 4 years ago

Yes, the development version made it through. Thanks for mentioning it. Would be useful to get a new stable release out, though, if the current one has that issue.

However, I ran into another problem now. After installing bcbio with --nodata, I added a custom genome via bcbio_setup_genome.py. This seemed to work fine, but the bcbio_nextgen/galaxy/tool-data directory did not get created. To be more precise, the bcbio_nextgen/galaxy directory only contains the file bcbio_system.yaml.

This renders the custom genome unusable, as bcbio complains about missing loc files:

[2020-05-17T16:21Z] Using input YAML configuration: /export/home/ncit/external/a.mizeranschi/automated-VC-test-BRF/testingVC/config/testingVC.yaml
[2020-05-17T16:21Z] Checking sample YAML configuration: /export/home/ncit/external/a.mizeranschi/automated-VC-test-BRF/testingVC/config/testingVC.yaml
Running bcbio version: 1.2.3
global config: /export/home/ncit/external/a.mizeranschi/automated-VC-test-BRF/testingVC/work/bcbio_system.yaml
run info config: /export/home/ncit/external/a.mizeranschi/automated-VC-test-BRF/testingVC/config/testingVC.yaml
Traceback (most recent call last):
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/tools/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/tools/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 128, in variant2pipeline
    [x[0]["description"] for x in samples]]])
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 1029, in __call__
    if self.dispatch_one_batch(iterator):
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 847, in dispatch_one_batch
    self._dispatch(tasks)
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 765, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 206, in apply_async
    result = ImmediateResult(func)
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 570, in __init__
    self.results = batch()
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 253, in __call__
    for func, args, kwargs in self.items]
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 253, in <listcomp>
    for func, args, kwargs in self.items]
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/bcbio/utils.py", line 55, in wrapper
    return f(*args, **kwargs)
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 459, in organize_samples
    return run_info.organize(*args)
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/bcbio/pipeline/run_info.py", line 81, in organize
    item = add_reference_resources(item, remote_retriever)
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/bcbio/pipeline/run_info.py", line 177, in add_reference_resources
    data["dirs"]["galaxy"], data)
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/bcbio/pipeline/genome.py", line 233, in get_refs
    galaxy_config, data)
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/bcbio/pipeline/genome.py", line 180, in _get_ref_from_galaxy_loc
    (genome_build, os.path.normpath(loc_file)))
ValueError: Did not find genome build sacCer3_BRF in bcbio installation: /export/home/ncit/external/a.mizeranschi/bcbio_nextgen/galaxy/tool-data/sam_fa_indices.loc

naumenko-sa commented 4 years ago

Glad it worked!

I think tool-data was not created because you have not installed any data. Can you try installing with --genomes hg38 --aligners bwa and then installing the custom genome? Or just create the loc files manually: https://bcbio-nextgen.readthedocs.io/en/latest/contents/configuration.html#reference-genome-files
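
Either way, it can help to check whether the loc file named in the ValueError exists and mentions the build before rerunning the pipeline. The following is only a rough sketch: it assumes loc files are tab-separated text with `#` comment lines and that the build name appears as one of the fields. The helper name, the entry layout in the demo, and the demo path are hypothetical; the linked documentation describes the actual format.

```python
import os
import tempfile


def loc_has_build(loc_file, build):
    """Return True if any non-comment line has `build` as a tab-separated field."""
    if not os.path.isfile(loc_file):
        return False  # covers the case where tool-data was never created
    with open(loc_file) as handle:
        for line in handle:
            if not line.strip() or line.startswith("#"):
                continue  # skip blank lines and comments
            if build in [field.strip() for field in line.split("\t")]:
                return True
    return False


# Demo on a throwaway file; the entry layout and the path are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    loc = os.path.join(tmp, "sam_fa_indices.loc")
    print(loc_has_build(loc, "sacCer3_BRF"))  # file does not exist yet
    with open(loc, "w") as out:
        out.write("index\tsacCer3_BRF\t/path/to/sacCer3_BRF.fa\n")
    print(loc_has_build(loc, "sacCer3_BRF"))  # entry is now present
```

Against a real installation, the first argument would be the `galaxy/tool-data/sam_fa_indices.loc` path printed in the error message.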

amizeranschi commented 4 years ago

OK, I tried installing the hg38 genome. This crashed due to a failed download (http://www.cs.jhu.edu/~genomics/GeneSplicer/GeneSplicer.tar.gz). It looks like that URL isn't valid anymore and it caused an error with a GGD recipe (hg38 genesplicer 2004.04.03).

--2020-05-18 08:18:21--  http://www.cs.jhu.edu/~genomics/GeneSplicer/GeneSplicer.tar.gz
Resolving www.cs.jhu.edu (www.cs.jhu.edu)... 128.220.13.76
Connecting to www.cs.jhu.edu (www.cs.jhu.edu)|128.220.13.76|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2020-05-18 08:18:21 ERROR 404: Not Found.

Upgrading bcbio
Detected 1.2.3 as latest version of bcbio-nextgen on bioconda.
bcbio version 1.2.3 is newer than the conda version 1.2.3, skipping upgrade from conda
Upgrading bcbio-nextgen to latest development version
Upgrade of bcbio-nextgen development code complete.
Upgrading third party tools to latest versions
Reading packages from /export/home/ncit/external/a.mizeranschi/bcbio_nextgen/tmpbcbio-install/cloudbiolinux/contrib/flavor/ngs_pipeline_minimal/packages-conda.yaml
Creating conda environment: python3
Creating conda environment: samtools0
Creating conda environment: dv
Creating conda environment: python2
Creating conda environment: r36
Creating conda environment: htslib1.10
Checking for problematic or migrated packages in default environment
Initalling initial set of packages for default environment with mamba
# Installing into conda environment default: age-metasv, arriba, ataqv, bamtools=2.4.0, bamutil, bbmap, bcbio-prioritize, bcbio-variation, bcbio-variation-recall, bcftools, bedops, bedtools=2.27.1, bio-vcf, biobambam, bowtie, bowtie2, break-point-inspector, bwa, bwakit, cage, cancerit-allelecount, chipseq-greylist, cnvkit, coincbc, cramtools, cufflinks, cyvcf2, deeptools, delly, duphold, ensembl-vep=99.*, express, extract-sv-reads, fastp, fastqc>=0.11.8=1, fgbio, freebayes=1.1.0.46, gatk, gatk4, geneimpacts, genesplicer, gffcompare, goleft, grabix, gridss, gsort, gvcfgenotyper, h5py, hmftools-amber, hmftools-cobalt, hmftools-purple, hmmlearn, hts-nim-tools, htslib, impute2, kallisto>=0.43.1, kraken, ldc>=1.13.0, lofreq, macs2, maxentscan, mbuffer, minimap2, mintmap, mirdeep2=2.0.0.7, mirtop, moreutils, multiqc, multiqc-bcbio, ngs-disambiguate, novoalign, octopus>=0.5.1b, oncofuse, optitype>=1.3.4, parallel, pbgzip, peddy, perl-sanger-cgp-battenberg, picard, pindel, pizzly, pyloh, pysam>=0.14.0, pythonpy, qsignature, qualimap, rapmap, razers3=3.5.0, rtg-tools, sailfish, salmon, sambamba, samblaster, samtools, scalpel, seq2c<2016, seqbuster, seqcluster, seqtk, sickle-trim, simple_sv_annotation, singlecell-barcodes, snap-aligner=1.0dev.97, snpeff=4.3.1t, solvebio, spades, staden_io_lib, star=2.6.1d, stringtie, subread, survivor, tdrmapper, tophat-recondition, trim-galore=0.6.2, ucsc-bedgraphtobigwig, ucsc-bedtobigbed, ucsc-bigbedinfo, ucsc-bigbedsummary, ucsc-bigbedtobed, ucsc-bigwiginfo, ucsc-bigwigsummary, ucsc-bigwigtobedgraph, ucsc-bigwigtowig, ucsc-fatotwobit, ucsc-gtftogenepred, ucsc-liftover, ucsc-wigtobigwig, umis, vardict, vardict-java, variantbam, varscan, vcfanno, vcflib, verifybamid2, viennarna, vqsr_cnn, vt, wham, anaconda-client, awscli, bzip2, ncurses, nodejs, p7zip, readline, s3gof3r, xz, perl-app-cpanminus, perl-archive-extract, perl-archive-zip, perl-bio-db-sam, perl-cgi, perl-dbi, perl-encode-locale, perl-file-fetch, perl-file-sharedir, 
perl-file-sharedir-install, perl-ipc-system-simple, perl-lwp-protocol-https, perl-lwp-simple, perl-statistics-descriptive, perl-time-hires, perl-vcftools-vcf, bioconductor-annotate, bioconductor-apeglm, bioconductor-biocgenerics, bioconductor-biocinstaller, bioconductor-biocstyle, bioconductor-biostrings, bioconductor-biovizbase, bioconductor-bsgenome.hsapiens.ucsc.hg19, bioconductor-bsgenome.hsapiens.ucsc.hg38, bioconductor-bubbletree, bioconductor-cn.mops, bioconductor-copynumber, bioconductor-degreport, bioconductor-deseq2, bioconductor-dexseq, bioconductor-dnacopy, bioconductor-genomeinfodbdata, bioconductor-genomicranges, bioconductor-iranges, bioconductor-limma, bioconductor-rtracklayer, bioconductor-snpchip, bioconductor-titancna, bioconductor-vsn>=3.50.0, r-base, r-basejump=0.7.2, r-bcbiornaseq>=0.2.7, r-cghflasso, r-chbutils, r-devtools, r-dplyr, r-dt, r-ggdendro, r-ggplot2, r-ggrepel>=0.7, r-gplots, r-gsalib, r-knitr, r-pheatmap, r-plyr, r-pscbs, r-reshape, r-rmarkdown, r-rsqlite, r-sleuth, r-snow, r-stringi, r-viridis>=0.5, r-wasabi, r=3.5.1, xorg-libxt
# Installing into conda environment dv: deepvariant
# Installing into conda environment htslib1.10: mosdepth
# Installing into conda environment python2: bismark=0.22.1, cpat, cutadapt=1.16, dkfz-bias-filter, gemini, gvcf-regions, hap.py, hisat2, htseq=0.9.1, lumpy-sv, manta, metasv, mirge, phylowgs, platypus-variant, sentieon, smcounter2, smoove, strelka, svtools, svtyper, theta2, tophat, vawk, vcf2db
# Installing into conda environment python3: atropos, crossmap
# Installing into conda environment r36: bioconductor-purecn>=1.16.0
# Installing into conda environment samtools0: ericscript
Creating manifest of installed packages in /export/home/ncit/external/a.mizeranschi/bcbio_nextgen/manifest
Third party tools upgrade complete.
Upgrading bcbio-nextgen data files
List of genomes to get (from the config file at '{'genomes': [{'dbkey': 'hg38', 'name': 'Human (hg38) full', 'indexes': ['seq', 'twobit', 'bwa', 'hisat2'], 'annotations': ['ccds', 'capture_regions', 'coverage', 'prioritize', 'dbsnp', 'hapmap_snps', '1000g_omni_snps', 'ACMG56_genes', '1000g_snps', 'mills_indels', '1000g_indels', 'clinvar', 'qsignature', 'genesplicer', 'effects_transcripts', 'varpon', 'vcfanno', 'viral', 'transcripts', 'RADAR', 'rmsk', 'salmon-decoys', 'fusion-blacklist', 'mirbase'], 'validation': ['giab-NA12878', 'giab-NA24385', 'giab-NA24631', 'platinum-genome-NA12878', 'giab-NA12878-remap', 'giab-NA12878-crossmap', 'dream-syn4-crossmap', 'dream-syn3-crossmap', 'giab-NA12878-NA24385-somatic', 'giab-NA24143', 'giab-NA24149', 'giab-NA24694', 'giab-NA24695']}], 'genome_indexes': ['bwa', 'bowtie2', 'hisat2', 'rtg'], 'install_liftover': False, 'install_uniref': False}'): Human (hg38) full
Running GGD recipe: hg38 seq 1000g-20150219_1
Running GGD recipe: hg38 bwa 1000g-20150219
Moving on to next genome prep method after trying ggd
GGD recipe not available for hg38 bowtie2
Downloading genome from s3: hg38 bowtie2
Moving on to next genome prep method after trying s3
No pre-computed indices for hg38 bowtie2
Preparing genome hg38 with index bowtie2
Running GGD recipe: hg38 hisat2 12-07-2015
Moving on to next genome prep method after trying ggd
GGD recipe not available for hg38 rtg
Downloading genome from s3: hg38 rtg
Moving on to next genome prep method after trying s3
No pre-computed indices for hg38 rtg
Preparing genome hg38 with index rtg
Running GGD recipe: hg38 ccds r20
Running GGD recipe: hg38 capture_regions 20161202
Running GGD recipe: hg38 coverage 2018-10-16
Running GGD recipe: hg38 prioritize 20181227
Running GGD recipe: hg38 dbsnp 153-20180725
Running GGD recipe: hg38 hapmap_snps 20160105
Running GGD recipe: hg38 1000g_omni_snps 20160105
Running GGD recipe: hg38 ACMG56_genes 20160726
Running GGD recipe: hg38 1000g_snps 20160105
Running GGD recipe: hg38 mills_indels 20160105
Running GGD recipe: hg38 1000g_indels 2.8_hg38_20150522
Running GGD recipe: hg38 clinvar 20190513
Running GGD recipe: hg38 qsignature 20160526
Running GGD recipe: hg38 genesplicer 2004.04.03
Traceback (most recent call last):
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/bin/bcbio_nextgen.py", line 228, in <module>
    install.upgrade_bcbio(kwargs["args"])
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/bcbio/install.py", line 107, in upgrade_bcbio
    upgrade_bcbio_data(args, REMOTES)
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/site-packages/bcbio/install.py", line 359, in upgrade_bcbio_data
    args.cores, ["ggd", "s3", "raw"])
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/genomes.py", line 354, in install_data_local
    _prep_genomes(env, genomes, genome_indexes, ready_approaches, data_filedir)
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/genomes.py", line 480, in _prep_genomes
    retrieve_fn(env, manager, gid, idx)
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/genomes.py", line 875, in _install_with_ggd
    ggd.install_recipe(os.getcwd(), env.system_install, recipe_file, gid)
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/ggd.py", line 30, in install_recipe
    recipe["recipe"]["full"]["recipe_type"], system_install)
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/ggd.py", line 62, in _run_recipe
    subprocess.check_output(["bash", run_file])
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['bash', '/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/genomes/Hsapiens/hg38/txtmp/ggd-run.sh']' returned non-zero exit status 8.
Checking required dependencies
Installing isolated base python installation
Installing mamba
Installing conda-build
Installing bcbio-nextgen
Installing data and third party dependencies
Traceback (most recent call last):
  File "bcbio_nextgen_install.py", line 290, in <module>
    main(parser.parse_args(), sys.argv[1:])
  File "bcbio_nextgen_install.py", line 52, in main
    subprocess.check_call([bcbio, "upgrade"] + _clean_args(sys_argv, args))
  File "/usr/lib64/python3.6/subprocess.py", line 311, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/bin/bcbio_nextgen.py', 'upgrade', '-u', 'development', '--tooldir=/export/home/ncit/external/a.mizeranschi/bcbio_nextgen/tools', '--genomes', 'hg38', '--datatarget', 'variation', '--datatarget', 'rnaseq', '--datatarget', 'smallrna', '--aligners', 'bwa', '--aligners', 'bowtie2', '--aligners', 'hisat2', '--isolate', '--cores', '2', '--data']' returned non-zero exit status 1.

amizeranschi commented 4 years ago

I also found an earlier thread (https://github.com/bcbio/bcbio-nextgen/issues/3165) where @chapmanb suggested that these errors can sometimes happen intermittently and retrying the install/upgrade can get things running.

I retried this a couple of times and each time it crashed at that particular step. I'm guessing the URL http://www.cs.jhu.edu/~genomics/GeneSplicer/GeneSplicer.tar.gz isn't working anymore and hg38 is uninstallable as a result.
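
The exit status 8 reported for ggd-run.sh is consistent with that guess, assuming the script passes wget's own exit status through (the thread does not show the script, so this is an assumption). A small lookup of the exit statuses listed in the GNU Wget manual, with a hypothetical helper name:

```python
# Exit statuses as documented in the GNU Wget manual.
WGET_EXIT_STATUS = {
    0: "No problems occurred",
    1: "Generic error code",
    2: "Parse error",
    3: "File I/O error",
    4: "Network failure",
    5: "SSL verification failure",
    6: "Username/password authentication failure",
    7: "Protocol errors",
    8: "Server issued an error response",
}


def describe_wget_status(code):
    """Map a wget exit status to its documented meaning."""
    return WGET_EXIT_STATUS.get(code, "Unknown status (not listed in the wget manual)")


print(describe_wget_status(8))
```

Status 8 covers server error responses such as the 404 in the log above, which a plain retry will not fix as long as the file stays missing at that URL.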

naumenko-sa commented 4 years ago

Thanks, genesplicer is an easy fix: they moved to an FTP server. See https://github.com/chapmanb/cloudbiolinux/blob/master/ggd-recipes/hg38/genesplicer.yaml

Please try again.

S.

amizeranschi commented 4 years ago

Thanks for the fix. That got things moving forward, but the snpEff database for hg38 was taking forever to download (estimated 15 hours for 1.5 GB), so I canceled it. I don't actually need the hg38 data anyway.

Instead, I installed the sacCer3 yeast genome, which I remembered was on the order of a couple hundred MB for all the data. Much more manageable, when its only purpose is to get the tool-data directory created.

Things worked fine and I could then install and use the custom genome. Thanks a lot for your help!