Error using #246

noellenoyes commented 8 years ago

I am running into errors when using the script to test my installation of iMetAMOS. The errors read:

Error, provided contig file does not exist: project dir /home/wgs/iMetAMOS2.5/metAMOS-1-2.5rc3/Test/test_ima does not exist!

I have successfully installed and tested the metAMOS installation, but none of the other optional workflows. Note that the installation of iMetAMOS seemed to go smoothly after I followed the instructions in Issue #242.

Here is stdout when I run

Warning: Celera Assembler is not found, some functionality will not be available
Warning: BLASR is not found, some functionality will not be available
Warning: Newbler is not found, some functionality will not be available
Warning: MetaGeneMark is not found, some functionality will not be available
Warning: SignalP+ is not found, some functionality will not be available
Warning: metaphylerClassify is not found, some functionality will not be available
Warning: PhyloSift was not found, will not be available

Warning: REAPR is not found, some functionality will not be available
Warning: FRCbam is not found, some functionality will not be available
Warning: MPI is not available, some functionality may not be available
Error, provided contig file does not exist: project dir /home/wgs/iMetAMOS2.5/metAMOS-1-2.5rc3/Test/test_ima does not exist!
usage: runPipeline [options] -d projectdir
  -h = : print help [this message]
  -j = : just output all of the programs and citations then exit (default = NO)
  -v = : verbose output? (default = NO)
  -d = : directory created by initPipeline (REQUIRED)

[options]: [pipeline_opts] [misc_opts]

[pipeline_opts]: options that affect the pipeline execution
Pipeline consists of the following steps:
  Preprocess, Assemble, FindORFS, MapReads, Abundance, Annotate, FunctionalAnnotation, Scaffold, Propagate, Classify, Postprocess
Each of these steps can be referred to by the following options:
  -f = : force this step to be run (default = NONE)
  -s = : start at this step in the pipeline (default = Preprocess)
  -e = : end at this step in the pipeline (default = Postprocess)
  -n = : step to skip in pipeline (default=NONE)

For each step you can fine-tune the execution as follows
[Preprocess]
  -t = : filter input reads? (default = metamos, supported = none,metamos,eautils,pbcr)
  -q = : produce FastQC quality report for reads with quality information (fastq or sff)? (default = NO)
[Assemble]
  -a = : genome assembler to use (default = soapdenovo, supported = newbler,soapdenovo,soapdenovo2,ca,velvet,velvet-sc,metavelvet,metaidba,sparseassembler,minimus,abyss,edena,spades,mira,sga,idba-ud,ray,masurca)
  -k = : k-mer size to be used for assembly (default = 31)
  -o = >: min overlap length
[MapReads]
  -m = : read mapper to use? (default = bowtie, supported = bowtie,bowtie2)
  -i = : save bowtie (i)ndex? (default = NO)
  -b = : create library specific per bp coverage of assembled contigs (default = NO)
[FindORFS]
  -g = : gene caller to use (default = fraggenescan, supported = fraggenescan,metagenemark,glimmermg)
  -l = : min contig length to use for ORF call (default = 300)
  -x = >: min contig coverage to use for ORF call (default = 3X)
[Validate]
  -X = : comma-separated list of validators to run on the assembly. (default = lap, supported = reapr,orf,lap,ale,quast,frcbam,freebayes,cgal,n50)
  -S = : comma-separated list of scores to use to select the winning assembly. By default, all validation tools specified by -X will be run. For each score, an optional weight can be specified as SCORE:WEIGHT. For example, LAP:1,CGAL:2 (supported = all,lap,ale,cgal,snp,frcbam,orf,reapr,n50)
[Annotate]
  -c = : classifier to use for annotation (default = kraken, supported = fcp,phylosift,phmmer,blast,metaphyler,phymm,kraken
  -u = : annotate unassembled reads? (default = NO)
[Classify]
  -z = : taxonomic level to categorize at (default = class)

[misc_opts]: Miscellaneous options
  -B = : blast DBs not available (default = NO)
  -r = : retain the AMOS bank? (default = NO)
  -p = : number of threads to use (be greedy!) (default=1)
  -4 = : 454 data? (default = NO)
  -L = : generate local Krona plots. Local Krona plots can only be viewed on the machine they are generated on but will work on a system with no internet connection (default = NO)

Perhaps this is linked to the ftp problem referenced in Issue #242? Any help you can provide would be much appreciated!

skoren commented 8 years ago

NCBI has completely changed their FTP site, invalidating that link. If you change test_ima.ini from to it should work.

noellenoyes commented 8 years ago

Thanks for the quick response, that seemed to fix the original error. However, I think that changing the file name is causing an additional error:

Oops, MetAMOS finished with errors! see text in red above for details.
Traceback (most recent call last):
  File "../runPipeline", line 985, in
    verbose = 1)
  File "/home/wgs/iMetAMOS2.5/metAMOS-1-2.5rc3/Utilities/ruffus/", line 2965, in pipeline_run
    raise job_errors
RethrownJobError:

Exception #1
  'exceptions.ValueError(too many values to unpack)' raised in ...
   Task = def assemble.Assemble(...):
   Job  = [ -> GCA_000010365.1_ASM1036v1_genomic.fna.asm.contig]

Traceback (most recent call last):
  File "/home/wgs/iMetAMOS2.5/metAMOS-1-2.5rc3/Utilities/ruffus/", line 625, in run_pooled_job_without_exceptions
    return_value =  job_wrapper(param, user_defined_work_func, register_cleanup, touch_files_only)
  File "/home/wgs/iMetAMOS2.5/metAMOS-1-2.5rc3/Utilities/ruffus/", line 491, in job_wrapper_io_files
    ret_val = user_defined_work_func(*param)
  File "/home/wgs/iMetAMOS2.5/metAMOS-1-2.5rc3/src/", line 326, in Assemble
    (asmName, kmer) = asmName.split(".")
ValueError: too many values to unpack

Here is the entire output of the run:

Warning: Celera Assembler is not found, some functionality will not be available
Warning: BLASR is not found, some functionality will not be available
Warning: Newbler is not found, some functionality will not be available
Warning: MetaGeneMark is not found, some functionality will not be available
Warning: SignalP+ is not found, some functionality will not be available
Warning: metaphylerClassify is not found, some functionality will not be available
Warning: PhyloSift was not found, will not be available

Warning: REAPR is not found, some functionality will not be available
Warning: FRCbam is not found, some functionality will not be available
Warning: MPI is not available, some functionality may not be available
Project directory already exists, please specify another
Alternatively, use runPipeline to run an existing project

Starting metAMOS pipeline
Found pysam in /home/wgs/iMetAMOS2.5/metAMOS-1-2.5rc3/Utilities/python/lib/python/pysam-0.6-py2.7-linux-x86_64.egg/pysam/__init__.pyc
Found psutil in /home/wgs/iMetAMOS2.5/metAMOS-1-2.5rc3/Utilities/python/lib/python/psutil-0.6.1-py2.7-linux-x86_64.egg/psutil/__init__.pyc
Error: cannot find BLAST DB directory, expected it in /home/wgs/iMetAMOS2.5/metAMOS-1-2.5rc3/Utilities/DB/. Disabling blastdb dependent programs
Warning: Celera Assembler is not found, some functionality will not be available
Warning: BLASR is not found, some functionality will not be available
Warning: Newbler is not found, some functionality will not be available
Warning: MetaGeneMark is not found, some functionality will not be available
Warning: SignalP+ is not found, some functionality will not be available
Warning: metaphylerClassify is not found, some functionality will not be available
Warning: PhyloSift was not found, will not be available

Warning: REAPR is not found, some functionality will not be available
Warning: FRCbam is not found, some functionality will not be available
Warning: MPI is not available, some functionality may not be available
[Available RAM: 527 GB] ok
[Available CPUs: 64] ok

Tasks which will be run:

Task = assemble.Assemble
Task = assemble.CheckAsmResults
Task = assemble.SplitMappers
Task = mapreads.MapReads
Task = mapreads.CheckMapResults
Task = mapreads.SplitForORFs
Task = findorfs.FindORFS
Task = validate.Validate
Task = findreps.FindRepeats
Task = annotate.Annotate
Task = fannotate.FunctionalAnnotation
Task = scaffold.Scaffold
Task = findscforfs.FindScaffoldORFS
Task = abundance.Abundance
Task = propagate.Propagate
Task = classify.Classify
Task = postprocess.Postprocess

metAMOS configuration summary:
metAMOS Version: v1.5rc3 "Praline Brownie" workflows: core,imetamos
Time and Date: 2016-07-12
Working directory: /home/wgs/iMetAMOS2.5/metAMOS-1-2.5rc3/Test/test_ima
Prefix: proba
K-Mer: 31
Threads: 8
Taxonomic level: phylum
Verbose: True
Steps to skip: FindScaffoldORFS, Scaffold, Propagate, FindRepeats
Steps to force: FunctionalAnnotation, Postprocess

Any advice you can provide would be much appreciated, thanks!

skoren commented 8 years ago

Yes, there was a bug handling gz input assemblies. It should be fixed if you pull 1.5rc3 from the repo again.
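
For anyone hitting the same traceback, here is a minimal sketch of why `(asmName, kmer) = asmName.split(".")` fails when the name itself contains dots, as the downloaded `GCA_000010365.1_ASM1036v1_genomic.fna` file name does. This is not the fix that was committed to the repo, which is not shown in this thread. The helper `split_name_and_kmer` and the `<name>.<kmer>` strings below are hypothetical, chosen only to illustrate the failure mode and one possible way around it (splitting on the last dot only).

```python
def split_name_and_kmer(asm_name):
    """Split a '<name>.<kmer>' string on the LAST dot only.

    Hypothetical helper for illustration; it is not part of metAMOS.
    Dots inside the name part are preserved.
    """
    name, _sep, kmer = asm_name.rpartition(".")
    if not name:
        raise ValueError("expected '<name>.<kmer>', got %r" % asm_name)
    return name, kmer


# A simple assembler-style name splits cleanly either way.
simple = "soapdenovo.31"  # assumed format, for illustration only
print(split_name_and_kmer(simple))

# A name derived from the NCBI file contains several dots.
dotted = "GCA_000010365.1_ASM1036v1_genomic.fna.31"  # assumed, for illustration

try:
    # Same pattern as the line in the traceback: split(".") yields more than
    # two pieces here, so unpacking into two variables raises ValueError.
    (name, kmer) = dotted.split(".")
except ValueError as err:
    print("plain split failed: %s" % err)

# Splitting on the last dot keeps the dotted name intact.
print(split_name_and_kmer(dotted))
```

If you are on an older checkout, pulling the updated 1.5rc3 as suggested above is the supported route; the sketch is only meant to explain the error message.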