marbl / metAMOS

A metagenomic and isolate assembly and analysis pipeline built with AMOS
http://marbl.github.io/metAMOS

RunPipeline fails because "project dir does not exist" #259

Open KDSR opened 7 years ago

KDSR commented 7 years ago

The runPipeline command keeps failing with "project dir does not exist", even though initPipeline completes and reports that the directory was successfully created. I have used initPipeline and runPipeline on the same server in the past, so I'm not sure what is going on now.

An example of the initPipeline:

stevens.txbiomedgenetics.org% initPipeline -q -1 MicrobiomeDnaSeq_S17_L004_R1_001.fastq.gz -2 MicrobiomeDnaSeq_S17_L004_R2_001.fastq.gz -d 20170201_microbiome_test_S17 -i 200:800
Warning: BLASR is not found, some functionality will not be available
Warning: Newbler is not found, some functionality will not be available
Warning: MetaGeneMark is not found, some functionality will not be available
Warning: SignalP+ is not found, some functionality will not be available
Warning: metaphylerClassify is not found, some functionality will not be available
Warning: PHmmer is not found, some functionality will not be available
Warning: PhyloSift was not found, will not be available

Warning: EA-UTILS is not found, some functionality will not be available
Warning: ALE is not found, some functionality will not be available
Warning: CGAL is not found, some functionality will not be available
Warning: REAPR is not found, some functionality will not be available
Warning: FRCbam is not found, some functionality will not be available
Warning: FreeBayes is not found, some functionality will not be available
Warning: QUAST is not found, some functionality will not be available
Warning: MPI is not available, some functionality may not be available
Project dir /master/kreeves/metAMOS-1.5rc3/20170201_microbiome_test_S17 successfully created!
Use runPipeline.py to start Pipeline

An example of the runPipeline:

stevens.txbiomedgenetics.org% runPipeline –q –u –r –v –c kraken –a SOAPdenovo,soap,soap2,velvet,metavelvet,velvet-sc,spades,abyss,ray,edena,sga,masurca –t metamos –n Scaffold –f FunctionalAnnotation –z species –d 20170201_microbiome_test_S17
project dir does not exist!
usage: runPipeline [options] -d projectdir
-h = : print help [this message]
-j = : just output all of the programs and citations then exit (default = NO)
-v = : verbose output? (default = NO)
-d = : directory created by initPipeline (REQUIRED)

[options]: [pipeline_opts] [misc_opts]

[pipeline_opts]: options that affect the pipeline execution
Pipeline consists of the following steps:
Preprocess, Assemble, FindORFS, MapReads, Abundance, Annotate, FunctionalAnnotation, Scaffold, Propagate, Classify, Postprocess
Each of these steps can be referred to by the following options:
-f = : force this step to be run (default = NONE)
-s = : start at this step in the pipeline (default = Preprocess)
-e = : end at this step in the pipeline (default = Postprocess)
-n = : step to skip in pipeline (default=NONE)

For each step you can fine-tune the execution as follows
[Preprocess]
-t = : filter input reads? (default = metamos, supported = none,metamos,eautils,pbcr)
-q = : produce FastQC quality report for reads with quality information (fastq or sff)? (default = NO)
[Assemble]
-a = : genome assembler to use (default = soapdenovo, supported = newbler,soapdenovo,soapdenovo2,ca,velvet,velvet-sc,metavelvet,metaidba,sparseassembler,minimus,abyss,edena,spades,mira,sga,idba-ud,ray,masurca)
-k = : k-mer size to be used for assembly (default = 31)
-o = >: min overlap length
[MapReads]
-m = : read mapper to use? (default = bowtie, supported = bowtie,bowtie2)
-i = : save bowtie (i)ndex? (default = NO)
-b = : create library specific per bp coverage of assembled contigs (default = NO)
[FindORFS]
-g = : gene caller to use (default = fraggenescan, supported = fraggenescan,metagenemark,glimmermg)
-l = : min contig length to use for ORF call (default = 300)
-x = >: min contig coverage to use for ORF call (default = 3X)
[Validate]
-X = : comma-separated list of validators to run on the assembly. (default = lap, supported = reapr,orf,lap,ale,quast,frcbam,freebayes,cgal,n50)
-S = : comma-separated list of scores to use to select the winning assembly. By default, all validation tools specified by -X will be run. For each score, an optional weight can be specified as SCORE:WEIGHT. For example, LAP:1,CGAL:2 (supported = all,lap,ale,cgal,snp,frcbam,orf,reapr,n50)
[Annotate]
-c = : classifier to use for annotation (default = kraken, supported = fcp,phylosift,phmmer,blast,metaphyler,phymm,kraken
-u = : annotate unassembled reads? (default = NO)
[Classify]
-z = : taxonomic level to categorize at (default = class)

[misc_opts]: Miscellaneous options
-B = : blast DBs not available (default = NO)
-r = : retain the AMOS bank? (default = NO)
-p = : number of threads to use (be greedy!) (default=1)
-4 = : 454 data? (default = NO)
-L = : generate local Krona plots. Local Krona plots can only be viewed on the machine they are generated on but will work on a system with no internet connection (default = NO)

stevens.txbiomedgenetics.org% runPipeline –q –u –r –v –c kraken –t metamos –n Scaffold –f FunctionalAnnotation –z species –d 20170201_microbiome_test_S17
project dir does not exist!
[same usage message as above]

Any help or guidance will be appreciated!

Thanks, Kim
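
One detail in the commands as pasted: the runPipeline options appear with en dashes (–q, –d) while the initPipeline options use plain ASCII hyphens (-q, -d). That may only be an artifact of how the text was pasted or rendered, but if the shell really received those characters, a getopt-style parser would not treat –d as -d and the project directory would never be set. The sketch below shows the effect with Python's standard `getopt` module. The option string is a hypothetical subset picked for illustration, not runPipeline's actual one, and it is an assumption that runPipeline parses its arguments this way.

```python
import getopt

# Characters that commonly replace "-" when a command is pasted through a
# word processor or mail client with autocorrect enabled.
DASH_LOOKALIKES = {
    "\u2013": "en dash",
    "\u2014": "em dash",
    "\u2212": "minus sign",
}


def find_lookalike_dashes(argv):
    """Return (argument, character name) pairs for arguments that begin
    with a dash look-alike instead of an ASCII hyphen."""
    return [
        (arg, DASH_LOOKALIKES[arg[0]])
        for arg in argv
        if arg and arg[0] in DASH_LOOKALIKES
    ]


def parsed_project_dir(argv, optstring="qurvc:t:n:f:z:d:"):
    """Parse argv with getopt and return the value passed to -d, or None.

    optstring is a hypothetical subset used only for this sketch.
    """
    opts, _remaining = getopt.getopt(argv, optstring)
    return dict(opts).get("-d")


# Same options typed two ways: with en dashes and with ASCII hyphens.
pasted = "\u2013q \u2013u \u2013d 20170201_microbiome_test_S17".split()
retyped = "-q -u -d 20170201_microbiome_test_S17".split()

print(find_lookalike_dashes(pasted))   # lists the three en-dash arguments
print(parsed_project_dir(pasted))      # None: getopt never sees a -d option
print(parsed_project_dir(retyped))     # the project directory name
```

If the check reports any look-alike dashes, retyping the command by hand in the terminal is a cheap thing to rule out before reinstalling anything.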

mmelendrez commented 7 years ago

Hi - I was getting this same error. I found this issue, which is about testing but also involved errors saying that the project or a file did not exist: https://github.com/marbl/metAMOS/issues/180. The Python problems mentioned there caught my eye. I had Python 3.5; I went back to Python 2.7 and now the pipeline is running. Syntax differs between the two versions, so maybe try checking your Python install/version? Erase what you've created, rerun initPipeline, then run runPipeline on that directory. That's what worked for me.
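
A quick way to check which interpreter the `python` command resolves to is sketched below. It assumes the pipeline scripts are launched through whatever `python` is first on PATH, which may not hold on every install; the helper names are hypothetical and not part of metAMOS.

```python
import re
import subprocess


def parse_version(text):
    """Extract (major, minor) from output such as 'Python 2.7.18'."""
    match = re.search(r"Python (\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def interpreter_version(command="python"):
    """Run `<command> --version` and return (major, minor), or None if
    the command cannot be started or prints something unexpected."""
    try:
        result = subprocess.run(
            [command, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # Python 2 prints its version to stderr
            universal_newlines=True,
        )
    except OSError:
        return None
    return parse_version(result.stdout)


version = interpreter_version("python")
if version is None:
    print("could not determine the version of 'python' on PATH")
elif version[0] != 2:
    print("'python' is %d.%d; the report above only confirms 2.7 working" % version)
else:
    print("'python' is %d.%d" % version)
```

If the reported major version is 3, switching to a Python 2.7 environment before rerunning initPipeline and runPipeline matches what worked in the comment above.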