Closed ewilbanks closed 11 years ago
Hi,
This looks like an error with the naming of the sequences. If you enable filtering with the -t option to runPipeline, it should rename all the reads and eliminate the error. Unfortunately, this means you need to re-run the pipeline from scratch. As for FindORFS, metAMOS currently tries to call ORFs on all sequences that could not be mapped to your assembly. However, this is time-consuming for large datasets, so we are planning to update the code to disable gene calling on the unmapped sequences by default. The change should be available within the next week. In the meantime, you can disable ORF calling on unmapped sequences yourself by removing lines 310-312 in src/findorfs.py:
    for lib in _readlibs:
        run_process(_settings, "ln -s %s/Assemble/out/lib%d.unaligned.fasta %s/FindORFS/in/"%(_settings.rundir,lib.id,_settings.rundir),"FindORFS")
        findFastaORFs(_orf, "%s/FindORFS/in/lib%d.unaligned.fasta"%(_settings.rundir, lib.id), "%s.lib%d.fna"%(_settings.PREFIX, lib.id), "%s.lib%d.faa"%(_settings.PREFIX, lib.id), "%s.lib%d.gene.cvg"%(_settings.PREFIX, lib.id), "%s.lib%d.gene.map"%(_settings.PREFIX, lib.id), 0, 1)
Sergey
OK, thanks! What exactly does the -t option do? I couldn't really tell from the documentation I read through.
I re-ran with -t included and got a similar error:
$ ~/software/metAMOS/runPipeline -v -t -p 14 -n Assemble,FindRepeats,FindORFS,Annotate,FunctionalAnnotation,Classify,Propagate -d /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1
*\ metAMOS running command: rm -rf /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/Scaffold/in/proba.bnk
*\ metAMOS running command: /home/ewilbanks/software/metAMOS/AMOS/Linux-x86_64/bin/toAmos_new -Q /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/Preprocess/out/lib1.seq -i --min 200 --max 1000 --libname lib1 -b /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/Scaffold/in/proba.bnk
****ERROR****** During scaffold, the following command failed with return code -6:
/home/ewilbanks/software/metAMOS/AMOS/Linux-x86_64/bin/toAmos_new -Q /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/Preprocess/out/lib1.seq -i --min 200 --max 1000 --libname lib1 -b /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/Scaffold/in/proba.bnk
****DETAILS****** Last 10 commands run before the error (/share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/Logs/COMMANDS.log)
|2013-05-07 10:59:45| touch /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/Logs/findrepeats.skip
|2013-05-07 10:59:46|# [ANNOTATE]
|2013-05-07 11:00:01| touch /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/Logs/annotate.skip
|2013-05-07 11:00:17| touch /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/Annotate/out/proba.hits
|2013-05-07 11:00:18|# [FUNCTIONALANNOTATION]
|2013-05-07 11:00:35| touch /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/Logs/functionalannotation.skip
|2013-05-07 11:00:54| touch /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/FunctionalAnnotation/out/blast.out
|2013-05-07 11:00:55|# [SCAFFOLD]
|2013-05-07 11:01:16| rm -rf /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/Scaffold/in/proba.bnk
|2013-05-07 11:01:38| /home/ewilbanks/software/metAMOS/AMOS/Linux-x86_64/bin/toAmos_new -Q /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/Preprocess/out/lib1.seq -i --min 200 --max 1000 --libname lib1 -b /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/Scaffold/in/proba.bnk
Last 10 lines of output (/share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/Logs/SCAFFOLD.log)
Read Bank doesn't exist; creating
frag Bank doesn't exist; creating
lib Bank doesn't exist; creating
parsing fastq file
terminate called after throwing an instance of 'AMOS::ArgumentException_t'
what(): Cannot insert string key 'M01533:9:000000000-A20UG:1:1101:16050:1572' multiple times
Please veryify input data and restart MetAMOS. If the problem persists please contact the MetAMOS development team. ****ERROR******
rm: cannot remove `/share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/Logs/scaffold.ok': No such file or directory
ruffus.ruffus_exceptions.RethrownJobError:
Exception #1
'exceptions.NameError(global name 'JobSignalledBreak' is not defined)' raised in ...
Task = def scaffold.Scaffold(...):
Job = [[proba.asm.contig] -> proba.scaffolds.final]
Traceback (most recent call last):
File "/home/ewilbanks/software/metAMOS/Utilities/ruffus/task.py", line 616, in run_pooled_job_without_exceptions
return_value = job_wrapper(param, user_defined_work_func, register_cleanup, touch_files_only)
File "/home/ewilbanks/software/metAMOS/Utilities/ruffus/task.py", line 486, in job_wrapper_io_files
ret_val = user_defined_work_func(*param)
File "/home/ewilbanks/software/metAMOS/src/scaffold.py", line 71, in Scaffold
run_process(_settings, "%s/toAmos_new -Q %s/Preprocess/out/lib%d.seq %s -b %s/Scaffold/in/%s.bnk "%(_settings.AMOS,_settings.rundir,lib.id,matedStr,_settings.rundir,_settings.PREFIX),"Scaffold")
File "/home/ewilbanks/software/metAMOS/src/utils.py", line 608, in run_process
raise (JobSignalledBreak)
NameError: global name 'JobSignalledBreak' is not defined
Hi,
Sorry I didn't specify this in my previous response. The -t option will filter sequences containing Ns and rename the sequences to a standard naming convention. Lots of the tools within metAMOS don't support arbitrary names so this filtering helps avoid errors. My guess is that your read name is of the form SEQUENCE_ID 1 or SEQUENCE_ID 2, depending on which paired end it is. AMOS does not understand spaces in the sequence names so SEQUENCE_ID 1 and SEQUENCE_ID 2 look like the same thing. The filter step would rename these to be SEQUENCE_ID/1 and SEQUENCE_ID/2.
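To illustrate the collision described above, here is a rough Python sketch. It is not the actual metAMOS filter code: the function names are made up for this example, and it assumes that AMOS keys each read on the text before the first space and that the mate number is the first character after that space.

```python
from collections import Counter


def amos_key(header):
    """Return the read name up to the first whitespace.

    Assumption: this is the part AMOS uses as the unique key.
    A leading '@' (FASTQ) or '>' (FASTA) marker is dropped first.
    """
    return header.lstrip("@>").split(None, 1)[0]


def colliding_keys(headers):
    """List keys that occur more than once after truncation at whitespace."""
    counts = Counter(amos_key(h) for h in headers)
    return sorted(key for key, n in counts.items() if n > 1)


def add_mate_suffix(header):
    """Rewrite 'SEQUENCE_ID 1...' as 'SEQUENCE_ID/1' (hypothetical renaming rule)."""
    parts = header.lstrip("@>").split(None, 1)
    if len(parts) == 2 and parts[1][:1] in ("1", "2"):
        return "%s/%s" % (parts[0], parts[1][0])
    return parts[0]


# Two mates that differ only after the space (read name taken from the log above).
headers = [
    "@M01533:9:000000000-A20UG:1:1101:16050:1572 1",
    "@M01533:9:000000000-A20UG:1:1101:16050:1572 2",
]

print(colliding_keys(headers))                                # one colliding key
print(colliding_keys([add_mate_suffix(h) for h in headers]))  # no collisions left
```

Running something like `colliding_keys` over the headers of lib1.seq before starting the pipeline would show whether the read names are going to clash.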
The error you got above occurs because metAMOS tries to resume a previously started run, so the filtering step is skipped. You need to either remove your working directory (mol.celera1) and re-create it with initPipeline, or force Preprocess (-f Preprocess) to make the filtering run. Either way, metAMOS will restart from scratch.
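As a sketch of the second option, the forced re-run could be assembled from Python like this. The helper name is hypothetical, only flags already mentioned in this thread are used, and the command is built but not executed here.

```python
def build_forced_rerun(run_pipeline, run_dir, threads):
    """Build a runPipeline command that enables filtering and forces Preprocess.

    Hypothetical helper: it only uses flags that appear in this thread
    (-v, -t, -f Preprocess, -p, -d).
    """
    return [
        run_pipeline,
        "-v",
        "-t",                # filter reads and rename them
        "-f", "Preprocess",  # force Preprocess to run again
        "-p", str(threads),
        "-d", run_dir,
    ]


cmd = build_forced_rerun(
    "/home/ewilbanks/software/metAMOS/runPipeline",
    "/share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1",
    14,
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.check_call(cmd)` once the run directory is in the state you want.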
Closing due to inactivity
Hi folks,
I'm having an issue running metAMOS on our 64-bit Linux machine. My run died after chugging away at FragGeneScan for several days. The command I ran was: ~/software/metAMOS/runPipeline -v -p 8 -n Assemble,FindRepeats -d /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1
I'm wondering if the error is related to this previously noted issue? http://github.com/treangen/metAMOS/issues/53
The error output is below. Let me know if any other files or info would be helpful!
|2013-05-05 16:43:08| mv /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/FindORFS/out/proba.gene.cvg /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/FindORFS/out/proba.lib2.gene.cvg
|2013-05-05 16:43:24| mv /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/FindORFS/out/proba.gene.map /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/FindORFS/out/proba.lib2.gene.map
|2013-05-05 16:43:39| rm -r /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/FindORFS/out/proba.fna
|2013-05-05 16:49:37| cat /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/FindORFS/out/proba*.fna > /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/FindORFS/out/proba.fna
|2013-05-05 16:49:52| rm -r /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/FindORFS/out/proba.fna.bnk
|2013-05-05 16:50:10| /home/ewilbanks/software/metAMOS/AMOS/Linux-x86_64/bin/toAmos_new -s /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/FindORFS/out/proba.fna -b /share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/FindORFS/out/proba.fna.bnk
Last 10 lines of output (/share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/Logs/FINDORFS.log)
no. of seqs: 17125850
no. of seqs: 169603267
rm: cannot remove `/share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/FindORFS/out/proba.fna': No such file or directory
rm: cannot remove `/share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/FindORFS/out/proba.fna.bnk': No such file or directory
Read Bank doesn't exist; creating
frag Bank doesn't exist; creating
lib Bank doesn't exist; creating
parsing fasta file
terminate called after throwing an instance of 'AMOS::ArgumentException_t'
what(): Cannot insert string key 'M01533:9:000000000-A20UG:1:1101:17058:1598_1260-' multiple times
Please veryify input data and restart MetAMOS. If the problem persists please contact the MetAMOS development team. ****ERROR******
rm: cannot remove `/share/eisen-z2/ewilbanks/Moleculo/metamos/mol.celera1/Logs/findorfs.ok': No such file or directory
ruffus.ruffus_exceptions.RethrownJobError: