hallamlab / MetaPathways

A modular pipeline for constructing Pathway/Genome Databases from environmental sequence information
http://hallam.microbiology.ubc.ca/MetaPathways

Grid does not consolidate results... sometimes #13

hallamlab closed this issue 11 years ago

hallamlab commented 11 years ago

This should be fixed ASAP, as it breaks the flow of the pipeline. Oddly, it happens only intermittently, particularly on large samples.

In some cases the split .blastout files on the grid are not collected and concatenated into a complete .blastout file, so nothing is transferred back to the original machine. The parse-blast step in the pipeline then fails because the consolidated .blastout file it expects does not exist locally.
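A minimal sketch of a consolidation step that refuses to merge a partial set of split results, which would turn this silent gap into an explicit error. This is not the MetaPathways grid code: the function name `consolidate_blastout`, the `expected_splits` argument, the `*.blastout` pattern and the `.partial` suffix are all hypothetical, and it assumes the split files sit in one directory with names that sort in the intended order.

```python
import shutil
from pathlib import Path


def consolidate_blastout(split_dir, expected_splits, merged_path, pattern="*.blastout"):
    """Concatenate split BLAST outputs into one file, refusing to merge a partial set.

    All names here are hypothetical; this is not the MetaPathways implementation.
    """
    parts = sorted(Path(split_dir).glob(pattern))
    if len(parts) != expected_splits:
        raise RuntimeError(
            "expected %d split files in %s, found %d"
            % (expected_splits, split_dir, len(parts))
        )

    merged_path = Path(merged_path)
    # Write to a temporary name first so a half-built file is never mistaken
    # for the finished result.
    tmp_path = merged_path.with_name(merged_path.name + ".partial")
    with open(tmp_path, "wb") as merged:
        for part in parts:
            with open(part, "rb") as handle:
                shutil.copyfileobj(handle, merged)

    # Rename only after every part has been appended.
    tmp_path.replace(merged_path)
    return merged_path
```

The merged file should be written outside `split_dir` (or under a name the pattern does not match), otherwise a rerun would count it as one of the parts.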

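It also seems that the variable controlling the number of splits is not reset after each database run. In the log below, the reported count grows by 686 with every database:

- metacyc-v4-2011-07-03: Number of sequence files created :686
- kegg-pep-2011-06-18: Number of sequence files created :1372
- cog-2007-10-30: Number of sequence files created :2058
- refseq_protein-2009-04-27: Number of sequence files created :2744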
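The pattern in those counts is what a counter initialised once and never reset would produce. A small sketch of that behaviour, under assumptions: the function names are hypothetical, the sequence count of 137,200 is invented so that a batch size of 200 gives 686 splits, and none of this is taken from the MetaPathways source. Whether the inflated counter is related to the missing .blastout file is not established here.

```python
import math


def count_splits(num_sequences, batch_size):
    """Number of split files needed for one database run."""
    return math.ceil(num_sequences / batch_size)


def run_databases_buggy(dbnames, num_sequences, batch_size):
    """Counter lives outside the loop, so it accumulates across databases."""
    files_created = 0
    reported = []
    for _ in dbnames:
        files_created += count_splits(num_sequences, batch_size)
        reported.append(files_created)
    return reported


def run_databases_fixed(dbnames, num_sequences, batch_size):
    """Counter is reset at the start of every database run."""
    reported = []
    for _ in dbnames:
        files_created = 0
        files_created += count_splits(num_sequences, batch_size)
        reported.append(files_created)
    return reported


if __name__ == "__main__":
    dbs = ["metacyc", "kegg", "cog", "refseq"]
    # 137200 is a hypothetical sequence count, not a value from the log.
    print(run_databases_buggy(dbs, 137200, 200))
    print(run_databases_fixed(dbs, 137200, 200))
```

With these assumed inputs the buggy variant reports 686, 1372, 2058, 2744, matching the log, while the fixed variant reports 686 for each database.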

6. Blasting  ORFs against reference database - metacyc-v4-2011-07-03 ....
     Server bugaboo.westgrid.ca is working 
     Successfully copied daemon script
     'MetaPathways'
                 found
     'MetaPathways/databases'
                 found
     'MetaPathways/executables'
                 found
     MetaPathways/executables/blastp
                 found
     MetaPathways/executables/formatdb
                 found
     MetaPathways/databases/metacyc-v4-2011-07-03
                 found
                 already formatted
     Sample  folder LTSP_ends
                 NOT found
                 created!
     MetaPathways/LTSP_ends/.qstatdir
                 NOT found
                 created!
     MetaPathways/LTSP_ends/LTSP_ends.qced.faa
                 NOT found
                 copied
     Number of sequence files created :686

[ -------------------------------------------------------------------------------------------        ]

Issuing Command : {'run_type': 'overlay', 'dbnames': ['kegg'], 'sample_name': 'output//LTSP_ends', 'max_parallel_jobs': '400', 'server': 'bugaboo.westgrid.ca', 'database_files': ['kegg-pep-2011-06-18'], 'batch_size': '200', 'user': 'ahahn'}

                                               kegg-pep-2011-06-18 ....
     Server bugaboo.westgrid.ca is working 
     Successfully copied daemon script
     'MetaPathways'
                 found
     'MetaPathways/databases'
                 found
     'MetaPathways/executables'
                 found
     MetaPathways/executables/blastp
                 found
     MetaPathways/executables/formatdb
                 found
     MetaPathways/databases/kegg-pep-2011-06-18
                 found
                 already formatted
     Sample  folder LTSP_ends
                 found
     MetaPathways/LTSP_ends/.qstatdir
                 found
     MetaPathways/LTSP_ends/LTSP_ends.qced.faa
                 found
     Number of sequence files created :1372

[ ------------------------------------------------------------------------------                     ]

Issuing Command : {'run_type': 'overlay', 'dbnames': ['cog'], 'sample_name': 'output//LTSP_ends', 'max_parallel_jobs': '400', 'server': 'bugaboo.westgrid.ca', 'database_files': ['cog-2007-10-30'], 'batch_size': '200', 'user': 'ahahn'}

                                               cog-2007-10-30 ....
     Server bugaboo.westgrid.ca is working 
     Successfully copied daemon script
     'MetaPathways'
                 found
     'MetaPathways/databases'
                 found
     'MetaPathways/executables'
                 found
     MetaPathways/executables/blastp
                 found
     MetaPathways/executables/formatdb
                 found
     MetaPathways/databases/cog-2007-10-30
                 found
                 already formatted
     Sample  folder LTSP_ends
                 found
     MetaPathways/LTSP_ends/.qstatdir
                 found
     MetaPathways/LTSP_ends/LTSP_ends.qced.faa
                 found
     Number of sequence files created :2058

[ ----------------------------------------------------------------------------------------------     ]

Issuing Command : {'run_type': 'overlay', 'dbnames': ['refseq'], 'sample_name': 'output//LTSP_ends', 'max_parallel_jobs': '400', 'server': 'bugaboo.westgrid.ca', 'database_files': ['refseq_protein-2009-04-27'], 'batch_size': '200', 'user': 'ahahn'}

                                               refseq_protein-2009-04-27 ....
     Server bugaboo.westgrid.ca is working 
     Successfully copied daemon script
     'MetaPathways'
                 found
     'MetaPathways/databases'
                 found
     'MetaPathways/executables'
                 found
     MetaPathways/executables/blastp
                 found
     MetaPathways/executables/formatdb
                 found
     MetaPathways/databases/refseq_protein-2009-04-27
                 found
                 already formatted
     Sample  folder LTSP_ends
                 found
     MetaPathways/LTSP_ends/.qstatdir
                 found
     MetaPathways/LTSP_ends/LTSP_ends.qced.faa
                 found
     Number of sequence files created :2744

[ ----------------------------------------------------------------------------------------           ]

Issuing Command : /Users/ariahahn/Desktop/MetaPathways/libs/python_scripts/MetaPathways_parse_blast.py -d metacyc  -b output//LTSP_ends/blast_results//LTSP_ends.metacyc.blastout -m /Users/ariahahn/Desktop/MetaPathways/blastDB//metacyc-v4-2011-07-03-names.txt  -r  output//LTSP_ends/blast_results//LTSP_ends.refscores  --min_bsr 0.400000  --min_score 20.000000 --min_length 60.000000 --max_evalue 0.000001

7. Parsing blast outputs for reference database - metacyc ......... Error!
Traceback (most recent call last):
  File "/Users/ariahahn/Desktop/MetaPathways/libs/python_scripts/MetaPathways_parse_blast.py", line 361, in <module>
    main(sys.argv[1:])
  File "/Users/ariahahn/Desktop/MetaPathways/libs/python_scripts/MetaPathways_parse_blast.py", line 355, in main
    process_blastoutput( dbname, blastoutput,  mapfile, opts.refscore_file, opts)
  File "/Users/ariahahn/Desktop/MetaPathways/libs/python_scripts/MetaPathways_parse_blast.py", line 316, in process_blastoutput
    blastparser =  BlastOutputParser(dbname, blastoutput, mapfile, refscore_file, cutoffs)
  File "/Users/ariahahn/Desktop/MetaPathways/libs/python_scripts/MetaPathways_parse_blast.py", line 149, in __init__
    self.blastoutputfile = open( blastoutput,'r')
IOError: [Errno 2] No such file or directory: 'output//LTSP_ends/blast_results//LTSP_ends.metacyc.blastout'
Error! : Traceback (most recent call last):
  File "/Users/ariahahn/Desktop/MetaPathways/libs/python_scripts/MetaPathways_parse_blast.py", line 361, in <module>
    main(sys.argv[1:])
  File "/Users/ariahahn/Desktop/MetaPathways/libs/python_scripts/MetaPathways_parse_blast.py", line 355, in main
    process_blastoutput( dbname, blastoutput,  mapfile, opts.refscore_file, opts)
  File "/Users/ariahahn/Desktop/MetaPathways/libs/python_scripts/MetaPathways_parse_blast.py", line 316, in process_blastoutput
    blastparser =  BlastOutputParser(dbname, blastoutput, mapfile, refscore_file, cutoffs)
  File "/Users/ariahahn/Desktop/MetaPathways/libs/python_scripts/MetaPathways_parse_blast.py", line 149, in __init__
    self.blastoutputfile = open( blastoutput,'r')
IOError: [Errno 2] No such file or directory: 'output//LTSP_ends/blast_results//LTSP_ends.metacyc.blastout'
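The traceback above comes from a bare `open()` on a file that was never transferred. One possible stopgap, sketched here in Python 3 with hypothetical helper names (`wait_for_file`, `require_blastout`) that are not part of MetaPathways, is to check for the consolidated file before parsing and fail with a message that points at the grid step rather than at the parser.

```python
import os
import time


def wait_for_file(path, timeout_s=0.0, poll_s=1.0):
    """Return True once `path` is a regular file, False if the timeout passes first."""
    deadline = time.monotonic() + timeout_s
    while not os.path.isfile(path):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
    return True


def require_blastout(path, timeout_s=0.0):
    """Raise a descriptive error if the consolidated BLAST output is absent."""
    if not wait_for_file(path, timeout_s=timeout_s):
        raise FileNotFoundError(
            "consolidated BLAST output missing: %s "
            "(were the split results merged and copied back from the grid?)" % path
        )
    return path
```

Calling `require_blastout(blastoutput)` before constructing the parser would not fix the consolidation problem itself, but it would make the failure point to the missing transfer.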