hallamlab / MetaPathways

A modular pipeline for constructing Pathway/Genome Databases from environmental sequence information
http://hallam.microbiology.ubc.ca/MetaPathways

input/output & grid confusion #11

Closed hallamlab closed 11 years ago

hallamlab commented 11 years ago

There seems to be confusion between the input sample name and the output folder name used in MetaPathways, particularly when BLASTing on the grid.

Here we specified an exact .fasta file as input instead of just an input directory. The output directory was also a little ambiguous (but this is something normal people are going to do). This caused a float division error (a ZeroDivisionError) when the run got to the grid step. We have to think of an elegant way to handle the case where the user doesn't specify an exact directory, or we have to make it clear that when you specify an exact input file we also expect an exact output directory (without creating a new subdirectory).
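One possible way to remove the ambiguity is to derive the sample name from the input file itself rather than from the output folder. The following is only a minimal sketch under stated assumptions: the helper names `sample_name_from_input` and `sample_output_dir` and the extension list are hypothetical and are not part of MetaPathways.

```python
import os

# Assumed set of FASTA extensions; adjust to whatever the pipeline accepts.
FASTA_EXTENSIONS = (".fasta", ".fa", ".fna", ".faa")


def sample_name_from_input(input_path):
    """Return the sample name for a single FASTA file, or None for a directory.

    Hypothetical helper: the decision is made from the file extension only,
    so it does not touch the filesystem.
    """
    base = os.path.basename(os.path.normpath(input_path))
    stem, ext = os.path.splitext(base)
    if ext.lower() in FASTA_EXTENSIONS:
        return stem
    return None


def sample_output_dir(input_path, output_dir):
    """Return the folder where results for this input would be written.

    For a single input file, a per-sample subfolder is used so that the
    sample name is never confused with the output folder name.
    """
    name = sample_name_from_input(input_path)
    if name is None:
        # Input is a directory: per-sample folders are decided elsewhere.
        return output_dir
    return os.path.join(output_dir, name)
```

With the command from this report, `sample_name_from_input("input/Fosmid_Ends_WL_Jan17/LTSP_ends.fasta")` would give `LTSP_ends`, so the sample name passed to the grid would no longer be `output`.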

The Original Command:

ariahahn@showgirl:~/Desktop/MetaPathways $ metapathways.py -i input/Fosmid_Ends_WL_Jan17/LTSP_ends.fasta -o output -p template_param.txt -v -r overlay
True True
WARNING: Refseq BLAST output output/blast_results//LTSP_ends.refseq.blastout not found!
         Will have to Skip taxonomic information in annotation table :(

  ********************************************************** 
  ********************   Running  metapaths ******************* 
  ********************************************************** 
              input/Fosmid_Ends_WL_Jan17/LTSP_ends.fasta                       
  ********************************************************** 

The Error:

Issuing Command : {'run_type': 'overlay', 'dbnames': ['metacyc'], 'sample_name': 'output', 'max_parallel_jobs': '400', 'server': 'bugaboo.westgrid.ca', 'database_files': ['metacyc-v4-2011-07-03'], 'batch_size': '200', 'user': 'ahahn'}

6. Blasting  ORFs against reference database - metacyc-v4-2011-07-03 ....
     Server bugaboo.westgrid.ca is working 
     Successfully copied daemon script
     'MetaPathways'
                 found
     'MetaPathways/databases'
                 found
     'MetaPathways/executables'
                 found
     MetaPathways/executables/blastp
                 found
     MetaPathways/executables/formatdb
                 found
     MetaPathways/databases/metacyc-v4-2011-07-03
                 found
                 already formatted
     Sample  folder output
                 NOT found
                 created!
     MetaPathways/output/.qstatdir
                 NOT found
                 created!
     MetaPathways/output/output.qced.faa
                 NOT found
                 copied
     Number of sequence files created :0

[ Traceback (most recent call last):                                                                 ]
  File "./metapathways.py", line 233, in <module>
    main(sys.argv[1:])    
  File "./metapathways.py", line 228, in main
    run_type = run_type
  File "/Users/ariahahn/Desktop/MetaPathways/libs/python_modules/metapaths.py", line 1187, in run_metapathways
    command_handler(commands, status_update_callback, logger, stepslogger, command_line_params)
  File "/Users/ariahahn/Desktop/MetaPathways/libs/python_modules/metapaths_pipeline.py", line 147, in call_commands_serially
    blastgrid(c[1])
  File "/Users/ariahahn/Desktop/MetaPathways/libs/python_modules/BlastGrid.py", line 528, in blastgrid
    current = int((completedsamples*100)/numsamples)
ZeroDivisionError: float division

We also noticed that there was an "output" directory on WestGrid where we expected a directory named after the sample, because the pipeline confused the output directory with the sample name.
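The log above reports "Number of sequence files created :0", and the traceback ends at the progress calculation `int((completedsamples*100)/numsamples)`, so the division fails when `numsamples` is zero. Independently of the naming fix, that calculation could fail with a clearer message. This is a hedged sketch only: `percent_complete` is a hypothetical helper, not the actual code in BlastGrid.py.

```python
def percent_complete(completed, total):
    """Return the integer percentage of finished grid jobs.

    Hypothetical helper: raises a descriptive error instead of a bare
    ZeroDivisionError when no sequence files were created.
    """
    if total <= 0:
        raise ValueError(
            "No sequence files were created for the grid job; "
            "check that the input file and sample name resolve correctly."
        )
    return int((completed * 100) / total)
```

A caller could catch the `ValueError` and stop the BLAST step with that message, rather than surfacing the traceback to the user.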

nielshanson commented 11 years ago

A change made with Kishori to how input and output directories are specified has probably fixed this problem. It will be included in the next commit.