katholt / srst2

Short Read Sequence Typing for Bacterial Pathogens

create_allele_pileup crashes if --output contains a directory path #26

Closed ppcherng closed 9 years ago

ppcherng commented 9 years ago

I ran this command:

"srst2 --input_pe /data/scratch/Bcereus-15/Bcereus-15_1.fastq.gz /data/scratch/Bcereus-15/Bcereus-15_2.fastq.gz --output /data/output/appresults/14091077/Bcereus-15/Bcereus-15 --save_scores --report_all_consensus --mlst_db /opt/gene_databases/mlst/11302014/Bacillus_cereus/Bacillus_cereus.fasta --mlst_definitions /opt/gene_databases/mlst/11302014/Bacillus_cereus/bcereus.txt --gene_db /opt/srst2/data/ARGannot.fasta /opt/srst2/data/PlasmidFinder.fasta /opt/gene_databases/vfdb/11302014/Bacillus/Bacillus_VF_clustered.fasta"

Note that the --output prefix includes a directory path, because I want the outputs written to that folder instead of the current working directory. This fails when --report_all_consensus is turned on, because create_allele_pileup simply prepends the allele name to the full pileup path, directory included:

def create_allele_pileup(allele_name, all_pileup_file):
    outpileup = allele_name + "." + all_pileup_file

which results in this error:

12/05/2014 08:11:09 Processing SAMtools pileup...
12/05/2014 08:11:14 Scoring alleles...
Traceback (most recent call last):
  File "/usr/local/bin/srst2", line 9, in <module>
    load_entry_point('srst2==0.1.4', 'console_scripts', 'srst2')()
  File "/usr/local/lib/python2.7/dist-packages/srst2/srst2.py", line 1508, in main
    mlst_report, mlst_results = run_srst2(args,fileSets,args.mlst_db,"mlst")
  File "/usr/local/lib/python2.7/dist-packages/srst2/srst2.py", line 1074, in run_srst2
    db_reports, db_results_list = process_fasta_db(args, fileSets, run_type, db_reports, db_results_list, fasta)
  File "/usr/local/lib/python2.7/dist-packages/srst2/srst2.py", line 1136, in process_fasta_db
    unique_gene_symbols, unique_allele_symbols,run_type,ST_db,results,gene_list,db_report,cluster_symbols,max_mismatch)
  File "/usr/local/lib/python2.7/dist-packages/srst2/srst2.py", line 1247, in map_fileSet_to_db
    unique_gene_symbols, unique_allele_symbols, pileup_file)
  File "/usr/local/lib/python2.7/dist-packages/srst2/srst2.py", line 831, in parse_scores
    allele_pileup_file = create_allele_pileup(top_allele, pileup_file) # XXX Creates a new pileup file for that allele. Not currently cleaned up
  File "/usr/local/lib/python2.7/dist-packages/srst2/srst2.py", line 737, in create_allele_pileup
    with open(outpileup, 'w') as allele_pileup:
IOError: [Errno 2] No such file or directory: 'pta_102./data/output/appresults/14091077/Bcereus-15/Bcereus-15__Bcereus-15.Bacillus_cereus.pileup'

One possible solution is to use the os.path module to split the all_pileup_file string into its directory and filename, prepend allele_name to the filename only, and then rejoin the directory with the new filename.
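A minimal sketch of that idea follows. The helper name allele_pileup_path is hypothetical and is not part of srst2; it only builds the output path, and it assumes the rest of create_allele_pileup would open that path as before. It is an illustration of the os.path approach, not the fix that was actually committed.

```python
import os


def allele_pileup_path(allele_name, all_pileup_file):
    """Hypothetical helper (not in srst2): build the per-allele pileup path.

    Only the filename gets the allele prefix; the directory part of
    all_pileup_file is left untouched.
    """
    # Split "/some/dir/sample.pileup" into ("/some/dir", "sample.pileup").
    directory, filename = os.path.split(all_pileup_file)
    # Prefix the filename, then put the directory back in front.
    return os.path.join(directory, allele_name + "." + filename)


if __name__ == "__main__":
    # Values taken from the error message in this report.
    print(allele_pileup_path(
        "pta_102",
        "/data/output/appresults/14091077/Bcereus-15/"
        "Bcereus-15__Bcereus-15.Bacillus_cereus.pileup",
    ))
```

With the values from the traceback, the allele name should end up in front of the filename (pta_102.Bcereus-15__Bcereus-15.Bacillus_cereus.pileup) inside the existing output directory, rather than in front of the whole path. When --output has no directory component, os.path.split returns an empty directory and the result matches the current behaviour.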

katholt commented 9 years ago

Harriet could you please look at this at some point?