phac-nml / staramr

Scans genome contigs against the ResFinder, PlasmidFinder, and PointFinder databases.
Apache License 2.0

better error messages for empty results list #155

Closed bgruening closed 1 year ago

bgruening commented 1 year ago

I assume the following traceback occurs because there are no BLAST results. A clearer error message would help guide the user here.

Traceback (most recent call last):
  File "/usr/local/tools/_conda/envs/__staramr@0.7.2/bin/staramr", line 68, in <module>
    args.run_command(args)
  File "/usr/local/tools/_conda/envs/__staramr@0.7.2/lib/python3.7/site-packages/staramr/subcommand/Search.py", line 468, in run
    unacceptable_num_contigs= args.unacceptable_num_contigs)
  File "/usr/local/tools/_conda/envs/__staramr@0.7.2/lib/python3.7/site-packages/staramr/subcommand/Search.py", line 286, in _generate_results
    report_all_blast, ignore_invalid_files, mlst_scheme)
  File "/usr/local/tools/_conda/envs/__staramr@0.7.2/lib/python3.7/site-packages/staramr/detection/AMRDetection.py", line 177, in run_amr_detection
    self._amr_detection_handler.run_blasts_mlst(files, mlst_scheme)
  File "/usr/local/tools/_conda/envs/__staramr@0.7.2/lib/python3.7/site-packages/staramr/blast/JobHandler.py", line 97, in run_blasts_mlst
    db_files = self._make_db_from_input_files(self._input_genomes_tmp_dir, files)
  File "/usr/local/tools/_conda/envs/__staramr@0.7.2/lib/python3.7/site-packages/staramr/blast/JobHandler.py", line 137, in _make_db_from_input_files
    future_blastdb.result()
  File "/usr/local/tools/_conda/envs/__staramr@0.7.2/lib/python3.7/concurrent/futures/_base.py", line 435, in result
    return self.__get_result()
  File "/usr/local/tools/_conda/envs/__staramr@0.7.2/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/local/tools/_conda/envs/__staramr@0.7.2/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/tools/_conda/envs/__staramr@0.7.2/lib/python3.7/site-packages/staramr/blast/JobHandler.py", line 289, in _make_blast_db
    err_msg = re.findall('REF\|(.*?)\'', err_msg)[0]
IndexError: list index out of range
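The last frame shows that `re.findall(...)` returned an empty list, so indexing `[0]` raised the `IndexError` and hid whatever the underlying error text was. Below is a minimal sketch of one way to guard that lookup, not the project's actual fix. The helper name `extract_ref` and the sample messages are hypothetical; only the regular expression is taken from the traceback.

```python
import re


def extract_ref(err_msg: str) -> str:
    """Hypothetical helper: return the text captured after 'REF|' if the
    pattern from the traceback matches; otherwise fall back to the raw
    message so the caller can still report something useful."""
    matches = re.findall(r"REF\|(.*?)'", err_msg)
    if matches:
        return matches[0]
    # No match: avoid indexing an empty list and keep the original text.
    return err_msg.strip() or "no error message was provided"


# Illustrative inputs only; these are not real tool output.
print(extract_ref("problem with 'REF|contig_1' in input"))
print(extract_ref("some unrelated failure"))
```

With a guard like this, an unexpected message is passed through unchanged instead of surfacing as an unrelated `IndexError`.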

apetkau commented 1 year ago

Thanks so much for reporting, @bgruening. Your assessment makes sense. A better error message would be useful here 😄

I'll see if I can reproduce it.

bgruening commented 1 year ago

Thanks @apetkau!

emarinier commented 1 year ago

I can't seem to recreate this error in the newest version (38f24ad). Let me know if you want me to try with the same data that gave you the error.

Example with BLAST Results:

rm -rf out-blast out-sum
staramr search --output-hits-dir out-blast --output-summary out-sum staramr/tests/integration/data/16S_gyrA_beta-lactam.fsa

Output Summary (out-sum):

Isolate ID      Quality Module  Genotype        Predicted Phenotype     Plasmid Scheme  Sequence Type   Genome Length   N50 value       Number of Contigs Greater Than Or Equal To 300 bp       Quality Module Feedback
16S_gyrA_beta-lactam    Failed  blaIMP-42       ampicillin, amoxicillin/clavulanic acid, cefoxitin, ceftriaxone, meropenem      None    -       -       5220    5220    1       Genome length is not within the acceptable length range [4000000,6000000] ; N50 value is not greater than the specified minimum value [10000]

BLAST Output (out-blast/resfinder_16S_gyrA_beta-lactam.fsa):

>blaIMP-42_1_AB753456 isolate: 16S_gyrA_beta-lactam, contig: 16S_rrsD, contig_start: 4381, contig_end: 5121, database_gene_start: 1, database_gene_end: 741, hsp/length: 741/741, pid: 99.73%, plength: 100.00%
ATGAGCAAGTTATCTGCATTCTTTATATTTTTGTTTTGCAGCATTGATACCGCAGCAGAG
TCTTTGCCAGATTTAAAAATTGAAAAGCTTGATGAAGGCGTTTATGTTCATACTTCGTTT
GAAGAAGTTAACAGGTGGGGCGTTGTTCCTAAACATGGTTTGGTGGTTCTTGTAAATGCT
GAGGCTTACCTAATTGACACTCCATTTACGGCTAAAGATACTGAAAAGTTAGTCACTTGG
TTTGTGGAGCGTGGCTATAAAATAAAAGGCAGCATTTCCTCTCATTTTCATAGCGACAGC
ACGGGCGGAATAGAGTGGCTTAATTCTCGATCTATCCCCACGTATGCATCTGAATTAACA
AATGAACTGCTTAAAAAAGACGGTAAGGTTCAAGCCACAAATTCATTTAGCGGAGTTAAC
TATTGGCTAGTTAAAAATAAAATTGAAGTTTTTTATCCAGGCCCGGGACACACTCCAGAT
AACGTAGTGGTTTGGTTGCCTGAAAGGAAAATATTATTCGGTGGTTGTTTTATTAAACCG
TACGGTTTAGGCAATTTGGGTGACGCAAATATAGAAGCTTGGCCAAAGTCCGCCAAATTA
TTAAAGTCCAAATATGGTAAGGCAAAACTGGTTGTTCCAAGTCACAGTGAAGTTGGAGAC
GCATCACTCTTGAAACTTACATTAGAGCAGGCGGTTAAAGGGTTAAACGAAAGTAAAAAA
CCATCAAAACCAAGCAACTAA

Example with NO BLAST Results:

too-short.fsa:

>SHORT
A

rm -rf out-blast out-sum
staramr search --output-hits-dir out-blast --output-summary out-sum too-short.fsa

Output Summary (out-sum):

Isolate ID      Quality Module  Genotype        Predicted Phenotype     Plasmid Scheme  Sequence Type   Genome Length   N50 value       Number of Contigs Greater Than Or Equal To 300 bp       Quality Module Feedback
too-short       Failed  None    Sensitive       None    -       -       1       1       0       Genome length is not within the acceptable length range [4000000,6000000] ; N50 value is not greater than the specified minimum value [10000]

BLAST Output (out-blast/):

ls -l out-blast/
total 0

And there was no error message.

apetkau commented 1 year ago

Fixed in #160