Microbial-Ecology-Group / AMRplusplus

AMR++ is a bioinformatic pipeline meant to aid in the analysis of raw sequencing reads to characterize the profile of antimicrobial resistance genes, or resistome.
GNU General Public License v3.0

TypeError with certain samples #34

Open nsharp2 opened 9 months ago

nsharp2 commented 9 months ago


I have recently run into an issue when running AMR++ SNP verification on metagenomic data. For one sample, execution stopped at the runsnp process with the following message:

Command error:
  Traceback (most recent call last):
    File "/usr/local/lib/python3.8/concurrent/futures/process.py", line 239, in _process_worker
      r = call_item.fn(*call_item.args, **call_item.kwargs)
    File "/usr/local/lib/python3.8/concurrent/futures/process.py", line 198, in _process_chunk
      return [fn(*args) for args in chunk]
    File "/usr/local/lib/python3.8/concurrent/futures/process.py", line 198, in <listcomp>
      return [fn(*args) for args in chunk]
    File "/opt/analysis/nextflow-bin/SNP_Verification.py", line 229, in iterate
      verify(read, gene_variant, config)
    File "/opt/analysis/nextflow-bin/SNP_Verification_Processes/__init__.py", line 49, in verify
      nTupleCheck(read, gene, mapOfInterest, seqOfInterest, config)
    File "/opt/analysis/nextflow-bin/SNP_Verification_Processes/nTupleCheck.py", line 171, in nTupleCheck
      if mt == seqOfInterest[queryIndex]:
  TypeError: string indices must be integers
  The above exception was the direct cause of the following exception:
  Traceback (most recent call last):
    File "/opt/analysis/nextflow-bin/SNP_Verification.py", line 353, in <module>
    File "/opt/analysis/nextflow-bin/SNP_Verification.py", line 348, in main
      [gene_variant_dict.update(r) for r in results]
    File "/opt/analysis/nextflow-bin/SNP_Verification.py", line 348, in <listcomp>
      [gene_variant_dict.update(r) for r in results]
    File "/usr/local/lib/python3.8/concurrent/futures/process.py", line 484, in _chain_from_iterable_of_lists
      for element in iterable:
    File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 619, in result_iterator
      yield fs.pop().result()
    File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 437, in result
      return self.__get_result()
    File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
      raise self._exception
  TypeError: string indices must be integers

Upon closer inspection of the SNP_Verification code, this error seems to occur when queryIndex is None at the failing line in nTupleCheck.py (if mt == seqOfInterest[queryIndex]:, line 171 in the traceback above). I added print statements for mapOfInterest, seqOfInterest, and queryIndex inside the loop that contains this line and took a screenshot of the output. For one specific sequence, the loop runs more than 265 times before the entry for key 2251 in mapOfInterest, which had been 11, is set to None.


I am not quite sure why this behavior is happening, and would greatly appreciate any help with the matter.
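For reference, here is a minimal sketch that reproduces the same exception outside the pipeline. It assumes mapOfInterest behaves like a dict from a reference position to a single query index, which is only my reading of the printed output; lookup_base and the sample data are hypothetical and not part of AMR++. Skipping positions that map to None, as the helper does, is one possible guard, not necessarily the correct fix for the verification logic.

```python
# Hypothetical data: a short read and two position maps shaped like the
# printed output (reference position -> index into the read sequence).
seq_of_interest = "ACGTACGTACGTACGT"
aligned_map = {2250: 10, 2251: 11}
broken_map = {2250: 10, 2251: None}  # key 2251 has lost its query index

# Indexing a str with None raises the TypeError seen in the traceback.
try:
    seq_of_interest[broken_map[2251]]
except TypeError as exc:
    error_message = str(exc)


def lookup_base(seq, position_map, ref_index):
    """Return the read base mapped to ref_index, or None if nothing maps there.

    Hypothetical helper: it assumes a None value means that no query base
    is available for that reference position.
    """
    query_index = position_map.get(ref_index)
    if query_index is None:
        return None
    return seq[query_index]


print(error_message)
print(lookup_base(seq_of_interest, aligned_map, 2251))
print(lookup_base(seq_of_interest, broken_map, 2251))
```

With a guard like this, the comparison against mt would only run when a base is actually returned. Whether a None entry should be skipped or treated as a mismatch depends on what the map is meant to encode.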

Thank you!