Error in DIAMOND_analysis_counter.py

transcript / samsa2

SAMSA pipeline, version 2.0. An open-source metatranscriptomics pipeline for analyzing microbiome data, built around DIAMOND and customizable reference databases.

GNU General Public License v3.0

Hello, I am having issues with the DIAMOND_analysis_counter.py script. I am getting an error similar to the one in this previous post: https://github.com/transcript/samsa2/issues/57

command: python Diamond_analysis_counter2.py -I BMRNA2_other_nr.daa_viewable -D /media/scratch/2022_diamond_nr_db/nr -O BMRNA2_other_nr_organism

error:

Now reading through the m8 results infile.

Analysis of BMRNA2_other_nr.daa_viewable complete.
Number of total lines: 426637
Number of unique sequences: 422738
Time elapsed: 0.5995767116546631 seconds.

Starting database analysis now.
Traceback (most recent call last):
  File "Diamond_analysis_counter2.py", line 151, in <module>
    if split_db_org[1] == "sp.":
IndexError: list index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "Diamond_analysis_counter2.py", line 157, in <module>
    db_org = split_db_org[1]
IndexError: list index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "Diamond_analysis_counter2.py", line 162, in <module>
    db_org = split_db_org[1] + " " + split_db_org[2]
IndexError: list index out of range

From the post linked above: "the parsing script doesn't do well when there are multiple instances of square brackets in the line."
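To make the quoted point concrete, here is a minimal sketch (not the actual SAMSA2 code) of pulling an organism name out of a header line that contains more than one bracketed group. It assumes the organism is the last top-level `[...]` group on the line, which may not hold for every database entry. The function name `extract_organism` and the sample headers are hypothetical, made up for illustration only.

```python
def extract_organism(header):
    """Return the text of the last top-level [...] group, or None.

    Hypothetical helper, not part of SAMSA2. Assumes the organism
    name is the final bracketed group in the header line.
    """
    end = header.rfind("]")
    if end == -1:
        return None  # no closing bracket at all
    depth = 0
    # Walk backwards from the last "]" to find its matching "[",
    # so nested brackets such as "[[Clostridium] scindens]" stay intact.
    for i in range(end, -1, -1):
        ch = header[i]
        if ch == "]":
            depth += 1
        elif ch == "[":
            depth -= 1
            if depth == 0:
                name = header[i + 1:end].strip()
                return name or None
    return None  # unbalanced brackets


# Made-up example headers, not taken from the real nr database.
print(extract_organism("ABC123.1 protein [partial] [Escherichia coli]"))
print(extract_organism("XYZ789.1 protein [[Clostridium] scindens]"))
print(extract_organism("TREFEAFEAGRRYANTAYLVDLQEMQGDNLL"))
```

A line with no brackets, such as a raw amino acid sequence, returns `None` instead of raising, so the caller can decide whether to skip it or log it.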

When I go in and look at the line (151), all I see is a string of amino acids: TREFEAFEAGRRYANTAYLVDLQEMQGDNLLRELVRITAQMNWQLNDLKEQIRQGNVISGQQLALTARQYYEKQLGSLEK

    if db_org[0].isdigit():
        split_db_org = db_org.split()
        try:
            db_org = split_db_org[1] + " " + split_db_org[2]
        except IndexError:
            print(line)
            print(str(db_line_counter))
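One way to avoid the chained IndexError is to check how many tokens exist before indexing into them. The sketch below is only an illustration under assumptions: `safe_org_name` and the `"Unknown"` fallback are hypothetical names, and the handling of names that do not start with a digit is a guess, since that part of the script is not shown here.

```python
def safe_org_name(db_org, fallback="Unknown"):
    """Build a short organism label without ever raising IndexError.

    Hypothetical helper, not part of SAMSA2.
    """
    tokens = db_org.split()
    if not tokens:
        return fallback  # empty or whitespace-only string
    if tokens[0][0].isdigit():
        # Mirror the snippet above: skip the leading numeric token.
        tokens = tokens[1:]
    if not tokens:
        return fallback  # nothing left after the numeric token
    # Keep at most two words (assumed here to be genus and species).
    return " ".join(tokens[:2])


print(safe_org_name("123 Escherichia coli K-12"))
print(safe_org_name("123"))
print(safe_org_name(""))
```

Returning a fallback label keeps the run going, and the offending line can still be printed alongside `db_line_counter` for inspection.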

Error in DIAMOND_analysis_counter.py #74