Joseph7e / Assign-Taxonomy-with-BLAST

Assign taxonomy with blast, can be used for qiime

error with length percentage cutoff using 0.7 and higher #3

Open bioinfo17 opened 4 years ago

bioinfo17 commented 4 years ago

Hello,

The taxonomy_assignment_BLAST_V2.py script works great for taxonomic assignment with the default parameters. However, when I use the "length_percentage" filtering option with values of 0.7 or higher, I get this error:

Traceback (most recent call last):
  File "taxonomy_assignment_BLAST_V2.py", line 363, in
    best_level_taxonomy, blast_percent = Assign_Taxonomy(current_query, current_best_hits)
  File "taxonomy_assignment_BLAST_V2.py", line 248, in Assign_Taxonomy
    if max(top_hits) >= args.cutoff_family:
ValueError: max() arg is an empty sequence

For some reason, it works well with values < 0.7.

Once again, thanks heaps for your time in advance.

AxenArk commented 3 years ago

I got this error too. I found that if every BLAST hit (if any) for the last sequence in your FASTA file fails the cutoffs, "top_hits" ends up empty and this error is raised.
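The failure mode can be reproduced in isolation. The sketch below is not taken from the script: `filter_hits`, the hit tuples, and the cutoff values are hypothetical stand-ins for whatever the length-percentage filter actually does. Only the behaviour of `max()` on an empty list is standard Python.

```python
# Hypothetical stand-in for the script's filtering step: keep the percent
# identity of every hit whose aligned-length fraction meets the cutoff.
def filter_hits(hits, length_percentage):
    return [pident for pident, length_frac in hits if length_frac >= length_percentage]

# Made-up (percent identity, aligned-length fraction) pairs for one query.
hits = [(97.5, 0.65), (92.0, 0.60)]

lenient = filter_hits(hits, 0.6)  # both hits pass the cutoff
strict = filter_hits(hits, 0.7)   # no hit passes, so the list is empty

best_lenient = max(lenient)       # fine: the list is non-empty

try:
    max(strict)                   # max() on an empty list raises ValueError
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

With a stricter cutoff, more queries can end up with no surviving hits, which would be consistent with the error only appearing at higher "length_percentage" values.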

Here is how I fixed it: replace lines 351-357 of taxonomy_assignment_BLAST.py (as of 22 May 2021) with the following:

if current_best_hits and current_query:  # ensure there is a set of BLAST hits to parse
    best_level_taxonomy, blast_percent = Assign_Taxonomy(current_query, current_best_hits)

    # Finally, fill in the sequence information
    total_sequences_assigned += 1
    sequence_taxonomy_dict[current_query] = best_level_taxonomy
    percent_id_dict[current_query] = blast_percent
    log_and_print('Taxonomy Assignment for ' + current_query + ' = ' + ':'.join(best_level_taxonomy) + '\n\n\n######')

I think line 351 of "taxonomy_assignment_BLAST.py" corresponds to line 363 in your version of "taxonomy_assignment_BLAST_V2.py".
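For anyone who would rather guard at the point of failure instead, a minimal sketch of the same idea follows. It assumes Python 3.4 or later (for the `default` argument of `max()`). `best_identity` and its arguments are hypothetical names for illustration, not functions from the script.

```python
def best_identity(top_hits, cutoff):
    """Return the best percent identity if it meets the cutoff, else None."""
    # default=None makes max() return None instead of raising on an empty list.
    best = max(top_hits, default=None)
    if best is None or best < cutoff:
        return None
    return best
```

A caller would then treat `None` as "no assignment at this level" rather than letting the exception end the run.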