ZeroDivisionError: float division by zero #428

Open ivaneskos opened 1 month ago

ivaneskos commented 1 month ago


ZeroDivisionError: float division by zero


I am trying to use the average_nucleotide_identity.py script on a dataset of 2000 short (~44kb) genomes. My command was:

average_nucleotide_identity.py -i $all_samples_fasta -o $ani_analysis -m ANIb -f -g -l $log_file -v --workers 160 --seed 22 --gformat png,pdf

I get the following error (this is the end of the log file; all preceding lines are just the BLAST commands):

INFO: Command pool done.
WARNING: At least one BLAST run failed. ANIb may fail.
INFO: Processing pairwise ANIb BLAST output.
ERROR: One or more BLAST output files has a problem.
ERROR: This is possibly due to BLASTN run failure, please investigate
ERROR: Traceback (most recent call last):
  File "/home/ivan/anaconda3/envs/pyani_env/bin/average_nucleotide_identity.py", line 727, in unified_anib
    data = anib.process_blast(
           ^^^^^^^^^^^^^^^^^^^
  File "/home/ivan/anaconda3/envs/pyani_env/lib/python3.12/site-packages/pyani/anib.py", line 444, in process_blast
    query_cover = float(resultvals[0]) / org_lengths[qname]
ZeroDivisionError: float division by zero

INFO: Compressing/deleting /legserv/Temp/Ivan/Zymo_2000_genomes/ani_files/ANIb_output_500_attempt/blastn_output
INFO:   Compressing output from /legserv/Temp/Ivan/Zymo_2000_genomes/ani_files/ANIb_output_500_attempt/blastn_output to /legserv/Temp/Ivan/Zymo_2000_genomes/ani_files/ANIb_output_500_attempt/blastn_output.tar.gz
INFO:   Removing output directory /legserv/Temp/Ivan/Zymo_2000_genomes/ani_files/ANIb_output_500_attempt/blastn_output

I didn't get this error on a subset of 100 genomes, but I get the same error on larger subsets (I've tested 500 and 1000 genomes).

#### pyani Version:
pyani 0.2.12

#### Python Version:
Python 3.8.19

#### Operating System:
Ubuntu 20.04.6 LTS

kiepczi commented 1 month ago

Hi @ivaneskos,

Thank you for your interest in pyANI.

Unfortunately, from the error message alone, I am unable to provide a definite answer. Could you please provide me with a small dataset that produces this error? This would help me investigate the issue in more detail.

In the meantime, I can think of a few reasons why the analysis might be resulting in this error:



ivaneskos commented 1 month ago

Hi @kiepczi! Thank you for your quick answer! Here is the smallest subset that produced the error (500 genomes of about 45kb each). Thanks a lot for helping me!!

500_genomes.tar.gz

widdowquinn commented 3 weeks ago

Hi @ivaneskos - many thanks for the example data, but would it be possible to reduce the number of genomes you're sending and provide a minimal example that replicates the issue?
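
One possible way to narrow the dataset down: the traceback shows the division is by org_lengths[qname], so an input whose total sequence length is zero might be involved. The sketch below is only a guess at a useful check, not part of pyani. The function names and the list of file suffixes are hypothetical, and it uses only the Python standard library to list FASTA files in a directory that contain no sequence characters.

```python
from pathlib import Path

# Assumption: illustrative suffixes only; adjust to match your input files.
FASTA_SUFFIXES = (".fna", ".fa", ".fasta", ".fas")


def fasta_total_length(path):
    """Return the total number of sequence characters in a FASTA file."""
    total = 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines and header lines (those starting with ">").
            if line and not line.startswith(">"):
                total += len(line)
    return total


def find_zero_length_genomes(directory):
    """Return sorted names of FASTA files whose total sequence length is 0."""
    return sorted(
        p.name
        for p in Path(directory).iterdir()
        if p.suffix in FASTA_SUFFIXES and fasta_total_length(p) == 0
    )
```

Calling find_zero_length_genomes() on the input directory returns a list of file names. If it is empty, the zero length probably comes from somewhere else (for example a failed BLAST run), and this check does not rule that out.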

Many thanks,
