zjshi / Maast

Microbial agile accurate SNP Typer
MIT License

Division by zero error #6

Closed: snayfach closed this issue 2 years ago

snayfach commented 2 years ago

Cmd: maast end_to_end --in-dir fna/100 --out-dir maast/100 --threads 64

Error:

reference genome path: fna/100/GCF_000797085.1.fna

b'' b"mash: Relink /global/u1/s/snayfach/.conda/envs/maast/bin/../lib/./libgfortran.so.5' with/lib64/librt.so.1' for IFUNC symbol clock_gettime'\nSketching fna/100/GCF_900145835.1.fna...\nSketching fna/100/GCF_003834195.1.fna...\nSketching fna/100/GCF_003974345.1.fna...\nSketching fna/100/GCF_013169255.1.fna...\nSketching fna/100/GCF_000795535.1.fna...\nSketching fna/100/GCF_000480555.1.fna...\nSketching fna/100/GCF_000481005.1.fna...\nSketching fna/100/GCF_002330005.1.fna...\nSketching fna/100/GCF_002135995.1.fna...\nSketching fna/100/GCF_000629445.1.fna...\nSketching fna/100/GCF_001836415.1.fna...\nSketching fna/100/GCF_001181325.1.fna...\nSketching fna/100/GCF_001035645.1.fna...\nSketching fna/100/GCF_003936385.1.fna...\nSketching fna/100/GCF_900147275.1.fna...\nSketching fna/100/GCF_003631075.1.fna...\nSketching fna/100/GCF_004373425.1.fna...\nSketching fna/100/GCF_004372645.1.fna...\nSketching fna/100/GCF_001373655.1.fna...\nSketching fna/100/GCF_001063435.1.fna...\nSketching fna/100/GCF_000506805.1.fna...\nSketching fna/100/GCF_003836165.1.fna...\nSketching fna/100/GCF_900455405.1.fna...\nSketching fna/100/GCF_000793365.1.fna...\nSketching fna/100/GCF_003834585.1.fna...\nSketching fna/100/GCF_001023775.1.fna...\nSketching fna/100/GCF_013169305.1.fna...\nSketching fna/100/GCF_900637045.1.fna...\nSketching fna/100/GCF_013625685.1.fna...\nSketching fna/100/GCF_001451125.1.fna...\nSketching fna/100/GCF_013178575.1.fna...\nSketching fna/100/GCF_002201295.1.fna...\nSketching fna/100/GCF_003412155.1.fna...\nSketching fna/100/GCF_003631845.1.fna...\nSketching fna/100/GCF_006704975.1.fna...\nSketching fna/100/GCF_000817865.1.fna...\nSketching fna/100/GCF_001414105.1.fna...\nSketching fna/100/GCF_006704475.1.fna...\nSketching fna/100/GCF_003835445.1.fna...\nSketching fna/100/GCF_001034855.1.fna...\nSketching fna/100/GCF_000795085.1.fna...\nSketching fna/100/GCF_000348745.1.fna...\nSketching fna/100/GCF_014204425.1.fna...\nSketching 
fna/100/GCF_000791705.1.fna...\nSketching fna/100/GCF_000796125.1.fna...\nSketching fna/100/GCA_009995625.1.fna...\nSketching fna/100/GCF_000481925.1.fna...\nSketching fna/100/GCF_900147865.1.fna...\nSketching fna/100/GCF_900144525.1.fna...\nSketching fna/100/GCF_003957095.1.fna...\nSketching fna/100/GCF_001037165.1.fna...\nSketching fna/100/GCF_001451255.1.fna...\nSketching fna/100/GCF_000795945.1.fna...\nSketching fna/100/GCF_000796365.1.fna...\nSketching fna/100/GCA_009996055.1.fna...\nSketching fna/100/GCF_003840555.1.fna...\nSketching fna/100/GCF_900147075.1.fna...\nSketching fna/100/GCF_003973815.1.fna...\nSketching fna/100/GCF_900144735.1.fna...\nSketching fna/100/GCF_000793045.1.fna...\nSketching fna/100/GCF_000790685.1.fna...\nSketching fna/100/GCF_900707695.1.fna...\nSketching fna/100/GCF_001554935.1.fna...\nSketching fna/100/GCF_001375475.1.fna...\nSketching fna/100/GCF_001451975.1.fna...\nSketching fna/100/GCF_002136005.1.fna...\nSketching fna/100/GCF_002135795.1.fna...\nSketching fna/100/GCF_001373875.1.fna...\nSketching fna/100/GCF_900707045.1.fna...\nSketching fna/100/GCF_003798145.1.fna...\nSketching fna/100/GCF_001837545.1.fna...\nSketching fna/100/GCF_002406265.1.fna...\nSketching fna/100/GCF_003833395.1.fna...\nSketching fna/100/GCF_000796505.1.fna...\nSketching fna/100/GCF_013620385.1.fna...\nSketching fna/100/GCF_004372905.1.fna...\nSketching fna/100/GCF_004371145.1.fna...\nSketching fna/100/GCF_001035945.1.fna...\nSketching fna/100/GCF_002285355.1.fna...\nSketching fna/100/GCF_009299585.1.fna...\nSketching fna/100/GCF_003973785.1.fna...\nSketching fna/100/GCF_002330595.1.fna...\nSketching fna/100/GCF_004370835.1.fna...\nSketching fna/100/GCF_003834885.1.fna...\nSketching fna/100/GCF_002330275.1.fna...\nSketching fna/100/GCF_000796825.1.fna...\nSketching fna/100/GCF_000791605.1.fna...\nSketching fna/100/GCF_001037055.1.fna...\nSketching fna/100/GCF_001554625.1.fna...\nSketching fna/100/GCF_003321505.1.fna...\nSketching 
fna/100/GCF_003975435.1.fna...\nSketching fna/100/GCF_003410695.2.fna...\nSketching fna/100/GCF_000797085.1.fna...\nSketching fna/100/GCF_004375535.1.fna...\nSketching fna/100/GCF_001181525.1.fna...\nSketching fna/100/GCF_008579445.1.fna...\nSketching fna/100/GCF_000520175.1.fna...\nSketching fna/100/GCF_003969635.1.fna...\nSketching fna/100/GCF_003835665.1.fna...\nSketching fna/100/GCF_003837285.1.fna...\nSketching fna/100/GCF_004351715.1.fna...\nWriting to maast/100/temp/mash/100/mash_sketch.msh...\n"[calculating mash distance]: start b'' b"mash: Relink/global/u1/s/snayfach/.conda/envs/maast/bin/../lib/./libgfortran.so.5' with /lib64/librt.so.1' for IFUNC symbolclock_gettime'\n"[cut mash distance: 0.05]: start b'' b''[clustering] start [clustering] done [clustering] 96 genomes have been included in clusters [Searching lower cap] [clustering] start [clustering] done [clustering] 33 genomes have been included in clusters 0.002: 77 tag genomes [clustering] start [clustering] done [clustering] 12 genomes have been included in clusters 0.0002: 93 tag genomes [clustering] start [clustering] done [clustering] 2 genomes have been included in clusters 2e-05: 100 tag genomes [End earching] [Searching optimal d-cut] [Searching optimal d-cut] [clustering] start [clustering] done fna/100/GCF_001554935.1.fna Running mummer4; start reference genome path: fna/100/GCF_001554935.1.fna

GCF_001554935.1 - GCF_900145835.1 GCF_001554935.1 - GCF_003834195.1 GCF_001554935.1 - GCF_003974345.1 GCF_001554935.1 - GCF_013169255.1 GCF_001554935.1 - GCF_000795535.1 GCF_001554935.1 - GCF_000480555.1 GCF_001554935.1 - GCF_000481005.1 GCF_001554935.1 - GCF_002330005.1 GCF_001554935.1 - GCF_002135995.1 GCF_001554935.1 - GCF_000629445.1 GCF_001554935.1 - GCF_001836415.1 GCF_001554935.1 - GCF_001181325.1 GCF_001554935.1 - GCF_001035645.1 GCF_001554935.1 - GCF_003936385.1 GCF_001554935.1 - GCF_900147275.1 GCF_001554935.1 - GCF_003631075.1 GCF_001554935.1 - GCF_004373425.1 GCF_001554935.1 - GCF_004372645.1 GCF_001554935.1 - GCF_001373655.1 GCF_001554935.1 - GCF_001063435.1 GCF_001554935.1 - GCF_000506805.1 GCF_001554935.1 - GCF_003836165.1 GCF_001554935.1 - GCF_900455405.1 GCF_001554935.1 - GCF_000793365.1 GCF_001554935.1 - GCF_003834585.1 GCF_001554935.1 - GCF_001023775.1 GCF_001554935.1 - GCF_013169305.1 GCF_001554935.1 - GCF_900637045.1 GCF_001554935.1 - GCF_013625685.1 GCF_001554935.1 - GCF_001451125.1 GCF_001554935.1 - GCF_013178575.1 GCF_001554935.1 - GCF_002201295.1 GCF_001554935.1 - GCF_003412155.1 GCF_001554935.1 - GCF_003631845.1 GCF_001554935.1 - GCF_006704975.1 GCF_001554935.1 - GCF_000817865.1 GCF_001554935.1 - GCF_001414105.1 GCF_001554935.1 - GCF_006704475.1 GCF_001554935.1 - GCF_003835445.1 GCF_001554935.1 - GCF_001034855.1 GCF_001554935.1 - GCF_000795085.1 GCF_001554935.1 - GCF_000348745.1 GCF_001554935.1 - GCF_014204425.1 GCF_001554935.1 - GCF_000791705.1 GCF_001554935.1 - GCF_000796125.1 GCF_001554935.1 - GCA_009995625.1 GCF_001554935.1 - GCF_000481925.1 GCF_001554935.1 - GCF_900147865.1 GCF_001554935.1 - GCF_900144525.1 GCF_001554935.1 - GCF_003957095.1 GCF_001554935.1 - GCF_001037165.1 GCF_001554935.1 - GCF_001451255.1 GCF_001554935.1 - GCF_000795945.1 GCF_001554935.1 - GCF_000796365.1 GCF_001554935.1 - GCA_009996055.1 GCF_001554935.1 - GCF_003840555.1 GCF_001554935.1 - GCF_900147075.1 GCF_001554935.1 - GCF_003973815.1 GCF_001554935.1 - 
GCF_900144735.1 GCF_001554935.1 - GCF_000793045.1 GCF_001554935.1 - GCF_000790685.1 GCF_001554935.1 - GCF_900707695.1 GCF_001554935.1 - GCF_001554935.1 GCF_001554935.1 - GCF_001375475.1 GCF_001554935.1 - GCF_001451975.1 GCF_001554935.1 - GCF_002136005.1 GCF_001554935.1 - GCF_002135795.1 GCF_001554935.1 - GCF_001373875.1 GCF_001554935.1 - GCF_900707045.1 GCF_001554935.1 - GCF_003798145.1 GCF_001554935.1 - GCF_001837545.1 GCF_001554935.1 - GCF_002406265.1 GCF_001554935.1 - GCF_003833395.1 GCF_001554935.1 - GCF_000796505.1 GCF_001554935.1 - GCF_013620385.1 GCF_001554935.1 - GCF_004372905.1 GCF_001554935.1 - GCF_004371145.1 GCF_001554935.1 - GCF_001035945.1 GCF_001554935.1 - GCF_002285355.1 GCF_001554935.1 - GCF_009299585.1 GCF_001554935.1 - GCF_003973785.1 GCF_001554935.1 - GCF_002330595.1 GCF_001554935.1 - GCF_004370835.1 GCF_001554935.1 - GCF_003834885.1 GCF_001554935.1 - GCF_002330275.1 GCF_001554935.1 - GCF_000796825.1 GCF_001554935.1 - GCF_000791605.1 GCF_001554935.1 - GCF_001037055.1 GCF_001554935.1 - GCF_001554625.1 GCF_001554935.1 - GCF_003321505.1 GCF_001554935.1 - GCF_003975435.1 GCF_001554935.1 - GCF_003410695.2 GCF_001554935.1 - GCF_000797085.1 GCF_001554935.1 - GCF_004375535.1 GCF_001554935.1 - GCF_001181525.1 GCF_001554935.1 - GCF_008579445.1 GCF_001554935.1 - GCF_000520175.1 GCF_001554935.1 - GCF_003969635.1 GCF_001554935.1 - GCF_003835665.1 GCF_001554935.1 - GCF_003837285.1 GCF_001554935.1 - GCF_004351715.1

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/global/homes/s/snayfach/.conda/envs/maast/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/global/u1/s/snayfach/.conda/envs/maast/bin/bin/Maast.py", line 438, in run_mummer4_single
    min_pid_by_delta = auto_min_pid_by_delta(coords_path, max_pid_delta)
  File "/global/u1/s/snayfach/.conda/envs/maast/bin/bin/Maast.py", line 397, in auto_min_pid_by_delta
    avg_pid = sum(pids)/len(pids)
ZeroDivisionError: division by zero
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/global/u1/s/snayfach/.conda/envs/maast/bin/bin/Maast.py", line 1320, in <module>
    main()
  File "/global/u1/s/snayfach/.conda/envs/maast/bin/bin/Maast.py", line 1315, in main
    end2end_main(args)
  File "/global/u1/s/snayfach/.conda/envs/maast/bin/bin/Maast.py", line 1282, in end2end_main
    call_snps_main(args)
  File "/global/u1/s/snayfach/.conda/envs/maast/bin/bin/Maast.py", line 1151, in call_snps_main
    run_mummer4(args)
  File "/global/u1/s/snayfach/.conda/envs/maast/bin/bin/Maast.py", line 487, in run_mummer4
    parallel(run_mummer4_single, arg_list, args['threads'])
  File "/global/u1/s/snayfach/.conda/envs/maast/bin/bin/Maast.py", line 259, in parallel
    return [r.get() for r in results]
  File "/global/u1/s/snayfach/.conda/envs/maast/bin/bin/Maast.py", line 259, in <listcomp>
    return [r.get() for r in results]
  File "/global/homes/s/snayfach/.conda/envs/maast/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
ZeroDivisionError: division by zero

snayfach commented 2 years ago

I suspect the cause is that MUMmer4 is running into errors. Looking in the output directory maast/100/temp/mummer4/100/aln/, I see that the diff file is missing for 3 of my 101 genomes, but the log file doesn't contain anything useful.
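For reference, the traceback above fails at `avg_pid = sum(pids)/len(pids)`, which raises `ZeroDivisionError` whenever `pids` is empty, as it presumably would be if an alignment produced no usable rows. Below is a minimal sketch of one way to guard that step. It is not the fix that was pushed to Maast; the helper name `mean_pid` and the choice to return `None` are assumptions made for illustration.

```python
from typing import Optional, Sequence


def mean_pid(pids: Sequence[float]) -> Optional[float]:
    """Return the mean percent identity, or None if there are no values.

    Hypothetical helper: the caller decides what to do with None,
    e.g. skip the genome and log a warning instead of crashing.
    """
    if not pids:
        return None
    return sum(pids) / len(pids)


if __name__ == "__main__":
    # An empty list no longer raises ZeroDivisionError.
    print(mean_pid([]))
    print(mean_pid([98.0, 100.0]))
```

Returning `None` rather than a default number keeps "no alignments" distinguishable from "low identity", so the caller can report which genome failed.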

snayfach commented 2 years ago

OK, I found the issue. It turns out that my test input genomes were not all from the same species. The program should be able to detect this from the all-vs-all Mash distances, for example by removing genomes whose Mash distance to the reference used for SNP calling is >0.15 (the threshold could be a command-line option). In my case the Mash distance was 1.0 because the genome was so different from the rest.
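The filtering idea could be sketched roughly as follows. This assumes the tab-separated output of `mash dist` (reference ID, query ID, distance, p-value, shared hashes) is available as text lines. The function name `split_by_mash_distance`, the file names, and the distances in the example are all made up for illustration, and this is not how Maast implements it.

```python
from typing import Iterable, List, Tuple


def split_by_mash_distance(
    dist_lines: Iterable[str],
    reference: str,
    max_dist: float = 0.15,
) -> Tuple[List[str], List[str]]:
    """Partition query genomes into (kept, dropped) by distance to `reference`.

    Each line is expected to hold tab-separated fields:
    reference ID, query ID, Mash distance, p-value, shared hashes.
    Rows for other references and malformed rows are ignored.
    """
    kept: List[str] = []
    dropped: List[str] = []
    for line in dist_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[0] != reference:
            continue
        query_id, dist = fields[1], float(fields[2])
        if dist <= max_dist:
            kept.append(query_id)
        else:
            dropped.append(query_id)
    return kept, dropped


if __name__ == "__main__":
    # Made-up rows; a distance of 1.0 marks an unrelated genome.
    rows = [
        "ref.fna\ta.fna\t0.01\t0\t800/1000",
        "ref.fna\toutlier.fna\t1\t1\t0/1000",
    ]
    kept, dropped = split_by_mash_distance(rows, "ref.fna")
    print("kept:", kept)
    print("dropped:", dropped)
```

Reporting the dropped genomes, rather than silently skipping them, would make a mixed-species input easy to spot before MUMmer4 runs.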

zjshi commented 2 years ago

Stephen, thanks for both reporting and tracing the problem! I pushed a fix for this issue, but I have not tested it. Please reopen the issue if the problem is still there.