eead-csic-compbio / get_homologues

GET_HOMOLOGUES: a versatile software package for pan-genome analysis

# parsing blast result! Illegal division by zero at /lib/marfil_homology.pm line 1041, <BLASTOUT> line 2. #15

Closed fhsantanna closed 6 years ago

fhsantanna commented 7 years ago

I am trying to make a pangenome matrix. The command "./get_homologues.pl -d genomas_rev19072017 -t 0 -M" gives the following log:

# ./get_homologues.pl -i 0 -d genomas_rev19072017 -o 0 -e 0 -f 0 -r 0 -t 0 -c 0 -z 0 -I 0 -m local -n 2 -M 1 -G 0 -P 0 -C 75 -S 1 -E 1e-05 -F 1.5 -N 0 -B 50 -b 0 -s 0 -D 0 -g 0 -a '0' -x 0 -R 0 -A 0

# version 07112016
# results_directory=/opt/get_homologues-x86_64-20161107/genomas_rev19072017_homologues
# parameters: MAXEVALUEBLASTSEARCH=0.01 MAXPFAMSEQS=250 BATCHSIZE=100 KEEPSCNDHSPS=1

# checking input files...
# P_HW567.gbk 5363
# P_graminis_DSM_15220.gbk 5584
# P_jilunlii.gbk 5766
# P_polymyxa_ATCC_842.gbk 5068
# P_riograndensis_CAR114.gbk 6098
# P_riograndensis_CAS34.gbk 5969
# P_riograndensis_LN831776.gbk 6705
# P_sonchi_X19-5.gbk 7117
# Paenibacillus_borealis.gbk 6698
# Paenibacillus_durus.gbk 5140
# Paenibacillus_durus_ATCC_35681.gbk 4992
# Paenibacillus_forsythiae_T98.gbk 4193
# Paenibacillus_odorifer.gbk 5752
# Paenibacillus_sabinae_T27.gbk 4634
# Paenibacillus_stellifer.gbk 4899
# Paenibacillus_wynnii.gbk 5282
# Paenibacillus_zanthoxyli_JH29.gbk 4372

# 17 genomes, 93632 sequences

# taxa considered = 17 sequences = 93632 residues = 29545143 MIN_BITSCORE_SIM = 19.9

# mask=PaenibacillusforsythiaeT98_f0_0taxa_algOMCLe0 (_algOMCL)

# skipped genome parsing (genomas_rev19072017_homologues/tmp/selected.genomes)

# running BLAST searches ... # done

# parsing blast result! (/opt/get_homologues-x86_64-20161107/genomas_rev19072017_homologues/tmp/all.blast , 1.1e+03MB)
Illegal division by zero at /opt/get_homologues-x86_64-20161107/lib/marfil_homology.pm line 1041, <BLASTOUT> line 2.

Any idea what is going on? The sample files (Buchnera) ran fine. Version: 20161107

eead-csic-compbio commented 7 years ago

Hi, can you please post the first few lines of file /opt/get_homologues-x86_64-20161107/genomas_rev19072017_homologues/tmp/all.blast ?
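Since all.blast is reported as about 1.1e+03MB in the log above, it is probably easier to read just the requested lines than to open the whole file. A minimal Python sketch follows; `peek` is a hypothetical helper name, not part of GET_HOMOLOGUES, and it takes any iterable of lines, such as an open file handle.

```python
from itertools import islice


def peek(lines, start=1, count=10):
    """Return `count` lines beginning at 1-based line number `start`.

    `lines` can be any iterable of strings, for example an open file.
    Only the lines up to the requested range are read.
    """
    if start < 1 or count < 0:
        raise ValueError("start must be >= 1 and count must be >= 0")
    window = islice(lines, start - 1, start - 1 + count)
    return [line.rstrip("\n") for line in window]


# Example use (the path is the one quoted in this thread):
# with open("genomas_rev19072017_homologues/tmp/all.blast") as handle:
#     for row in peek(handle, start=1, count=10):
#         print(row)
```

Passing a larger `start` returns the lines around a line number quoted in an error message, although reaching a very late line still means reading through everything before it.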

eead-csic-compbio commented 1 year ago

Will look into this at the end of the week

On 7 Jul 2023 21:59, CarolinaNolasco @.***> wrote:

Hi, I have the same issue. The command

sudo perl /home/carlos/Descargas/get_homologues-x86_64-20230515/get_homologues.pl -n 12 -d $gbk_dir

generates the log:

/home/carlos/Descargas/get_homologues-x86_64-20230515/get_homologues.pl -i 0 -d /home/carlos/GBK/GET-T2/GBK_Select -o 0 -X 0 -e 0 -f 0 -r 0 -t all -c 0 -z 0 -I 0 -m local -n 12 -M 0 -G 0 -p 0 -C 75 -S 1 -E 1e-05 -F 1.5 -N 0 -B 50 -b 0 -s 0 -D 0 -g 0 -a '0' -x 0 -R 0 -A 0 -P 0

version 15052022
results_directory=/mnt/9a2c68b8-6327-41bd-9488-090bee28e30f/GBK_Select_homologues
parameters: MAXEVALUEBLASTSEARCH=0.01 MAXPFAMSEQS=250 BATCHSIZE=100 KEEPSCNDHSPS=1
diamond job:0

checking input files...
HIMFG1.gbk 6453
HIMFG2.gbk 6533
HIMFG3.gbk 6292
HIMFG4.gbk 6459
HIMFG5.gbk 6395
HIMFG6.gbk 6369
HIMFG7.gbk 6657
HIMFG8.gbk 5995
PA_ST111_0504_S17.gbk 6529
PA_ST111_095M0019.gbk 6259
PA_ST111_1.gbk 6691
PA_ST111_235M0162.gbk 6223
PA_ST111_34Pae36.gbk 7120
PA_ST111_3796_S15.gbk 6551
PA_ST111_A-I-1.gbk 6521
PA_ST111_AG1.gbk 6621
PA_ST111_AR445.gbk 6562
PA_ST111_AR_0241.gbk 6642
PA_ST111_AZPAE14886.gbk 6507
PA_ST111_AZPAE15002.gbk 6242
........

201 genomes, 1278496 sequences

taxa considered = 201 sequences = 1278496 residues = 408042112 MIN_BITSCORE_SIM = 21.8

mask=PAST111UMB0740_f0_alltaxa_algBDBHe0 (_algBDBH)

skipped genome parsing (GBK_Select_homologues/tmp/selected.genomes)

running BLAST searches ... done

concatenating and sorting BLAST/DIAMOND results...
sorting _HIMFG1.gbk results (5.8e+02MB)
sorting _HIMFG2.gbk results (5.8e+02MB)
sorting _HIMFG3.gbk results (6e+02MB)
sorting _HIMFG4.gbk results (6e+02MB)
.............

parsing blast result! (/mnt/9a2c68b8-6327-41bd-9488-090bee28e30f/GBK_Select_homologues/tmp/all.blast , 1.2e+05MB)
Illegal division by zero at /home/carlos/Descargas/get_homologues-x86_64-20230515/lib/marfil_homology.pm line 1122, line 908871177.

And the first 10 lines of the file /mnt/9a2c68b8-6327-41bd-9488-090bee28e30f/GBK_Select_homologues/tmp/all.blast:

1 1 100.000 174 174 174 1 174 1 174 3.76e-121 341
1 957974 98.810 168 174 176 1 168 1 168 2.61e-112 318
1 677407 99.408 169 174 1246 1 169 1 169 1.10e-103 325
1 6413 99.408 169 174 1280 1 169 1 169 2.06e-103 324
1 8369 99.408 169 174 1280 1 169 1 169 2.06e-103 324
1 682435 99.408 169 174 1280 1 169 1 169 2.06e-103 324
1 705337 99.408 169 174 1280 1 169 1 169 2.06e-103 324
1 733986 99.408 169 174 1280 1 169 1 169 2.06e-103 324
1 751292 99.408 169 174 1280 1 169 1 169 2.06e-103 324
1 819317 99.408 169 174 1280 1 169 1 169 2.06e-103 324

What can I do? :(


eead-csic-compbio commented 1 year ago

Hi @CarolinaNolasco, indeed this seems to be your case as well.

To be sure we would need to see line 908871177 of file /mnt/9a2c68b8-6327-41bd-9488-090bee28e30f/GBK_Select_homologues/tmp/all.blast , but my guess is that a BLASTN job did not complete successfully and when concatenated to the other BLASTN result files caused this error.

To fix it you would need to remove the culprit file, which can be identified from line 908871177, and then re-run. This way all the sane files should be re-used. Hope this helps, Bruno
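If the offending line number is not known, or there may be more than one bad row, scanning all.blast for rows that do not look like the others can help narrow things down. The sketch below is only a guess at what "broken" means here. It assumes rows have the 12 whitespace-separated columns seen in the sample posted in this thread, that columns 3 to 12 are numeric, and that columns 4 to 6 are lengths that should never be zero. The thread does not show which value marfil_homology.pm actually divides by, and `find_suspect_rows` and `check_numbers` are hypothetical names, not part of GET_HOMOLOGUES.

```python
def check_numbers(fields):
    """Return a reason string if the numeric columns look wrong, else None."""
    try:
        values = [float(field) for field in fields[2:]]
    except ValueError:
        return "non-numeric value in columns 3-12"
    # Assumption: columns 4-6 hold alignment, query and subject lengths.
    if any(value == 0 for value in values[1:4]):
        return "zero in columns 4-6"
    return None


def find_suspect_rows(lines, expected_fields=12, limit=20):
    """Return up to `limit` tuples of (line_number, reason, text)."""
    suspects = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if len(fields) != expected_fields:
            reason = f"{len(fields)} fields, expected {expected_fields}"
        else:
            reason = check_numbers(fields)
        if reason:
            suspects.append((number, reason, line.rstrip("\n")))
            if len(suspects) >= limit:
                break
    return suspects


# Example use, streaming the file so it is never held in memory:
# with open("GBK_Select_homologues/tmp/all.blast") as handle:
#     for number, reason, text in find_suspect_rows(handle):
#         print(number, reason, text, sep="\t")
```

A truncated row from an interrupted job would show up as a row with fewer than 12 fields. The scan stops after `limit` hits so that a badly damaged file does not flood the terminal.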

eead-csic-compbio commented 1 year ago

If you can still find that file, please share that part; it might be useful for patching the code.