GaoLei-bio / SV_calling

SV calling & filtering

float division by zero error #1

Open slmcevoy opened 3 years ago

slmcevoy commented 3 years ago

Hi - I would like to report an error I received while running the SV_PacBio.sh step:

Traceback (most recent call last):
  File "/projects/EBP/Wegrzyn/fagr/sv/SV_calling/SV_combine_pbsv.py", line 798, in <module>
    Ref_pbsv_IDs,Qry_pbsv_IDs = between_pbsv_check(cells,ref_pbsv_by_chr,query_pbsv_by_chr)
  File "/projects/EBP/Wegrzyn/fagr/sv/SV_calling/SV_combine_pbsv.py", line 237, in between_pbsv_check
    Qry_pbsv_IDs = Repeat_sv_INS(Qry_chr,Qry_start,Qry_end,Ref_Size,Qry_Size,query_pbsv_by_chr)
  File "/projects/EBP/Wegrzyn/fagr/sv/SV_calling/SV_combine_pbsv.py", line 296, in Repeat_sv_INS
    if pbsv_size * 100.0 / INS_Size < 120:
ZeroDivisionError: float division by zero

Looking through the code, I can see that this happens in the between_pbsv_check function, in the 'else' case that covers SVs identified as Substitution types, when the ref and qry gaps are the same size. Here is one example from our data:

fagr_allmaps_0001ch01 7145375 7145393 Assemblytics_b_64837 0 + Substitution 18 18 ch01:11086407-11086425:+ between_alignments Reverse

I was wondering if you have a recommended solution. I can add an if statement checking that INS_Size is not 0 before the check "if pbsv_size * 100.0 / INS_Size < 120:", but then those SVs are excluded.
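
For illustration, a minimal sketch of such a guard. The helper name passes_size_check and its parameters are hypothetical and do not exist in SV_combine_pbsv.py; only the expression pbsv_size * 100.0 / INS_Size < 120 is taken from the traceback.

```python
def passes_size_check(pbsv_size, ins_size, max_percent=120):
    """Return True when pbsv_size is below max_percent percent of ins_size.

    Hypothetical helper (not part of SV_combine_pbsv.py). It returns False
    instead of raising ZeroDivisionError when ins_size is 0, so SVs with a
    zero INS_Size are simply skipped by the caller.
    """
    if ins_size == 0:
        return False  # cannot compute a ratio; treat as "no match"
    return pbsv_size * 100.0 / ins_size < max_percent
```

With this sketch, a call such as passes_size_check(18, 0) returns False rather than crashing, which is exactly the exclusion behavior described above.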

Thank you!

GaoLei-bio commented 3 years ago

Hi, Thanks for bringing this issue up.

I agree that adding an if statement is a provisional solution for this problem. You may also want to try adding an if statement at line 288 so that INS_Size is never 0. For a Substitution whose ref and qry gaps are the same size, we can take the ref or qry gap size as INS_Size.
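
A rough sketch of that fallback, under stated assumptions: the helper name effective_ins_size is hypothetical, and the sketch assumes INS_Size is derived from the difference between the qry and ref gap sizes, which is not confirmed by the code shown in this thread.

```python
def effective_ins_size(ref_size, qry_size):
    """Pick a non-zero size to use as INS_Size where possible.

    Hypothetical helper. ASSUMPTION: INS_Size is normally the absolute
    difference between the query and reference gap sizes.
    """
    ins_size = abs(qry_size - ref_size)
    if ins_size == 0:
        # Equal gaps (e.g. a Substitution with sizes 18 and 18):
        # fall back to the gap size itself, as suggested above.
        ins_size = ref_size
    return ins_size
```

For the example record with gap sizes 18 and 18, this returns 18, so the later division no longer fails. If both gaps are 0 the result is still 0, so a zero check before the division remains necessary.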

I’m sorry for the inconvenience. Let me know if you have any other questions.

Best wishes, Lei