gem-pasteur / Integron_Finder

Bioinformatics tool to find integrons in bacterial genomes
GNU General Public License v3.0

IndexError attc.py #71

Closed jfourquet2 closed 3 years ago

jfourquet2 commented 5 years ago

Version of Integron_Finder:

integron_finder version 2-2019-11-15 Using:

Dependencies: infernal-1.1.2, hmmer-3.2.1, Prodigal-2.6.3

OS

Linux

Expected behavior

I wanted to run integron_finder on 4 samples. It worked well for 3 samples but failed on 1. I extracted the contig where the error occurred and ran integron_finder again on this contig alone. The same error occurred (see the logs at the end of the issue). My command line is:

integron_finder --local-max --keep-tmp --union-integrases Sample_b_contig_b_16693.fasta --outdir Sample_b_contig_b_16693_v2_union_int --pdf

Actual behavior

integron_finder fails on 1 of the 4 samples with an IndexError raised in attc.py during the local_max search.

Steps to reproduce behavior

Run the following command on the FASTA file Sample_b_contig_b_16693.txt:

integron_finder --local-max --keep-tmp --union-integrases Sample_b_contig_b_16693.txt --outdir Sample_b_contig_b_16693_v2_union_int --pdf

Relevant logs and/or screenshots

command used: integron_finder --local-max --keep-tmp --union-integrases Sample_b_contig_b_16693.fasta --outdir Sample_b_contig_b_16693_v2_union_int --pdf
                     =======================
INFO     :  ############ Processing replicon b_16693 (1/1) ############

INFO     :  Starting Default search ... :
INFO     :  Default search done... : 
INFO     :  In replicon b_16693, there are:
INFO     :  - 1 complete integron(s) found with a total 1 attC site(s)
INFO     :  - 0 CALIN element(s) found with a total of 0 attC site(s)
INFO     :  - 0 In0 element(s) found with a total of 0 attC site
INFO     :  Starting search with local_max...:
Traceback (most recent call last):
  File "/work/project/antiselfish/integronFinder/Integron_Finder/bin/integron_finder", line 11, in <module>
    load_entry_point('integron-finder===2-2019-11-13', 'console_scripts', 'integron_finder')()
  File "/work/project/antiselfish/integronFinder/Integron_Finder/lib/python3.6/site-packages/integron_finder/scripts/finder.py", line 601, in main
    integron_res, summary = find_integron_in_one_replicon(replicon, config)
  File "/work/project/antiselfish/integronFinder/Integron_Finder/lib/python3.6/site-packages/integron_finder/scripts/finder.py", line 339, in find_integron_in_one_replicon
    evalue_attc=config.evalue_attc)
  File "/work/project/antiselfish/integronFinder/Integron_Finder/lib/python3.6/site-packages/integron_finder/attc.py", line 213, in find_attc_max
    go_left = (full_element[full_element.type_elt == "attC"].pos_beg.values[0] - df_max.pos_end.values[0]
IndexError: index 0 is out of bounds for axis 0 with size 0

Thanks a lot for your help!

Attachment: Sample_b_contig_b_16693.txt

jeanrjc commented 5 years ago

This happens because, after searching with local_max, the e-value of the previously found attC site went above the threshold. If you add --evalue-attc 5, it will run OK. I'm not sure why the e-value is different, though.
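
The failure mode in the traceback can be pictured in miniature without pandas. The sketch below is a simplified analogue, not Integron_Finder's code: the element records, the `first_attc_start` helper, and the e-values are all hypothetical. It shows how filtering out an attC site whose e-value exceeds the threshold leaves an empty selection, so taking its first element would raise an IndexError, and how a guard avoids that.

```python
# Hypothetical, simplified stand-in for the table of integron elements.
elements = [
    {"type_elt": "protein", "pos_beg": 100, "pos_end": 1100, "evalue": 1e-30},
    {"type_elt": "attC", "pos_beg": 1500, "pos_end": 1560, "evalue": 2.5},
]


def first_attc_start(elements, evalue_attc):
    """Return the start of the first attC site passing the threshold, or None."""
    attc_sites = [
        e for e in elements
        if e["type_elt"] == "attC" and e["evalue"] <= evalue_attc
    ]
    if not attc_sites:
        # Without this guard, attc_sites[0] raises IndexError on an empty list.
        return None
    return attc_sites[0]["pos_beg"]


strict = first_attc_start(elements, evalue_attc=1)   # attC site filtered out
relaxed = first_attc_start(elements, evalue_attc=5)  # attC site kept
print(strict, relaxed)
```

With the stricter threshold the selection is empty and the helper returns `None` instead of failing; with the relaxed threshold the site is kept, which mirrors why raising the threshold lets the run complete.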

That being said, we should make sure that attC sites found in the first pass still appear in the final results, even if they are no longer found afterward.
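
One way to picture that requirement is as a merge of the two passes. The following is a minimal sketch under stated assumptions, not the fix adopted by the project: the `merge_attc_passes` helper and the `(pos_beg, pos_end)` tuples are hypothetical, and sites are matched by exact coordinates for simplicity.

```python
def merge_attc_passes(first_pass, local_max_pass):
    """Return the union of attC sites from both passes, sorted by start.

    Each site is a hypothetical (pos_beg, pos_end) tuple. Sites from the
    first pass are kept even when the local_max pass no longer reports them.
    """
    merged = set(first_pass) | set(local_max_pass)
    return sorted(merged)


first_pass = [(1500, 1560)]
local_max_pass = [(2100, 2165)]  # the first-pass site was not found again
print(merge_attc_passes(first_pass, local_max_pass))
```

A real implementation would probably need to reconcile overlapping or slightly shifted coordinates rather than rely on exact matches.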

jfourquet2 commented 5 years ago

Thanks, I will try with --evalue-attc 5!

jeanrjc commented 5 years ago

Note that setting the evalue above 1 has very little effect on sensitivity and false positive rate: https://integronfinder.readthedocs.io/en/latest/user_guide/tutorial.html#attc-evalue

jeanrjc commented 3 years ago

closing in favor of #84