gem-pasteur / Integron_Finder

Bioinformatics tool to find integrons in bacterial genomes
GNU General Public License v3.0

[BUG] Error in find_attc_max function #117

Closed (Ales-ibt closed 3 months ago)

Ales-ibt commented 4 months ago

Hello there,

Thank you for developing IntegronFinder.

I am updating to version 2.0.3 to check whether some issues related to negative coordinates, and other bugs we have been experiencing, are solved. Unfortunately, negative coordinates are still reported (contig_116 in the attached example.fasta.gz). However, what I'd like to report here is an error we get with contig_11 in the same attached FASTA file. The error with version 2.0.2 is the following:

Traceback (most recent call last):
  File "/opt/miniconda/bin/integron_finder", line 8, in <module>
    sys.exit(main())
  File "/opt/miniconda/lib/python3.9/site-packages/integron_finder/scripts/finder.py", line 648, in main
    integron_res, summary = find_integron_in_one_replicon(replicon, config)
  File "/opt/miniconda/lib/python3.9/site-packages/integron_finder/scripts/finder.py", line 385, in find_integron_in_one_replicon
    integrons = find_integron(replicon, protein_db, integron_max, intI_file, phageI_file, config)
  File "/opt/miniconda/lib/python3.9/site-packages/integron_finder/integron.py", line 150, in find_integron
    attc_left = np.array([i_attc.pos_beg.values[0] for i_attc in attc_ac])
  File "/opt/miniconda/lib/python3.9/site-packages/integron_finder/integron.py", line 150, in <listcomp>
    attc_left = np.array([i_attc.pos_beg.values[0] for i_attc in attc_ac])
IndexError: index 0 is out of bounds for axis 0 with size 0
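The IndexError above is what NumPy raises when `.values[0]` is read from a column with no rows. The sketch below reproduces that failure mode on a toy DataFrame and shows one possible guard. It is only an illustration: `leftmost_starts` and the toy frames are hypothetical, and this is not necessarily how Integron_Finder itself handles the case.

```python
import numpy as np
import pandas as pd


def leftmost_starts(attc_frames):
    """Collect the first pos_beg of each frame, skipping frames with no rows.

    Hypothetical helper for illustration; not part of integron_finder.
    """
    return np.array([df.pos_beg.values[0] for df in attc_frames if not df.empty])


# Toy data (made up): one frame with two hits, one frame with none.
non_empty = pd.DataFrame({"pos_beg": [120, 480], "pos_end": [180, 540]})
empty = pd.DataFrame({"pos_beg": [], "pos_end": []})

# Unguarded access on the empty frame raises the same kind of IndexError
# as in the traceback.
try:
    empty.pos_beg.values[0]
    raised = False
except IndexError:
    raised = True

# With the emptiness check, the empty frame is simply skipped.
starts = leftmost_starts([non_empty, empty])
```

Whether skipping empty frames is the right behaviour depends on what the downstream clustering expects, so treat the guard as a diagnostic aid rather than a fix.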

In version 2.0.3 it becomes:

Traceback (most recent call last):
  File "/usr/local/bin/integron_finder", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/lib/python3.12/site-packages/integron_finder/scripts/finder.py", line 651, in main
    integron_res, summary = find_integron_in_one_replicon(replicon, config)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/integron_finder/scripts/finder.py", line 366, in find_integron_in_one_replicon
    integron_max = find_attc_max(integrons, replicon, config.distance_threshold,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/integron_finder/attc.py", line 394, in find_attc_max
    to_concat = [df for df in [max_elt, df_max] if not df.empty]
                                        ^^^^^^

This is the command I am running in Linux:

$ singularity exec gempasteur-integron_finder:2.0.3.sif integron_finder --union-integrases --local-max --cpu 8 --gbk miniexample.fasta

Integron_Finder Version: integron_finder version 2.0.3 Using:

Thanks in advance!

jeanrjc commented 4 months ago

Hello, can you check whether the branch fix_find_attc_max solves your problem?

Ales-ibt commented 4 months ago

Hello!

Thank you for your quick response.

With the branch fix_find_attc_max, the pipeline runs fine on contig_11, but it still reports negative coordinates for contig_116.

KateSakharova commented 4 months ago

@jeanrjc Could you merge that fix into master and include it in a new release, please? We (MGnify) are using your tool in our pipeline, which @Ales-ibt is supporting. We had to build a container from that branch to get the fix into the pipeline. It would be great if we could replace it with a proper working release.

Thank you! Best, Kate (MGnify developer)

Ales-ibt commented 4 months ago

In addition, we are currently ignoring predictions with negative coordinates. It would be great to have that fix as well in the new release :D
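The interim workaround mentioned here (ignoring predictions with negative coordinates) can be sketched as a small post-processing filter. This is a minimal sketch under stated assumptions: that the results are tab-separated with a header row, that comment lines start with `#`, and that the coordinate columns are named `pos_beg` and `pos_end`. The function name and the sample rows are made up for illustration and should be checked against the actual output file.

```python
import csv
import io


def drop_negative_coordinates(lines, begin_col="pos_beg", end_col="pos_end"):
    """Yield rows (as dicts) whose begin and end coordinates are both >= 0.

    Hypothetical helper; column names and file layout are assumptions.
    """
    # Skip blank lines and '#' comment lines before handing rows to csv.
    data_lines = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    reader = csv.DictReader(data_lines, delimiter="\t")
    for row in reader:
        if int(row[begin_col]) >= 0 and int(row[end_col]) >= 0:
            yield row


# Made-up rows for illustration only; not taken from the attached FASTA.
sample = io.StringIO(
    "# made-up rows for illustration only\n"
    "ID_replicon\telement\tpos_beg\tpos_end\n"
    "contig_A\tattc_001\t150\t210\n"
    "contig_B\tattc_001\t-35\t40\n"
)
kept = list(drop_negative_coordinates(sample))
```

A filter like this only hides the affected rows; it does not recover the correct coordinates, so it is best treated as a stopgap until a release with the fix is available.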

bneron commented 3 months ago

Fixed in version 2.0.5.