gem-pasteur / Integron_Finder

Bioinformatics tool to find integrons in bacterial genomes
GNU General Public License v3.0

[HELP] Question: are gene cassettes of an integron all on the same strand? #102

Closed: jiarong closed this issue 2 years ago

jiarong commented 2 years ago

Hi, I noticed there are some gene cassettes of the same integron (complete) on different strands in Integron_Finder (v2.0.2) results. This should not happen, right? Thanks!

jeanrjc commented 2 years ago

Hello, can you show an example and share the sequence that led to this? Thanks

jiarong commented 2 years ago

Thanks for the quick response. I have an example, with the sequence, in the attached file: example.if2.txt

jeanrjc commented 2 years ago

Hello,

Running integron_finder --local-max example.if2.txt

leads to:

ID_integron ID_replicon element pos_beg pos_end strand  evalue  type_elt    annotation  model   type    default distance_2attC  considered_topology
integron_01 SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1  SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1_2    1460    2782    -1  1.8e-25 protein intI    intersection_tyr_intI   complete    No  NA  lin
integron_01 SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1  SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1_3    3019    3447    1   NA  protein protein NA  complete    No  NA  lin
integron_01 SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1  attc_001    3396    3522    1   9.2e-06 attC    attC    attc_4  complete    No  NA  lin
integron_01 SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1  attc_002    3933    4058    1   9.3e-07 attC    attC    attc_4  complete    No  411.0   lin
integron_01 SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1  SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1_4    3992    4570    -1  NA  protein protein NA  complete    No  NA  lin
integron_01 SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1  attc_003    4580    4705    1   9.6e-07 attC    attC    attc_4  complete    No  522.0   lin
integron_01 SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1  SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1_5    4633    5097    -1  NA  protein protein NA  complete    No  NA  lin
integron_01 SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1  attc_004    5133    5257    1   8.9e-09 attC    attC    attc_4  complete    No  428.0   lin
integron_01 SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1  SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1_6    5303    5671    1   NA  protein protein NA  complete    No  NA  lin
integron_01 SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1  attc_005    5682    5780    1   8.2e-06 attC    attC    attc_4  complete    No  425.0   lin
integron_01 SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1  SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1_7    5791    6246    1   NA  protein protein NA  complete    No  NA  lin
integron_01 SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1  attc_006    6263    6390    1   3.7e-05 attC    attC    attc_4  complete    No  483.0   lin
integron_01 SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1  SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1_8    6360    6782    1   NA  protein protein NA  complete    No  NA  lin
integron_01 SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1  attc_007    6790    6923    1   3.2e-07 attC    attC    attc_4  complete    No  400.0   lin
integron_01 SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1  SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1_9    6938    7285    1   NA  protein protein NA  complete    No  NA  lin
integron_01 SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1  attc_008    7234    7359    1   1.5e-07 attC    attC    attc_4  complete    No  311.0   lin
integron_01 SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1  SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1_10   7363    7692    1   NA  protein protein NA  complete    No  NA  lin
integron_01 SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1  attc_009    7700    7798    1   4.2e-06 attC    attC    attc_4  complete    No  341.0   lin
integron_01 SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1  SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1_11   7756    7968    -1  NA  protein protein NA  complete    No  NA  lin
integron_01 SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1  SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1_12   8072    8329    -1  NA  protein protein NA  complete    No  NA  lin
integron_01 SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1  attc_010    8264    8362    1   1.3e-06 attC    attC    attc_4  complete    No  466.0   lin
integron_01 SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1  SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1_13   8376    8807    1   NA  protein protein NA  complete    No  NA  lin
integron_01 SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1  attc_011    8809    8910    1   0.11    attC    attC    attc_4  complete    No  447.0   lin

All attC sites are on the same strand (1), opposite to the intI strand (-1). Some CDSs between two attC sites can be on the other strand; that's not a problem.
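To run the same check on other results, here is a minimal Python sketch. It is not part of Integron_Finder. It assumes the table is tab-separated with the column names shown above and that any comment lines start with "#". The function names (strand_summary, attc_consistent) and the excerpted rows are only for illustration.

```python
import csv
import io
from collections import defaultdict

HEADER = ["ID_integron", "ID_replicon", "element", "pos_beg", "pos_end",
          "strand", "evalue", "type_elt", "annotation", "model", "type",
          "default", "distance_2attC", "considered_topology"]

REP = "SS.fna.0_711E1DmetaG_2_FD_JGI_scaffold_3407_c1"

# A few rows copied from the output above (not the full table).
ROWS = [
    ["integron_01", REP, REP + "_2", "1460", "2782", "-1", "1.8e-25",
     "protein", "intI", "intersection_tyr_intI", "complete", "No", "NA", "lin"],
    ["integron_01", REP, REP + "_3", "3019", "3447", "1", "NA",
     "protein", "protein", "NA", "complete", "No", "NA", "lin"],
    ["integron_01", REP, "attc_001", "3396", "3522", "1", "9.2e-06",
     "attC", "attC", "attc_4", "complete", "No", "NA", "lin"],
    ["integron_01", REP, "attc_002", "3933", "4058", "1", "9.3e-07",
     "attC", "attC", "attc_4", "complete", "No", "411.0", "lin"],
    ["integron_01", REP, REP + "_4", "3992", "4570", "-1", "NA",
     "protein", "protein", "NA", "complete", "No", "NA", "lin"],
]


def strand_summary(tsv_text):
    """Collect the strands seen for attC sites, intI and other CDSs, per integron."""
    # Assumption: comment lines start with '#'; blank lines are ignored.
    lines = [ln for ln in tsv_text.splitlines()
             if ln.strip() and not ln.startswith("#")]
    reader = csv.DictReader(io.StringIO("\n".join(lines)), delimiter="\t")
    summary = defaultdict(lambda: {"attC": set(), "intI": set(), "cds": set()})
    for row in reader:
        entry = summary[row["ID_integron"]]
        strand = int(row["strand"])
        if row["type_elt"] == "attC":
            entry["attC"].add(strand)
        elif row["annotation"] == "intI":
            entry["intI"].add(strand)
        else:
            entry["cds"].add(strand)
    return dict(summary)


def attc_consistent(entry):
    """True if all attC sites share one strand, opposite to every intI strand.

    If no intI row is present, only the attC strands are checked.
    """
    if len(entry["attC"]) != 1:
        return False
    (attc_strand,) = entry["attC"]
    return all(s == -attc_strand for s in entry["intI"])


if __name__ == "__main__":
    text = "\n".join("\t".join(r) for r in [HEADER] + ROWS)
    for integron, entry in strand_summary(text).items():
        print(integron, entry, attc_consistent(entry))
```

CDS strands are reported but deliberately not checked, since a CDS between two attC sites may sit on either strand.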

jiarong commented 2 years ago

I see. It is normal to have gene cassettes on the other strand. They just do NOT benefit from the Pc promoter.