gem-pasteur / Integron_Finder

Bioinformatics tool to find integrons in bacterial genomes
GNU General Public License v3.0

[HELP] In silico IntI1 qPCR assay doesn't match IntegronFinder results #110

Closed LauraVP1994 closed 1 year ago

LauraVP1994 commented 1 year ago

Describe your problem

I get some inconsistent results between IntegronFinder and what I expect in silico from primers and a probe that should target integron 1. Do you have any idea why there are discrepancies? The primers and probe that should match integron 1 are:

intI1_FW: GCCTTGATGTTACCCGAGAG
intI1_RV: GATCGGTCGAATGCGTGT
intI1_Probe: (6FAM)ATTCCTGGCCGTGGTTCTGGGTTTT(BHQ1)

https://www.sciencedirect.com/science/article/pii/S0043135421009143

For matches between the target sequences and the primer sequences, I allow a distance of 0.1 between the two, except in the last 5 nucleotides of the primers.
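One possible reading of this matching rule, written as a minimal Python sketch. It assumes that "0.1 distance" means at most 10% mismatched positions (Hamming distance divided by primer length) and that the last 5 nucleotides at the 3' end must match exactly. Both are assumptions, and the function names are hypothetical; the actual tool may define the distance differently.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def primer_matches(primer: str, site: str, max_dist: float = 0.1, clamp: int = 5) -> bool:
    """Check one candidate site against a primer (both written 5'->3').

    Assumption: distance = mismatched positions / primer length, and the
    last `clamp` nucleotides (3' end) must be identical.
    """
    primer, site = primer.upper(), site.upper()
    if len(primer) != len(site):
        return False
    if primer[-clamp:] != site[-clamp:]:
        return False
    mismatches = sum(a != b for a, b in zip(primer, site))
    return mismatches / len(primer) <= max_dist


def find_primer_sites(primer: str, target: str, max_dist: float = 0.1, clamp: int = 5) -> list:
    """Return 0-based start positions in `target` where the primer matches."""
    n = len(primer)
    return [
        i
        for i in range(len(target) - n + 1)
        if primer_matches(primer, target[i:i + n], max_dist, clamp)
    ]
```

The forward primer would be searched on the target as given, and the reverse primer on `reverse_complement(target)`, since it anneals to the opposite strand.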

Some examples:

I'm not an expert in AMR and integrons, so it would be great if you have any ideas why I see these differences!

To Reproduce

Steps to reproduce the behavior:

  1. The exact command lines you use:

     integron_finder --outdir 2023_08_17_WW_database_bacteria_integronfinderrecheck --cpu 4 --local-max --func-annot --gbk --pdf 2023_08_17_WW_database_bacteria_integronfinderrecheck.fasta

Please complete the following information:

OS:

Integron_Finder Version:

integron_finder --version
integron_finder version 2.0.2
Using:

Authors:

Citation:

Néron, B.; Littner, E.; Haudiquet, M.; Perrin, A.; Cury, J.; Rocha, E.P.C. IntegronFinder 2.0: Identification and Analysis of Integrons across Bacteria, with a Focus on Antibiotic Resistance in Klebsiella. Microorganisms 2022, 10, 700. https://doi.org/10.3390/microorganisms10040700

If you use --func-annot in conjunction with file NCBIfam-AMRFinder.hmm please also cite

Haft, DH et al., Nucleic Acids Res. 2018 Jan 4;46(D1):D851-D860 PMID: 29112715

Thank you! Laura

jeanrjc commented 1 year ago

Hello,

Could you share your input file (2023_08_17_WW_database_bacteria_integronfinderrecheck.fasta)?

jeanrjc commented 1 year ago

Can you also share the output files you got?

LauraVP1994 commented 1 year ago

You can find everything here: https://we.tl/t-3ihxgvesfk

jeanrjc commented 1 year ago

Hello, I just checked CP065039 and there is an integron. Why do you say there isn't?

From your output:

integron_01 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_1999    2251626 2252225 1   6.6000000000000004e-130 protein AAC_6p_Ib-NCBIFAM   NF033074.0  CALIN   No  NA  lin
integron_01 GCF_018972025.2_CP065039.2|kraken:taxid|584 attc_001    2252220 2252291 1   0.00041 attC    attC    attc_4  CALIN   No  NA  lin
integron_01 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2000    2252311 2253186 1   2.2e-196    protein blaOXA-1_like-NCBIFAM   NF000388.2  CALIN   No  NA  lin
integron_01 GCF_018972025.2_CP065039.2|kraken:taxid|584 attc_002    2253206 2253295 1   0.00075 attC    attC    attc_4  CALIN   No  915.0   lin
integron_01 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2001    2253324 2253956 1   6.699999999999999e-135  protein chloram_CatB-NCBIFAM    NF000490.1  CALIN   No  NA  lin
integron_01 GCF_018972025.2_CP065039.2|kraken:taxid|584 attc_003    2253951 2254010 1   9.7e-07 attC    attC    attc_4  CALIN   No  656.0   lin
integron_01 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2002    2254041 2254493 1   3.3e-67 protein rifampin_ARR-NCBIFAM    NF033144.1  CALIN   No  NA  lin
integron_01 GCF_018972025.2_CP065039.2|kraken:taxid|584 attc_004    2254500 2254613 1   0.27    attC    attC    attc_4  CALIN   No  490.0   lin
integron_01 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2003    2254716 2255063 1   1.5999999999999999e-62  protein SMR_qac_E-NCBIFAM   NF000276.2  CALIN   No  NA  lin
integron_02 GCF_018972025.2_CP065039.2|kraken:taxid|584 attc_001    2260351 2260410 1   9.7e-07 attC    attC    attc_4  CALIN   No  NA  lin
integron_02 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2008    2260441 2260893 1   3.3e-67 protein rifampin_ARR-NCBIFAM    NF033144.1  CALIN   No  NA  lin
integron_02 GCF_018972025.2_CP065039.2|kraken:taxid|584 attc_002    2260900 2261013 1   0.27    attC    attC    attc_4  CALIN   No  490.0   lin
integron_02 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2009    2261116 2261463 1   1.5999999999999999e-62  protein SMR_qac_E-NCBIFAM   NF000276.2  CALIN   No  NA  lin
integron_03 GCF_018972025.2_CP065039.2|kraken:taxid|584 attc_001    2266751 2266810 1   9.7e-07 attC    attC    attc_4  CALIN   No  NA  lin
integron_03 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2014    2266841 2267293 1   3.3e-67 protein rifampin_ARR-NCBIFAM    NF033144.1  CALIN   No  NA  lin
integron_03 GCF_018972025.2_CP065039.2|kraken:taxid|584 attc_002    2267300 2267413 1   0.27    attC    attC    attc_4  CALIN   No  490.0   lin
integron_03 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2015    2267516 2267863 1   1.5999999999999999e-62  protein SMR_qac_E-NCBIFAM   NF000276.2  CALIN   No  NA  lin
integron_04 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2023    2274157 2275116 -1  3.1e-23 protein intI    intersection_tyr_intI   complete    No  NA  lin
integron_04 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2024    2275148 2275747 1   1.4000000000000001e-89  protein trim_DfrA1_like-NCBIFAM NF000330.1  complete    No  NA  lin
integron_04 GCF_018972025.2_CP065039.2|kraken:taxid|584 attc_001    2275742 2275874 1   3.8e-05 attC    attC    attc_4  complete    No  NA  lin
integron_04 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2025    2275940 2277160 1   1.9999999999999999e-286 protein EreA-NCBIFAM    NF000208.1  complete    No  NA  lin
integron_04 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2026    2277256 2278035 1   2.4e-152    protein ANT_3pp_I-NCBIFAM   NF012157.0  complete    No  NA  lin
integron_04 GCF_018972025.2_CP065039.2|kraken:taxid|584 attc_002    2278037 2278096 1   1.3e-07 attC    attC    attc_4  complete    No  2163.0  lin
integron_04 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2027    2278215 2279504 -1  NA  protein protein NA  complete    No  NA  lin

Integron_04 is a complete one.

What do you call a positive result?

LauraVP1994 commented 1 year ago

Indeed, those are the results. Maybe I am missing documentation on the output that explains when something is considered a positive result (I looked for it but couldn't find it). So far, I have only counted IntegronFinder results as positive if there was "intI" in the annotation column.

I'm specifically looking for integron 1 (intI1), so I guess integron_04 is another kind of integron?

jeanrjc commented 1 year ago

It depends on your question. IntegronFinder outputs what it finds. It can find 3 types of elements: complete integrons (intI + attC), In0 (only intI) and CALIN (cluster of attC sites lacking intI). Then there is the question of the class of the integron. IF tries to annotate known promoters and attI sites, but we might not be up to date on this. For instance, we have some promoters for type 1 integrons here, and attI sites for types 1, 2 and 3 here.

So if IF does not output any class information, it's just that IF has no way to know which class it is (based on the attI or promoter). Note that most integrons are not from any of the classes 1 to 5, which really cover only a subset of integrons (see original paper from 2016, figure S2).

Overall, the 4th integron in the table above might be a type 1 integron. The annotation column just tells you what type of gene is on the corresponding line. E.g. gene GCF_018972025.2_CP065039.2|kraken:taxid|584_2023 is annotated as intI, while GCF_018972025.2_CP065039.2|kraken:taxid|584_2024 is annotated as a trim_DfrA1_like protein. The other columns tell you where those genes are, in nucleotide positions.
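As an illustration of reading such a table, here is a minimal Python sketch that groups the rows by integron and records the integron type and any intI coordinates. The column positions (integron ID, replicon, element, start, end, strand, e-value, element type, annotation, model, integron type, ...) are inferred from the rows shown above, not taken from the official documentation, and the function name is hypothetical.

```python
def summarize_integrons(lines):
    """Group IntegronFinder result rows by (replicon, integron ID).

    Assumption: each data row has 14 whitespace-separated fields, laid out
    like the rows quoted in this thread.
    """
    summary = {}
    for line in lines:
        fields = line.split()
        # Keep only data rows; headers and comment lines fail this check.
        if len(fields) != 14 or not (fields[3].isdigit() and fields[4].isdigit()):
            continue
        integron_id, replicon = fields[0], fields[1]
        start, end = int(fields[3]), int(fields[4])
        annotation, integron_type = fields[8], fields[10]
        entry = summary.setdefault(
            (replicon, integron_id),
            {"type": integron_type, "start": start, "end": end, "intI": []},
        )
        # Widen the integron span to cover every element seen so far.
        entry["start"] = min(entry["start"], start)
        entry["end"] = max(entry["end"], end)
        if annotation == "intI":
            entry["intI"].append((start, end))
    return summary
```

Entries with a non-empty `intI` list are the ones in which an integrase was annotated (complete integrons or In0); CALIN entries have an empty list. The function accepts any iterable of lines, for example an open file handle on the results table.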

LauraVP1994 commented 1 year ago

OK, so that probably also explains the different results for CP053372.1? We are trying to use your tool to validate my tool and the PCR assay, i.e. to confirm that the integron IntI1 is really there. How should we use your tool to accomplish this?

Also, what is the difference between a complete integron, In0 and CALIN? Do you only count complete integrons as really being there, or ...?

Thank you!

jeanrjc commented 1 year ago

Well, I don't know your tool. IF can tell you whether or not there is an integron (complete, In0 or CALIN) in the sequence, and where this integron is. Based on this, you can compare the position of the integron you're detecting with what IF detects at the same position.
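A position comparison of this kind could be sketched as a simple interval-overlap check in Python. The function names and the data layout (1-based inclusive coordinates on the same replicon) are assumptions made for illustration only.

```python
def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if two closed intervals [a_start, a_end] and [b_start, b_end] overlap."""
    return a_start <= b_end and b_start <= a_end


def integrons_at(hit, integrons):
    """Return the names of the integrons whose span overlaps a hit.

    `hit` is a (start, end) tuple, e.g. an in silico amplicon.
    `integrons` maps a name to its (start, end) span on the same replicon.
    """
    start, end = hit
    return [
        name
        for name, (i_start, i_end) in integrons.items()
        if overlaps(start, end, i_start, i_end)
    ]
```

An empty list for an amplicon means that no element reported by IF covers that position; a non-empty list tells you which element to inspect (and whether it is complete, In0 or CALIN).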

LauraVP1994 commented 1 year ago

My tool checks in NCBI sequences whether it can find the primers and probe that should target IntI1:

intI1_FW: GCCTTGATGTTACCCGAGAG
intI1_RV: GATCGGTCGAATGCGTGT
intI1_Probe: (6FAM)ATTCCTGGCCGTGGTTCTGGGTTTT(BHQ1)

https://www.sciencedirect.com/science/article/pii/S0043135421009143

So I want another tool to confirm that the positive results I find are indeed IntI1. And when I don't have a positive result, I want to verify with the other tool that there is indeed no IntI1. So I don't know whether IntegronFinder is the appropriate tool for this? Is it possible to find only IntI1, or to filter for them afterwards?

jeanrjc commented 1 year ago

So far, IF is not able to tell whether an integron is of class 1 (except in rare cases). I'm not up to date on how people decide whether an integron is class 1 or not, but I think a simple blast against reference sequences of intI1 (with an identity threshold) might be enough to select intI1.
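A minimal Python sketch of such a filter, assuming the BLAST results are in the standard 12-column tabular format (`-outfmt 6`: qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The function name is hypothetical, and the default thresholds are placeholders rather than recommended values.

```python
def filter_inti1_hits(blast_lines, min_identity=98.0, min_length=500):
    """Keep tabular BLAST hits that pass identity and alignment-length cutoffs.

    Assumption: lines come from something like
        blastn -query intI1_ref.fasta -subject genome.fasta -outfmt 6
    where both file names are hypothetical.
    """
    hits = []
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 12:
            continue
        pident, length = float(cols[2]), int(cols[3])
        if pident >= min_identity and length >= min_length:
            sstart, send = int(cols[8]), int(cols[9])
            hits.append({
                "query": cols[0],
                "subject": cols[1],
                "identity": pident,
                "length": length,
                # Hits on the minus strand have sstart > send; normalise them.
                "start": min(sstart, send),
                "end": max(sstart, send),
            })
    return hits
```

The `start` and `end` of each retained hit can then be compared with the intI coordinates reported by IF for the same replicon.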

I'm closing this issue as it appears it's not really an IntegronFinder problem. If you'd like IF to annotate class 1 integrons better, please post another issue as a feature request.