Hello,
could you share your input file (2023_08_17_WW_database_bacteria_integronfinderrecheck.fasta)?
Can you also share the output files you got?
You can find everything here: https://we.tl/t-3ihxgvesfk
Hello, I just checked against CP065039 and there is an integron. Why do you say there isn't one?
From your output:
integron_01 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_1999 2251626 2252225 1 6.6000000000000004e-130 protein AAC_6p_Ib-NCBIFAM NF033074.0 CALIN No NA lin
integron_01 GCF_018972025.2_CP065039.2|kraken:taxid|584 attc_001 2252220 2252291 1 0.00041 attC attC attc_4 CALIN No NA lin
integron_01 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2000 2252311 2253186 1 2.2e-196 protein blaOXA-1_like-NCBIFAM NF000388.2 CALIN No NA lin
integron_01 GCF_018972025.2_CP065039.2|kraken:taxid|584 attc_002 2253206 2253295 1 0.00075 attC attC attc_4 CALIN No 915.0 lin
integron_01 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2001 2253324 2253956 1 6.699999999999999e-135 protein chloram_CatB-NCBIFAM NF000490.1 CALIN No NA lin
integron_01 GCF_018972025.2_CP065039.2|kraken:taxid|584 attc_003 2253951 2254010 1 9.7e-07 attC attC attc_4 CALIN No 656.0 lin
integron_01 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2002 2254041 2254493 1 3.3e-67 protein rifampin_ARR-NCBIFAM NF033144.1 CALIN No NA lin
integron_01 GCF_018972025.2_CP065039.2|kraken:taxid|584 attc_004 2254500 2254613 1 0.27 attC attC attc_4 CALIN No 490.0 lin
integron_01 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2003 2254716 2255063 1 1.5999999999999999e-62 protein SMR_qac_E-NCBIFAM NF000276.2 CALIN No NA lin
integron_02 GCF_018972025.2_CP065039.2|kraken:taxid|584 attc_001 2260351 2260410 1 9.7e-07 attC attC attc_4 CALIN No NA lin
integron_02 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2008 2260441 2260893 1 3.3e-67 protein rifampin_ARR-NCBIFAM NF033144.1 CALIN No NA lin
integron_02 GCF_018972025.2_CP065039.2|kraken:taxid|584 attc_002 2260900 2261013 1 0.27 attC attC attc_4 CALIN No 490.0 lin
integron_02 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2009 2261116 2261463 1 1.5999999999999999e-62 protein SMR_qac_E-NCBIFAM NF000276.2 CALIN No NA lin
integron_03 GCF_018972025.2_CP065039.2|kraken:taxid|584 attc_001 2266751 2266810 1 9.7e-07 attC attC attc_4 CALIN No NA lin
integron_03 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2014 2266841 2267293 1 3.3e-67 protein rifampin_ARR-NCBIFAM NF033144.1 CALIN No NA lin
integron_03 GCF_018972025.2_CP065039.2|kraken:taxid|584 attc_002 2267300 2267413 1 0.27 attC attC attc_4 CALIN No 490.0 lin
integron_03 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2015 2267516 2267863 1 1.5999999999999999e-62 protein SMR_qac_E-NCBIFAM NF000276.2 CALIN No NA lin
integron_04 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2023 2274157 2275116 -1 3.1e-23 protein intI intersection_tyr_intI complete No NA lin
integron_04 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2024 2275148 2275747 1 1.4000000000000001e-89 protein trim_DfrA1_like-NCBIFAM NF000330.1 complete No NA lin
integron_04 GCF_018972025.2_CP065039.2|kraken:taxid|584 attc_001 2275742 2275874 1 3.8e-05 attC attC attc_4 complete No NA lin
integron_04 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2025 2275940 2277160 1 1.9999999999999999e-286 protein EreA-NCBIFAM NF000208.1 complete No NA lin
integron_04 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2026 2277256 2278035 1 2.4e-152 protein ANT_3pp_I-NCBIFAM NF012157.0 complete No NA lin
integron_04 GCF_018972025.2_CP065039.2|kraken:taxid|584 attc_002 2278037 2278096 1 1.3e-07 attC attC attc_4 complete No 2163.0 lin
integron_04 GCF_018972025.2_CP065039.2|kraken:taxid|584 GCF_018972025.2_CP065039.2|kraken:taxid|584_2027 2278215 2279504 -1 NA protein protein NA complete No NA lin
Integron_04 is a complete one.
What do you call a positive result ?
Indeed, those are the results. Maybe in that case I am missing documentation on the output (I looked for it but couldn't find it) that explains when something is considered a positive result. So far I only counted a result from IntegronFinder as positive if there was "intI" in the annotation column.
I'm specifically looking for class 1 integrons (intI1), so I guess integron_04 is another kind of integron?
It depends on your question. IntegronFinder outputs what it finds. It can find 3 types of elements: complete integrons (intI + attC), In0 (intI only) and CALIN (clusters of attC sites lacking intI). Then there is the question of the class of integron. IF tries to annotate known promoters and attI sites, but we might not be up to date on this. For instance, we have some promoters for type 1 integrons here, or here for attI sites of types 1, 2 and 3.
So if IF does not output any class information, it just means IF has no way to know which class it is (based on the attI or promoter). Note that most integrons do not belong to any of classes 1 to 5, which really cover only a subset of integrons (see the original paper from 2016, figure S2).
Overall, the 4th integron in the table above might be a type 1 integron. The annotation column just tells you what type of gene is on the corresponding line. E.g. gene GCF_018972025.2_CP065039.2|kraken:taxid|584_2023 is annotated as intI, while GCF_018972025.2_CP065039.2|kraken:taxid|584_2024 is annotated as a trim_DfrA1_like protein. The other columns tell you where those genes are, in nucleotide positions.
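To make the reading of the table above concrete, here is a minimal Python sketch that groups the rows by integron and reports the element type and whether an intI line is present. The column names are an assumption based on the layout of the IntegronFinder 2 `.integrons` file, and the rows are split on whitespace because that is how they appear when pasted in this thread (the real file is tab-separated). `parse_rows` and `summarize` are hypothetical helper names, not part of IntegronFinder.

```python
# Column names are assumed from the IntegronFinder 2 ".integrons" layout.
COLUMNS = [
    "ID_integron", "ID_replicon", "element", "pos_beg", "pos_end", "strand",
    "evalue", "type_elt", "annotation", "model", "type", "default",
    "distance_2attC", "considered_topology",
]


def parse_rows(lines):
    """Turn whitespace-separated result lines into dictionaries."""
    rows = []
    for line in lines:
        fields = line.split()
        # Skip comments, the header line, and anything with an unexpected width.
        if line.startswith("#") or len(fields) != len(COLUMNS):
            continue
        if fields[0] == "ID_integron":
            continue
        row = dict(zip(COLUMNS, fields))
        row["pos_beg"] = int(row["pos_beg"])
        row["pos_end"] = int(row["pos_end"])
        rows.append(row)
    return rows


def summarize(rows):
    """One entry per (replicon, integron): type, span, and intI presence."""
    summary = {}
    for row in rows:
        key = (row["ID_replicon"], row["ID_integron"])
        entry = summary.setdefault(
            key,
            {"type": row["type"], "start": row["pos_beg"],
             "end": row["pos_end"], "has_intI": False},
        )
        entry["start"] = min(entry["start"], row["pos_beg"])
        entry["end"] = max(entry["end"], row["pos_end"])
        if row["annotation"] == "intI":
            entry["has_intI"] = True
    return summary


if __name__ == "__main__":
    import sys
    with open(sys.argv[1]) as handle:
        for key, entry in summarize(parse_rows(handle)).items():
            print(key, entry)
```

With this grouping, filtering on `type == "complete"` or on `has_intI` keeps whole integrons rather than single lines. Note that it says nothing about the class of the integrase.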
OK, so that probably also explains the different results for CP053372.1? We are trying to use your tool as a validation for my tool and the PCR assay, to confirm that the class 1 integrase gene (intI1) is really there. How should we use your tool to accomplish this?
Also, what is the difference between a complete integron, In0 and CALIN? Do you only count complete integrons as really being there or ... ?
Thank you!
Well I don't know your tool. IF can tell you whether there is an integron (complete, In0 or CALIN) in the sequence or not, and where this integron is. So based on this you can compare the position of the integron you're detecting with what's detected by IF at the same position.
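The position comparison suggested here could look like the following sketch. It only checks whether a hit from another tool overlaps an integron reported by IF on the same replicon. The integron coordinates are taken from the table above; the amplicon coordinates, the shortened replicon name, and the function names are made up for illustration.

```python
def overlap_length(a_start, a_end, b_start, b_end):
    """Number of shared positions between two inclusive intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def confirm_hits(hits, integrons):
    """For each hit, list the IF integrons it overlaps on the same replicon."""
    results = []
    for hit in hits:
        matches = [
            integron["id"]
            for integron in integrons
            if integron["replicon"] == hit["replicon"]
            and overlap_length(hit["start"], hit["end"],
                               integron["start"], integron["end"]) > 0
        ]
        results.append((hit["name"], matches))
    return results


# Integron spans from the table above (replicon name shortened here).
integrons = [
    {"id": "integron_01", "replicon": "CP065039", "start": 2251626, "end": 2255063},
    {"id": "integron_04", "replicon": "CP065039", "start": 2274157, "end": 2279504},
]

# Hypothetical in silico amplicon positions from the other tool.
hits = [
    {"name": "amplicon_1", "replicon": "CP065039", "start": 2274500, "end": 2274700},
    {"name": "amplicon_2", "replicon": "CP065039", "start": 100, "end": 300},
]

for name, matches in confirm_hits(hits, integrons):
    print(name, matches or "no integron at this position")
```

A hit that overlaps a complete integron or an In0 is supported by an intI call from IF; a hit that overlaps only a CALIN, or nothing, is not.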
My tool checks in NCBI sequences whether it can find the primers and probe that should target intI1:
intI1_FW: GCCTTGATGTTACCCGAGAG
intI1_RV: GATCGGTCGAATGCGTGT
intI1_Probe: (6FAM)ATTCCTGGCCGTGGTTCTGGGTTTT(BHQ1)
https://www.sciencedirect.com/science/article/pii/S0043135421009143
So I want to verify with another tool that the positive results I find are indeed intI1. And when I don't have a positive result, I want to verify with the other tool that there is indeed no intI1. So I don't know whether IntegronFinder is the appropriate tool for this? Is there a possibility to find only intI1, or to filter for it afterwards?
So far, IF is not able to tell whether an integron is of class 1 (except in rare cases). I'm not up to date on how people decide whether an integron is class 1 or not, but I think a simple blast against reference sequences of intI1 (with an identity threshold) might be enough to select intI1.
I'm closing this issue as it appears it's not really an integronfinder problem. If you'd like IF to better annotate class 1 integrons, please post another issue as a feature request.
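As a rough sketch of that blast idea, the snippet below filters BLAST tabular output (`-outfmt 6`, with the reference intI1 as query and the genome as subject) by percent identity and by how much of the reference is covered. The thresholds, the reference length, and the example lines are placeholders chosen for illustration, not established cutoffs or real results.

```python
def filter_hits(lines, ref_len, min_identity=98.0, min_coverage=0.9):
    """Keep BLAST outfmt 6 hits passing identity and query-coverage cutoffs.

    Default outfmt 6 columns: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    """
    kept = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        pident = float(fields[2])
        qstart, qend = int(fields[6]), int(fields[7])
        sstart, send = int(fields[8]), int(fields[9])
        # Fraction of the reference gene covered by this alignment.
        coverage = (abs(qend - qstart) + 1) / ref_len
        if pident >= min_identity and coverage >= min_coverage:
            kept.append({
                "subject": fields[1],
                "start": min(sstart, send),  # send < sstart on the minus strand
                "end": max(sstart, send),
                "identity": pident,
                "coverage": round(coverage, 3),
            })
    return kept


# Made-up example lines, for illustration only.
example = [
    "intI1_ref\tcontig_A\t99.5\t1000\t5\t0\t1\t1000\t5000\t5999\t0.0\t1800",
    "intI1_ref\tcontig_B\t82.0\t950\t171\t0\t1\t950\t200\t1149\t1e-90\t700",
    "intI1_ref\tcontig_C\t99.0\t300\t3\t0\t1\t300\t100\t399\t1e-150\t540",
]

for hit in filter_hits(example, ref_len=1000):
    print(hit)
```

The positions kept this way can then be compared with the intI lines reported by IF, so that the class call comes from the blast step and the integron structure from IF.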
Describe your problem
I get some inconsistent results between IntegronFinder and what I expect based on primers and probes that should target integron 1 (in silico). I was thus wondering if you have any ideas why there are discrepancies? These primers and probes that should match with integron 1 are:
intI1_FW: GCCTTGATGTTACCCGAGAG
intI1_RV: GATCGGTCGAATGCGTGT
intI1_Probe: (6FAM)ATTCCTGGCCGTGGTTCTGGGTTTT(BHQ1)
https://www.sciencedirect.com/science/article/pii/S0043135421009143
For the matches between the target sequences and the primer sequences I allow 0.1 distance between the two except in the last 5 nucleotides of the primers.
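One possible reading of this matching rule, sketched in Python: "0.1 distance" is taken to mean that at most 10% of the primer positions may mismatch (Hamming distance, no gaps), and the last 5 nucleotides at the 3' end must match exactly. Both points are assumptions about the rule, and `revcomp` and `primer_sites` are illustrative names, not the actual implementation.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def primer_sites(target, primer, max_frac=0.1, clamp=5):
    """Return (position, mismatches) for every accepted site on one strand.

    Assumed rule: mismatches <= max_frac * len(primer), and the last
    `clamp` bases of the primer (its 3' end) must match exactly.
    """
    target, primer = target.upper(), primer.upper()
    n = len(primer)
    allowed = int(max_frac * n)
    sites = []
    for i in range(len(target) - n + 1):
        window = target[i:i + n]
        if window[-clamp:] != primer[-clamp:]:
            continue  # mismatch in the 3' clamp: reject
        mismatches = sum(a != b for a, b in zip(window, primer))
        if mismatches <= allowed:
            sites.append((i, mismatches))
    return sites


FW = "GCCTTGATGTTACCCGAGAG"
RV = "GATCGGTCGAATGCGTGT"

# Toy target: the forward primer site with one 5' mismatch, then filler,
# then the site the reverse primer anneals to (its reverse complement).
toy = "AAAA" + "GCATTGATGTTACCCGAGAG" + "TTTTTTTT" + revcomp(RV) + "AAAA"

print(primer_sites(toy, FW))           # forward primer on the given strand
print(primer_sites(revcomp(toy), RV))  # reverse primer on the opposite strand
```

The reverse primer is searched on the reverse complement of the target so that the 3' clamp is applied to the correct end of the primer.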
Some examples:
I'm not an expert in AMR and integrons, so it would be great if you have any ideas why I see these differences!
To Reproduce
Steps to reproduce the behavior:
Please complete the following information:
OS:
Integron_Finder Version:
integron_finder --version
integron_finder version 2.0.2
Using:
Python 3.9.16 | packaged by conda-forge | (main, Feb 1 2023, 21:39:03) [GCC 11.3.0]
numpy 1.21.6
pandas 1.3.5
matplolib 3.3.3
biopython 1.78
Prodigal V2.6.3: February, 2016
INFERNAL 1.1.4 (Dec 2020)
HMMER 3.3.2 (Nov 2020); http://hmmer.org/
Authors:
Citation:
Néron, B.; Littner, E.; Haudiquet, M.; Perrin, A.; Cury, J.; Rocha, E.P.C. IntegronFinder 2.0: Identification and Analysis of Integrons across Bacteria, with a Focus on Antibiotic Resistance in Klebsiella. Microorganisms 2022, 10, 700. https://doi.org/10.3390/microorganisms10040700
If you use --func-annot in conjunction with file NCBIfam-AMRFinder.hmm please also cite
Haft, DH et al., Nucleic Acids Res. 2018 Jan 4;46(D1):D851-D860 PMID: 29112715
Thank you! Laura