xiezhq / ISEScan

A python pipeline to identify IS (Insertion Sequence) elements in genome and metagenome
Apache License 2.0

Question: IS associated with antibiotic resistance genes (ARGs) #50

Closed: wanjinhu closed this issue 1 year ago

wanjinhu commented 1 year ago

Hi,

ISEScan does a really great job. I have a question about IS elements associated with antibiotic resistance genes (ARGs). Here is my specific description:

Purpose: Were the ARGs on the genome introduced by mobile elements?

Process:

  1. Assemble a complete genome and use Prodigal to predict genes;
  2. Annotate the CDSs against the CARD database to find ARGs;
  3. I want to know whether these ARGs were introduced by mobile elements (e.g. IS);
  4. Use ISEScan to identify IS elements on the same genome.

My understanding is that when an ARG interval overlaps an IS interval, and especially when the IS interval contains the ARG, the ARG may have been brought in by the IS (an IS may be composed of several genes).

If my understanding is right, there is one situation that confuses me:

One of the ARGs is from 208885 to 209739, and ISEScan predicts that 209554 to 211556 is an IS, so the two only partially overlap. How should I interpret this result?
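To make the overlap concrete, here is a minimal sketch that classifies how two intervals relate, using the coordinates quoted above. It assumes 1-based inclusive coordinates on the same sequence; the names `Interval`, `overlap_length`, and `classify` are hypothetical helpers for illustration and are not part of ISEScan.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive


def overlap_length(a: Interval, b: Interval) -> int:
    """Number of positions shared by two inclusive intervals (0 if disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def classify(arg: Interval, is_elem: Interval) -> str:
    """Describe how an ARG interval relates to an IS interval."""
    if overlap_length(arg, is_elem) == 0:
        return "no overlap"
    if is_elem.start <= arg.start and arg.end <= is_elem.end:
        return "ARG inside IS"
    if arg.start <= is_elem.start and is_elem.end <= arg.end:
        return "IS inside ARG"
    return "partial overlap"


# Coordinates from the question above.
arg = Interval(208885, 209739)
is_elem = Interval(209554, 211556)

shared = overlap_length(arg, is_elem)
arg_len = arg.end - arg.start + 1
print(classify(arg, is_elem), shared, arg_len)
```

With these coordinates the two intervals share 186 positions out of an 855 bp ARG, so the ARG is not contained in the predicted IS.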

Or should I instead annotate the CDSs directly against several databases, including an IS database (e.g. ISfinder), since I think most IS elements are genes?

I look forward to your reply, thanks

Wanjin Hu

xiezhq commented 1 year ago

Wanjin,

My comments on your questions:

  1. Most IS elements have only one gene (the transposase). A few carry accessory genes (such as ARGs) alongside the transposase gene, all flanked by TIRs. You can have a look at the information at https://isfinder.biotoul.fr/IS_Infos/1_14.php.

  2. There is another case, the composite transposon, in which a DNA fragment is flanked by two IS elements (usually two copies of the same type of IS element). A composite transposon may contain ARGs.
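The composite transposon case can be screened for with a small script. The following is a rough sketch only: it flags ARGs that lie between two IS copies carrying the same label within a maximum span. The `Feature` type, the `flanked_args` function, the `name` field (standing in for whatever IS family or type label you have), the 20 kb `max_span` default, and the toy coordinates are all assumptions made for illustration, not ISEScan output or a validated criterion.

```python
from typing import List, NamedTuple, Tuple


class Feature(NamedTuple):
    name: str   # hypothetical label: IS type for IS elements, gene name for ARGs
    start: int  # assumed 1-based, inclusive
    end: int


def flanked_args(
    args: List[Feature],
    is_elements: List[Feature],
    max_span: int = 20000,  # assumed cutoff, adjust to your data
) -> List[Tuple[str, Feature, Feature]]:
    """Return (ARG name, left IS, right IS) where both IS copies share a label."""
    hits = []
    for arg in args:
        lefts = [e for e in is_elements if e.end < arg.start]
        rights = [e for e in is_elements if e.start > arg.end]
        for left in lefts:
            for right in rights:
                span = right.end - left.start + 1
                if left.name == right.name and span <= max_span:
                    hits.append((arg.name, left, right))
    return hits


# Toy data invented for this example.
toy_is = [
    Feature("IS_A", 1000, 1820),
    Feature("IS_A", 6000, 6820),
    Feature("IS_B", 50000, 51200),
]
toy_args = [Feature("arg1", 3000, 3860), Feature("arg2", 30000, 30900)]

for arg_name, left, right in flanked_args(toy_args, toy_is):
    print(arg_name, "flanked by", left.name, left.start, "and", right.start)
```

A hit from such a screen is only a candidate; the flanking copies and the gene boundaries still need to be checked by hand.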

When you found 'One of ARGs is from 208885 to 209739, and ISEScan predict 209554 to 211556 is an IS, they have a piece of overlap sequence', you need to make sure the predicted/annotated gene boundaries are correct. FragGeneScan, which ISEScan uses to predict genes, sometimes reports incorrect gene predictions, e.g. a very long gene.

As for reliable CDS annotation, I don't think there is a perfect CDS annotation tool, but the ISfinder database and the NCBI reference database are good places to try.

Hope this helps.

Xie

wanjinhu commented 1 year ago

Zhiqun,

Thanks for your reply. My understanding is that ISEScan finds DNA fragments that may have come from other genomes and, as you mentioned, most IS elements have only one gene (the transposase). Still, I want to know which genes in my genome may have been brought in by an IS event. So when I finished ISEScan, I got IS fragments, the nucleic acid sequences of the IS copies, and the amino acid sequences of the ORFs, and I annotated the ORFs using different functional databases. Do you think my process is right?

I think I now understand the confusion I mentioned yesterday: I used a different gene prediction tool, Prodigal, which reports all predicted genes in the genome, whereas ISEScan uses FragGeneScan, so the results may differ.

Anyway, thanks a lot.

Wanjin Hu

xiezhq commented 1 year ago

  1. IS elements can transpose not only between genomes but also within a genome.

  2. "so when I finished ISEScan, I got IS fragments, nucleic acid sequence of IS copies and amino acid sequence of ORFs, so I annotate ORF using different functional databases."

I don't understand what you mean. If you already "got .... and amino acid sequence of ORFs", why did you annotate the ORFs again using different functional databases?

wanjinhu commented 1 year ago

Hi,

ISEScan outputs the protein sequences of the ORFs, but it does not say which specific genes they are. I want to see whether these protein sequences match known proteins. For example, NCBI's ORFfinder tool can use BLAST to compare ORFs against the Swiss-Prot database. CARD is an ARG database, so I want to BLAST the ORF proteins against CARD to see whether there are resistance genes on the IS elements I found. As I understand it, not all ORFs are necessarily CDSs.
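Once the ORF proteins have been searched against CARD, the hits still need filtering. The sketch below assumes the search was run separately and saved in BLAST's standard 12-column tabular format (`-outfmt 6`), and keeps the best-scoring hit per ORF above some cutoffs. The function name `best_hits`, the identity and e-value thresholds, and the example rows are all assumptions made up for illustration, not recommended values or real results.

```python
import csv
import io
from typing import Dict

# Default column order of BLAST tabular output (outfmt 6).
BLAST6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(
    tabular_text: str,
    min_identity: float = 80.0,  # assumed cutoff
    max_evalue: float = 1e-10,   # assumed cutoff
) -> Dict[str, str]:
    """Map each query ORF to its highest-bitscore subject passing the cutoffs."""
    best = {}
    reader = csv.DictReader(
        io.StringIO(tabular_text), fieldnames=BLAST6_COLUMNS, delimiter="\t"
    )
    for row in reader:
        if float(row["pident"]) < min_identity:
            continue
        if float(row["evalue"]) > max_evalue:
            continue
        bits = float(row["bitscore"])
        query = row["qseqid"]
        if query not in best or bits > best[query][1]:
            best[query] = (row["sseqid"], bits)
    return {query: subject for query, (subject, _) in best.items()}


# Invented rows: ORF and database entry names are placeholders.
example = "\n".join(
    "\t".join(fields)
    for fields in [
        ["orf1", "card_entry_A", "95.0", "280", "14", "0", "1", "280", "1", "280", "1e-150", "520"],
        ["orf1", "card_entry_B", "85.0", "270", "40", "1", "1", "270", "5", "274", "1e-90", "400"],
        ["orf2", "card_entry_C", "35.0", "100", "65", "2", "1", "100", "1", "100", "1e-5", "50"],
    ]
)

print(best_hits(example))
```

ORFs that survive the filter can then be mapped back to the IS coordinates reported by ISEScan to see which IS elements carry a candidate resistance gene.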

Wanjin Hu