xiezhq / ISEScan

A python pipeline to identify IS (Insertion Sequence) elements in genome and metagenome
Apache License 2.0

How can we plot/map the many ISs from our ISEScan results onto a reference genome? #43

Closed rickyalfaray closed 2 years ago

rickyalfaray commented 2 years ago

Dear Mr. @xiezhq ,

First of all, I would like to thank you for making this nice tool, which is convenient to use even for a beginner like me. It is really helpful. I have one question: how can we plot/map the many insertion sequences (ISs) from our ISEScan results onto a reference genome? I have already run ISEScan on 2000 WGS of H. pylori. Now I would like to make a graph (any kind of graph is OK) that shows:

  1. where the ISs are located relative to a reference genome (e.g., a strain named HP26695), so I can tell which genes are most often affected by these ISs.
  2. how often they are found at each position/gene (e.g., IS605 found in gene A in 1500 out of 2000 WGS; not specifying the IS type is also OK, so it would just be: gene A has ISs in 1500 WGS, and the other 500 WGS have no IS there).

My plan is for the X-axis to be the position in the genome (0 to 1.5 Mb) and the Y-axis to be the frequency of ISs at each position. I hope to produce something like figure 7 in this paper, which I have attached, but I don't know an easy way to make it since I'm new to this field.

Could you give me some suggestions? Thank you.

Kind regards, Ricky

GanbatteRicky

xiezhq commented 2 years ago

Hi rickyalfaray,

Thanks for your interest in ISEScan. The current version of ISEScan has no feature for producing figures or plots, but you can try mapping the IS elements identified in your genome sequences onto your reference genome sequence.

For your question/task 1

To get the genomic positions of the identified IS elements in the reference genome, you first need to align your genome to the reference genome to obtain the coordinate mapping between the two sequences. With that alignment, you can convert the genomic position of each identified IS element in your genome sequence to the corresponding position in the reference genome sequence. This gives you the positions of the IS elements relative to the reference genome. To make task 2 easier, it would be good to save the mapped positions in a table, e.g. a .csv file.
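As a rough illustration of the coordinate conversion described above, here is a minimal Python sketch. It is not part of ISEScan. It assumes you have already reduced your whole-genome alignment to a list of gap-free blocks of the form (query_start, query_end, ref_start, ref_end, strand); that tuple layout, the function names, and the sample data are all made up for the example. IS elements that fall outside every block are skipped, and indels inside a block are ignored, so treat the result as approximate.

```python
import csv
import io


def map_position(pos, blocks):
    """Map a 1-based position in your genome to the reference, or None if unaligned."""
    for q_start, q_end, r_start, r_end, strand in blocks:
        if q_start <= pos <= q_end:
            offset = pos - q_start
            # Assumption: the block has no gaps, so offsets carry over directly.
            # On the reverse strand, the offset is counted back from the block end.
            return r_start + offset if strand == "+" else r_end - offset
    return None


def map_is_elements(is_elements, blocks):
    """Convert (sample, family, begin, end) records to reference coordinates."""
    rows = []
    for sample, family, begin, end in is_elements:
        a = map_position(begin, blocks)
        b = map_position(end, blocks)
        if a is None or b is None:
            continue  # at least one boundary is not covered by any block
        rows.append({"sample": sample, "family": family,
                     "ref_begin": min(a, b), "ref_end": max(a, b)})
    return rows


def rows_to_csv(rows):
    """Render the mapped rows as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=["sample", "family", "ref_begin", "ref_end"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical data: two alignment blocks and three IS elements from one genome.
blocks = [(1, 1000, 5001, 6000, "+"), (2001, 3000, 100, 1099, "-")]
is_elements = [
    ("strain1", "IS605", 100, 300),
    ("strain1", "IS200", 2100, 2500),
    ("strain1", "IS607", 1500, 1600),  # falls in an unaligned region, skipped
]
print(rows_to_csv(map_is_elements(is_elements, blocks)))
```

Running this once per genome and appending the rows to one file would give a single table covering all genomes. An IS element whose two ends land in different blocks may get a misleading span, so such cases are worth checking by hand.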

For your question/task 2

Based on the mapped positions in the table you obtained in the previous step, you can produce the plots manually or with spreadsheet software such as Microsoft Excel.
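If a spreadsheet becomes unwieldy with thousands of genomes, the counting can also be scripted. The sketch below is only one possible approach, not an ISEScan feature. It assumes each record is a (sample, ref_begin, ref_end) tuple in reference coordinates and that genes are given as (name, start, end) tuples; the function names, the 1000 bp bin size, and the example data are all hypothetical. Each genome is counted at most once per bin or gene, so the result is the number of genomes carrying an IS there. The plotting helper additionally assumes that matplotlib is installed.

```python
from collections import defaultdict


def samples_per_bin(records, bin_size=1000):
    """Count distinct samples with an IS overlapping each fixed-size bin.

    Returns {bin_start_position: number_of_samples}, with 1-based bin starts.
    """
    hits = defaultdict(set)
    for sample, ref_begin, ref_end in records:
        first = (ref_begin - 1) // bin_size
        last = (ref_end - 1) // bin_size
        for b in range(first, last + 1):
            hits[b].add(sample)
    return {b * bin_size + 1: len(s) for b, s in sorted(hits.items())}


def samples_per_gene(records, genes):
    """Count distinct samples with an IS overlapping each gene interval."""
    hits = {name: set() for name, _, _ in genes}
    for sample, ref_begin, ref_end in records:
        for name, g_start, g_end in genes:
            if ref_begin <= g_end and ref_end >= g_start:  # intervals overlap
                hits[name].add(sample)
    return {name: len(s) for name, s in hits.items()}


def plot_bins(counts, bin_size, path):
    """Draw a bar chart of samples per bin (requires matplotlib)."""
    import matplotlib.pyplot as plt

    plt.bar(list(counts.keys()), list(counts.values()),
            width=bin_size, align="edge")
    plt.xlabel("Position in reference genome (bp)")
    plt.ylabel("Number of genomes with an IS")
    plt.savefig(path)
    plt.close()


# Hypothetical mapped IS positions and gene annotations.
records = [("s1", 100, 900), ("s2", 500, 1500), ("s1", 200, 300)]
genes = [("geneA", 1, 600), ("geneB", 1400, 2000)]

bin_counts = samples_per_bin(records, bin_size=1000)
gene_counts = samples_per_gene(records, genes)
print(bin_counts)
print(gene_counts)
```

Calling `plot_bins(bin_counts, 1000, "is_frequency.png")` would then write the position-versus-frequency chart to a file. The gene loop is quadratic in the worst case, which should be fine for a bacterial genome but could be replaced with an interval index for larger inputs.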

Hope this is helpful for your project. Because of the lack of funding support, I don't have much time to add new features to ISEScan in a timely manner. I would be happy to welcome new contributors to ISEScan, if anyone would like to contribute code/features.

Zhiqun Xie

rickyalfaray commented 2 years ago

Thank you so much for your kind reply. I have already finished the analysis following your suggestion. Thank you.