RemiAllio / MitoFinder

MitoFinder: efficient automated large-scale extraction of mitogenomic data from high throughput sequencing data

Questions about the circularization check #58

Open Nautilus96 opened 3 months ago

Nautilus96 commented 3 months ago

Hi,

Firstly, I want to thank you for this amazing tool! I've been using it consistently during my PhD, and it has been immensely helpful!

I have a few questions about the circularization check process. I work on ciliates, which have linear mitogenomes capped with telomeres. Recently, I have been using MitoFinder to extract full mitochondrial genomes from multiple WGS assemblies, and in various cases MitoFinder finds evidence for circularization, which is at odds with what one would expect in ciliates. I cannot help but wonder if this is an artifact of how MitoFinder is designed, or if it is indeed a biological result. So I wanted to ask you about how exactly the circularization check works and if it is possible that it could lead to false positives?

Kind regards, Sebastian

RemiAllio commented 2 months ago

Hi Sebastian, thank you for your positive feedback! 🙂

Your result is interesting. I don't know whether it could be a false positive... The check works as follows: MitoFinder BLASTs the two ends of the contig and checks whether these portions are identical. If they are, it suggests that the two ends are the same portion of the genome and, when a circular genome is expected, that the genome is complete and can be circularized. So, if MitoFinder finds evidence for circularisation, it means that the two ends of your contig are identical. This might be because the original sequence is indeed circular, but I can't say for sure...
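To make the idea concrete, here is a minimal Python sketch of the comparison described above. It is not MitoFinder's code: MitoFinder uses BLAST, whereas this sketch substitutes a plain exact string comparison, and the function name `terminal_overlap`, its `min_len`/`max_len` parameters and the toy sequence are all assumptions made up for illustration.

```python
def terminal_overlap(seq, min_len=12, max_len=500):
    """Length of the longest prefix of seq that is identical to its suffix.

    Illustrative only: exact matching stands in for the BLAST comparison,
    and min_len / max_len are hypothetical thresholds.
    """
    seq = seq.upper()
    # An overlap longer than half the contig would overlap itself.
    upper = min(max_len, len(seq) // 2)
    for k in range(upper, min_len - 1, -1):
        if seq[:k] == seq[-k:]:
            return k
    return 0


# Toy 40 bp "genome" (made-up sequence, not real data).
genome = "ATGGCATTCCGAATCGGATTACGCAGTCCGTTGCCTGTCG"

# An assembler that walks once around a circular molecule and keeps going
# produces a contig whose end repeats its start.
circular_like_contig = genome + genome[:15]

print(terminal_overlap(circular_like_contig))  # 15
print(terminal_overlap(genome))                # 0
```

Note that, in this sketch at least, any sequence present in the same orientation at both ends of a contig passes the test, whatever its origin. The check only shows that the ends are identical; it cannot say why they are.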

A good way to test this would be to analyse several individuals of the same species. If circularisation is found every time, even when the contigs don't start at the same position in the sequence, it would strongly suggest that the genome is circular. Alternatively, if you know the telomere sequences, you can check where they are located in your contig. If they sit in the middle of some contigs and circularisation is still found, that might also suggest that the genome is circular...
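The telomere test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `edge` window and the `TTGGGG` motif are hypothetical placeholders (use the telomere sequence known for your species), and it searches for exact matches on both strands rather than running an alignment.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def classify_telomere_hits(contig, motif, edge=50):
    """List (position, strand, location) for every exact motif match.

    location is "terminal" if the match lies within `edge` bp of either
    contig end, otherwise "internal". `edge` is an arbitrary, hypothetical
    cutoff chosen for illustration.
    """
    contig = contig.upper()
    motif = motif.upper()
    queries = {"+": motif}
    rc = reverse_complement(motif)
    if rc != motif:  # avoid reporting palindromic motifs twice
        queries["-"] = rc
    hits = []
    for strand, query in queries.items():
        start = contig.find(query)
        while start != -1:
            end = start + len(query)
            near_edge = start < edge or end > len(contig) - edge
            hits.append((start, strand, "terminal" if near_edge else "internal"))
            start = contig.find(query, start + 1)
    return sorted(hits)


# Toy contig with a placeholder motif buried in the middle (made-up data).
flank = "ACGTACGTAC" * 6
contig = flank + "TTGGGG" + flank
print(classify_telomere_hits(contig, "TTGGGG", edge=20))
```

Hits reported as "internal" in contigs that also pass the circularization check are the cases worth inspecting by hand.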

Hope this helps! Best, Rémi