question about mikado pick

lijing28101 commented 4 years ago

Hi, I have a question about the mikado pick algorithm. When I run TransDecoder on mikado_prepared.fasta, I noticed that TransDecoder keeps the longest CDS and discards a shorter CDS nested within it. If I manually keep both CDSs in the ORF input, will Mikado also keep only the long CDS? I'm interested in orphan genes, and I have found some that lie within a known gene but are much shorter. If the shorter CDS has a completely different protein homolog from the longer one, I would still like to keep it.

lucventurini commented 4 years ago

Dear @lijing28101

> If I manually keep both CDSs in the ORF input, will Mikado also keep only the long CDS? I'm interested in orphan genes, and I have found some that lie within a known gene but are much shorter. If the shorter CDS has a completely different protein homolog from the longer one, I would still like to keep it.

Mikado treats every ORF found for a transcript as potentially valid, as long as the ORFs do not overlap one another. For example, suppose a transcript has a long, complete ORF in the middle, a shorter ORF upstream or downstream that does not overlap the first, and a third ORF that overlaps either of the other two. In general, Mikado will load the first and second ORFs into the transcript, but not the third. The only exception is when the second ORF is very short (by default, shorter than 250 bp), in which case it is not loaded either.

Further details: https://mikado.readthedocs.io/en/latest/Algorithms.html
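
The selection rule described above can be sketched roughly as follows. This is only a toy illustration, not Mikado's actual code: the `Orf` class, the `select_orfs` function, and the `MIN_SECONDARY_ORF_LENGTH` constant are hypothetical names, and processing ORFs from longest to shortest is an assumption made to keep the example simple.

```python
from dataclasses import dataclass

# Threshold taken from the comment above ("shorter than 250 bp").
# The constant name is hypothetical; Mikado's real option may differ.
MIN_SECONDARY_ORF_LENGTH = 250


@dataclass(frozen=True)
class Orf:
    name: str
    start: int  # 1-based, inclusive, in transcript coordinates
    end: int    # 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def overlaps(a: Orf, b: Orf) -> bool:
    """Return True if the two ORFs share at least one base."""
    return a.start <= b.end and b.start <= a.end


def select_orfs(orfs, min_secondary_length=MIN_SECONDARY_ORF_LENGTH):
    """Keep the longest ORF, then every non-overlapping ORF that is long enough."""
    kept = []
    # Assumption: ORFs are considered from longest to shortest.
    for orf in sorted(orfs, key=lambda o: o.length, reverse=True):
        if any(overlaps(orf, other) for other in kept):
            continue  # overlaps an ORF that was already kept
        if kept and orf.length < min_secondary_length:
            continue  # secondary ORF is below the length threshold
        kept.append(orf)
    return sorted(kept, key=lambda o: o.start)


orfs = [
    Orf("primary", 500, 2000),   # long ORF in the middle
    Orf("upstream", 100, 400),   # 301 bp, does not overlap the primary ORF
    Orf("nested", 900, 1200),    # lies inside the primary ORF
    Orf("tiny", 2100, 2200),     # 101 bp, below the threshold
]
print([o.name for o in select_orfs(orfs)])  # ['upstream', 'primary']
```

In this sketch the `nested` ORF is dropped because it overlaps the primary ORF, which is the situation described in the question. Keeping such an ORF would require handling it as a separate record before it reaches this kind of filter.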

What Mikado then does with the ORFs is determined by the mikado pick mode. My understanding is that you probably want to run Mikado in either split or permissive mode, so that each ORF is considered as a separate transcript. See here for further details: https://mikado.readthedocs.io/en/latest/Usage/Configure.html#chimera-splitting

I hope this helps.

lucventurini commented 4 years ago

Closing for now due to lack of activity.

EI-CoreBioinformatics / mikado

question about mikado pick #293