Closed lijing28101 closed 4 years ago
Hi, I have a question about the mikado pick algorithm. When I ran TransDecoder on mikado_prepared.fasta, I noticed that TransDecoder keeps the longest CDS and discards any shorter CDS contained within it. I'm curious whether Mikado would also keep only the long CDS if I manually keep both CDSs in the ORF input. I'm interested in orphan genes, and I have found some that lie within a known gene but are much shorter. If the shorter CDS has a completely different protein homolog from the longer one, I still want to keep it.
Dear @lijing28101
I'm curious whether Mikado would also keep only the long CDS if I manually keep both CDSs in the ORF input. I'm interested in orphan genes, and I have found some that lie within a known gene but are much shorter. If the shorter CDS has a completely different protein homolog from the longer one, I still want to keep it.
Mikado considers every ORF found for a transcript as potentially valid, as long as the ORFs do not overlap each other. For example, suppose a transcript has a long, complete ORF in the middle, a shorter ORF upstream or downstream that does not overlap the first, and a third ORF that overlaps either of them. In general, Mikado will load the first and the second into the transcript, but not the third. The only exception is when the second ORF is very short (by default, shorter than 250 bp), in which case it is not loaded either.
Further details: https://mikado.readthedocs.io/en/latest/Algorithms.html
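To make the rule above concrete, here is a minimal Python sketch of that kind of selection. It is not Mikado's actual code: the `Orf` type, the `select_orfs` function, the choice to rank ORFs by length, and the inclusive coordinate convention are all assumptions made for illustration. Only the "no overlap" condition and the 250 bp default come from the description above.

```python
from typing import List, NamedTuple


class Orf(NamedTuple):
    """Hypothetical ORF record in transcript coordinates (1-based, inclusive)."""
    name: str
    start: int
    end: int

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def overlaps(a: Orf, b: Orf) -> bool:
    """True if the two ORFs share at least one base."""
    return a.start <= b.end and b.start <= a.end


def select_orfs(orfs: List[Orf], min_secondary_len: int = 250) -> List[Orf]:
    """Keep the longest ORF, then any further ORF that is long enough
    and does not overlap an ORF that was already kept.

    Ranking by length is an assumption of this sketch, not a statement
    about how Mikado orders ORFs internally.
    """
    selected: List[Orf] = []
    for orf in sorted(orfs, key=lambda o: o.length, reverse=True):
        if not selected:
            selected.append(orf)  # primary ORF is always kept
            continue
        if orf.length < min_secondary_len:
            continue  # secondary ORF too short
        if any(overlaps(orf, kept) for kept in selected):
            continue  # overlapping (including nested) ORFs are skipped
        selected.append(orf)
    return sorted(selected, key=lambda o: o.start)


# Made-up coordinates mirroring the example in the reply.
example = [
    Orf("long_middle", 500, 1999),       # 1500 bp, primary
    Orf("upstream", 100, 399),           # 300 bp, no overlap
    Orf("overlapping", 1900, 2299),      # 400 bp, overlaps the primary
    Orf("tiny_downstream", 2400, 2519),  # 120 bp, below the threshold
]
print([o.name for o in select_orfs(example)])
```

Under this rule, an ORF nested inside a longer one overlaps it by definition, so it would not be loaded alongside the longer ORF.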
What Mikado does with the ORFs is determined by the mikado pick mode. My understanding is that you probably want to run Mikado in either split or permissive mode, so that each ORF is considered as a separate transcript. See here for further details: https://mikado.readthedocs.io/en/latest/Usage/Configure.html#chimera-splitting
I hope this helps.
Closing for now due to lack of activity.