Closed: holtgrewe closed this issue 8 months ago
Overall, we should either parse the rCRS entry from NucCore or use the Ensembl transcripts. The latter is probably better: with the rCRS entry alone we would have no transcript identifiers and would have to use gene names in their place.
RNA genes from chrMT are not properly represented in CDOT, see https://github.com/SACGF/cdot/issues/72
The following chrMT transcripts have a CDS whose length is not a multiple of 3:
ENST00000361789.2
ENST00000361453.3
ENST00000361227.2
ENST00000361381.2
ENST00000361390.2
ENST00000362079.2
Poly-A is appended to these transcripts, which completes the otherwise truncated stop codon. We can emulate this by padding the transcript sequence with A's and adjusting the CDS end on the fly.
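A minimal sketch of that padding step, in Python for illustration only (this is not Mehari's actual code). The function name `pad_cds_to_codon_boundary` and the 0-based, half-open CDS coordinates are assumptions made for this example. It also assumes the incomplete CDS ends exactly at the transcript end, so that appended A's directly extend the last codon.

```python
from typing import Tuple


def pad_cds_to_codon_boundary(seq: str, cds_start: int, cds_end: int) -> Tuple[str, int]:
    """Emulate poly-A completion of a truncated stop codon.

    Hypothetical helper: coordinates are 0-based and half-open.
    Returns the (possibly padded) sequence and the adjusted CDS end.
    """
    cds_len = cds_end - cds_start
    missing = (-cds_len) % 3  # bases needed to reach a multiple of 3
    if missing == 0:
        return seq, cds_end
    # Assumption: the truncated CDS runs to the very end of the transcript,
    # so the appended A's become part of the final codon.
    if cds_end != len(seq):
        raise ValueError("CDS does not end at the transcript end; cannot pad")
    return seq + "A" * missing, cds_end + missing


# A CDS of length 7 ending in a lone "T" is padded to length 9 ("TAA" stop).
print(pad_cds_to_codon_boundary("ATGCCCT", 0, 7))  # ('ATGCCCTAA', 9)
```

Returning the adjusted CDS end together with the padded sequence keeps the stored transcript record untouched, so the correction can be applied at lookup time.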
Describe the bug
RefSeq does not have transcripts for mitochondrial genes. Consequently, they are missing from CDOT.
Expected behavior
We need chrMT transcripts in Mehari.