Closed MiphaZ closed 2 years ago
Hi @MiphaZ, may I know what a typical entry in the contig column of your nanopolish eventalign.txt looks like? This might be related to an incorrect string split on the annotation results.
Sure.
ENST00000455464.7|ENSG00000237094.12|OTTHUMG00000002857.7|OTTHUMT00000346907.3|ENST00000455464|ENSG00000237094|902|processed_transcript|
Also, is it reasonable to use an old version of m6anet-dataprep together with the new m6anet-run_inference?
Hi @MiphaZ, sorry for the late reply; I was travelling until recently.
Also, I think the PAR_Y comes from your annotation files. The older version of m6Anet splits the contig column on ".", so every transcript of the form ENSTXXX.Y is parsed as ENSTXXX. We have removed this behaviour in the newer version to stay consistent with the annotations used.
It is also reasonable to use the old version of m6anet-dataprep with m6anet-run_inference, but be aware that the older version had a minor bug that excludes a tiny portion of the candidate sites compared to the new version.
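To illustrate the "." split described above, here is a minimal Python sketch. It is not the actual m6Anet code: the helper name `strip_at_first_dot` is hypothetical, and it assumes the old behaviour amounts to keeping only the text before the first "." in the contig string. Under that assumption, the normal and PAR_Y entries from this thread collapse to the same ID.

```python
def strip_at_first_dot(contig: str) -> str:
    """Keep only the text before the first '.' (assumed old-style parsing)."""
    return contig.split(".")[0]


# Contig strings copied from the output posted in this thread.
normal = (
    "ENST00000381192.10|ENSG00000002586.20|OTTHUMG00000021073.12|"
    "OTTHUMT00000055624.3|CD99-205|CD99|1129|protein_coding|"
)
par_y = (
    "ENST00000381192.10_PAR_Y|ENSG00000002586.20_PAR_Y|OTTHUMG00000021073.12|"
    "OTTHUMT00000055624.3|CD99-205|CD99|1129|protein_coding|"
)

old_normal = strip_at_first_dot(normal)
old_par_y = strip_at_first_dot(par_y)

print(old_normal)               # ENST00000381192
print(old_par_y)                # ENST00000381192
print(old_normal == old_par_y)  # True: the two entries become indistinguishable
```

If the old parsing really works this way, an old-style ID such as ENST00000381192 could stand for either the normal or the PAR_Y transcript, which is why the newer version keeps the full contig string.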
That helps a lot, thank you.
Hi Developer!
In your latest version, some transcript IDs appear in both a PAR_Y form and a normal form, but in the old version that information was cut off. So what does the old ID refer to?
Old version:
ENST00000381192 166 29 0.08692147
Latest version:
ENST00000381192.10|ENSG00000002586.20|OTTHUMG00000021073.12|OTTHUMT00000055624.3|CD99-205|CD99|1129|protein_coding| 220 20 0.103975390 TGACT 0.05000000
ENST00000381192.10_PAR_Y|ENSG00000002586.20_PAR_Y|OTTHUMG00000021073.12|OTTHUMT00000055624.3|CD99-205|CD99|1129|protein_coding| 226 20 0.279130700 TGACT 0.10000000