Closed by bj8th 3 years ago
I've seen this bug come up in some of the PacBio data. Could you list some specific examples?
@jsaquing I do not have one handy right now. To find an example, you can call db_session.query(Transcript).all() and then access transcript.orfs[0] for each transcript. Doing this should eventually hit the bug.
You can also write some unit tests to check the ORF start and stop for + and - strands.
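A rough sketch of what such a unit test might look like. This does not use the project's own models or ORF code: transcript_sequence and find_orf are hypothetical stand-ins (a toy single-exon gene and a naive first-ATG-to-first-in-frame-stop finder), and coordinates are assumed to be 1-based and inclusive. The real test would call the project's ORF logic instead.

```python
import unittest

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def transcript_sequence(genomic: str, strand: str) -> str:
    """Hypothetical helper: 5'->3' transcript sequence of a toy single-exon gene.

    For the - strand this is the reverse complement of the genomic sequence.
    """
    return genomic if strand == "+" else genomic.translate(COMPLEMENT)[::-1]


def find_orf(seq: str):
    """Stand-in ORF finder (not the project's): first ATG to first in-frame stop.

    Returns 1-based inclusive (start, stop), or None if no complete ORF exists.
    """
    start = seq.find("ATG")
    if start == -1:
        return None
    for i in range(start + 3, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return start + 1, i + 3
    return None


class OrfBoundsTest(unittest.TestCase):
    # The same toy transcript, encoded once on each genomic strand.
    CASES = [("+", "CCATGAAATAGGG"), ("-", "CCCTATTTCATGG")]

    def test_orf_within_transcript(self):
        for strand, genomic in self.CASES:
            with self.subTest(strand=strand):
                seq = transcript_sequence(genomic, strand)
                orf = find_orf(seq)
                self.assertIsNotNone(orf)
                start, stop = orf
                # Both ends of the ORF must lie inside the transcript sequence.
                self.assertTrue(1 <= start < stop <= len(seq))
                # The ORF should begin with ATG and end with a stop codon.
                self.assertEqual(seq[start - 1:start + 2], "ATG")
                self.assertIn(seq[stop - 3:stop], STOP_CODONS)
```

Saved as a test module, this can be run with python -m unittest. Using subTest keeps the + and - cases in one test while still reporting each strand separately if one fails.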
Transcripts that exhibit the bug:
ERROR:root:Bad Transcript ENST00000316485.11
ERROR:root:Bad Transcript ENST00000378125.4
ERROR:root:Bad Transcript ENST00000396846.8
ERROR:root:Bad Transcript ENST00000429238.2
ERROR:root:Bad Transcript ENST00000452392.2
ERROR:root:Bad Transcript ENST00000494651.7
ERROR:root:Bad Transcript ENST00000519718.1
ERROR:root:Bad Transcript ENST00000549499.1
ERROR:root:Bad Transcript ENST00000551025.4
ERROR:root:Bad Transcript ENST00000594769.5
ERROR:root:Bad Transcript ENST00000596400.1
ERROR:root:Bad Transcript ENST00000597658.1
ERROR:root:Bad Transcript ENST00000611392.5
ERROR:root:Bad Transcript ENST00000618573.4
ERROR:root:Bad Transcript ENST00000622608.1
ERROR:root:Bad Transcript ENST00000638375.1
ERROR:root:Bad Transcript ENST00000639222.1
ERROR:root:Bad Transcript ENST00000639419.1
ERROR:root:Bad Transcript ENST00000639615.1
ERROR:root:Bad Transcript ENST00000639793.1
ERROR:root:Bad Transcript ENST00000639929.1
ERROR:root:Bad Transcript ENST00000640103.1
ERROR:root:Bad Transcript ENST00000640432.1
ERROR:root:Bad Transcript ENST00000650585.1
ERROR:root:Bad Transcript ENST00000681176.1
In these transcripts, the ORF stop location fell outside the transcript sequence length.
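For anyone wanting to reproduce a list like this, here is a minimal sketch of the kind of check involved. Only transcript.orfs comes from this thread; the attribute names name, sequence, and stop are assumptions about the models, and the ORF stop is assumed to be 1-based and inclusive, so it is valid when it is no larger than the sequence length. The toy objects are made up and are not real transcripts.

```python
import logging
from types import SimpleNamespace


def orf_stop_in_bounds(transcript) -> bool:
    """True if the first ORF ends inside the transcript sequence.

    Assumes hypothetical attributes: transcript.sequence (str) and
    transcript.orfs[0].stop (1-based, inclusive).
    """
    orfs = transcript.orfs
    if not orfs:
        return True  # nothing to check
    return orfs[0].stop <= len(transcript.sequence)


def find_bad_transcripts(transcripts):
    """Log and return the names of transcripts whose first ORF runs past the end."""
    bad = []
    for transcript in transcripts:
        if not orf_stop_in_bounds(transcript):
            logging.error("Bad Transcript %s", transcript.name)
            bad.append(transcript.name)
    return bad


# Toy stand-ins for database rows (made-up names and coordinates).
toy = [
    SimpleNamespace(name="toy_ok", sequence="ATGAAATAG",
                    orfs=[SimpleNamespace(start=1, stop=9)]),
    SimpleNamespace(name="toy_bad", sequence="ATGAAATAG",
                    orfs=[SimpleNamespace(start=1, stop=12)]),
]
bad = find_bad_transcripts(toy)  # flags only "toy_bad"
```

With real data, the iterable passed to find_bad_transcripts could be the result of db_session.query(Transcript).all() mentioned earlier in the thread, provided the attribute names are adjusted to match the actual models.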