ablab / IsoQuant

Transcript discovery and quantification with long RNA reads (Nanopores and PacBio)
https://ablab.github.io/IsoQuant/

Issue in identifying 3′ end position #92

Open seungbeom-han opened 1 year ago

seungbeom-han commented 1 year ago

Hello,

While comparing my genome alignments with the transcript models, I found that IsoQuant 3.2.0 incorrectly assigns a transcript model to an annotated transcript when only the 3′ end site differs.

One cause I suspect is that only the exon/intron profiles of a transcript model are considered when determining which annotated transcript it corresponds to. I would be grateful if you could tell me whether this is the cause of the problem and whether it can be resolved. https://github.com/ablab/IsoQuant/blob/ec63766073341ac9997657ceb0b1a3ca98003b6a/src/long_read_assigner.py#L439-L453

For your information, here is an IGV screenshot of genome alignments at the RPL10 gene and the transcript models built by IsoQuant. The upper model was built from the entire alignment, and the lower one was built only from alignments mapped to the RPL10 gene region.

Sincerely, Seungbeom Han

[image: IGV screenshot of genome alignments at the RPL10 gene and IsoQuant transcript models]

andrewprzh commented 1 year ago

Dear @seungbeom-han

Yes, there is a problem with transcript ends that is on our TODO list.

When IsoQuant assigns reads to known isoforms, it does account for differences in the 5'/3' ends. However, if a discovered transcript model has an intron chain identical to that of a known transcript, IsoQuant uses the known transcript model, including its ends. The solution would be to correct transcript model ends according to the reads. I'll try to get to this issue as soon as possible.
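To illustrate the behaviour described above, here is a minimal Python sketch. It is not IsoQuant's code: the function names, the exon coordinates, and the 1-based inclusive coordinate convention are all assumptions made for the example. It only shows why two transcripts that differ at the 3′ end look identical when just the intron chains are compared.

```python
from typing import List, Tuple

# Hypothetical representation: an exon is a 1-based inclusive (start, end) pair.
Exon = Tuple[int, int]
Intron = Tuple[int, int]


def intron_chain(exons: List[Exon]) -> List[Intron]:
    """Return the introns lying between consecutive exons."""
    ordered = sorted(exons)
    return [
        (ordered[i][1] + 1, ordered[i + 1][0] - 1)
        for i in range(len(ordered) - 1)
    ]


def same_intron_chain(a: List[Exon], b: List[Exon]) -> bool:
    """True when both transcripts share every splice junction."""
    return intron_chain(a) == intron_chain(b)


def end_differences(a: List[Exon], b: List[Exon]) -> Tuple[int, int]:
    """Offsets of b relative to a at the leftmost and rightmost coordinates."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    return (b_sorted[0][0] - a_sorted[0][0], b_sorted[-1][1] - a_sorted[-1][1])


# Made-up plus-strand example: the model ends 250 bp before the annotation.
known = [(100, 200), (300, 400), (500, 900)]
model = [(100, 200), (300, 400), (500, 650)]

print(same_intron_chain(known, model))  # intron chains match
print(end_differences(known, model))    # but the right-hand end differs
```

A comparison based only on `same_intron_chain` cannot tell these two transcripts apart, so the terminal coordinates would have to be checked separately, for example against the ends of the supporting reads.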

Also, read classification for generated transcript models was fixed in 3.3.0, but the problem with transcript models still exists.

Best Andrey

seungbeom-han commented 1 year ago

Dear @andrewprzh

Thank you for the fast reply. In that case, is the only option available now to collect the read assignments from IsoQuant 3.3.0, look for alternative polyadenylation relative to the generated transcript models, and finalize the models myself?
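As a rough sketch of what such a workaround could look like, the Python snippet below clusters the 3′ end positions of reads assigned to one transcript and reports well-supported sites. Everything here is an assumption for illustration: the function name, the `window` and `min_reads` parameters and their default values, and the input, which is taken to be a plain list of 3′ end coordinates already extracted from the alignments. It does not parse any IsoQuant output file.

```python
from collections import Counter
from typing import List, Tuple


def cluster_three_prime_ends(
    ends: List[int], window: int = 20, min_reads: int = 3
) -> List[Tuple[int, int]]:
    """Group nearby 3' end positions and return (position, read_count) pairs.

    A new cluster starts whenever the gap to the previous sorted position
    exceeds `window`. Clusters with fewer than `min_reads` reads are dropped.
    The reported position is the most frequent coordinate in the cluster.
    """
    clusters: List[List[int]] = []
    current: List[int] = []
    for pos in sorted(ends):
        if current and pos - current[-1] > window:
            clusters.append(current)
            current = []
        current.append(pos)
    if current:
        clusters.append(current)

    sites = []
    for cluster in clusters:
        if len(cluster) >= min_reads:
            representative = Counter(cluster).most_common(1)[0][0]
            sites.append((representative, len(cluster)))
    return sites


# Made-up 3' ends of reads assigned to a single transcript model.
read_ends = [648, 650, 650, 652, 899, 900, 900, 901, 1200]
print(cluster_three_prime_ends(read_ends))
```

For a transcript on the minus strand, the 3′ end of a read is the leftmost aligned coordinate rather than the rightmost, so the positions passed in would need to be chosen accordingly. More than one reported site for a single transcript model would hint at alternative polyadenylation.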

Seungbeom

andrewprzh commented 1 year ago

Dear @seungbeom-han

At the moment, I'd say so. Meanwhile, I'll do some experiments to figure out whether I can fix it quickly. Thanks for bringing this to my attention.

Best Andrey

seungbeom-han commented 1 year ago

Thank you. I'm looking forward to the next update!