nanoporetech / pychopper

A tool to identify, orient, trim and rescue full length cDNA reads

getting strand information for 'unclassified' reads #19

Closed michieitel closed 5 years ago

michieitel commented 5 years ago

Hi,

Is it possible to get strand information for PCR-cDNA ONT reads that have just one of the adapters? They are not full-length, but they might still be useful for gene annotation in the pinfish pipeline. I have these stats from pychopper: +: 6539750, -: 6571674, unclassified: 5242864

Based on this I basically throw away 1/3 of the reads since they are not full-length transcripts.

I am wondering if there are plans to support those reads with 'reduced' information value, or if you are specifically aiming for full-length transcript reads only?

Any insights are appreciated!

Michael

bsipos commented 5 years ago

Trying to make use of reads that are not detected as full length during (re)annotation is not that useful if the transcripts are already covered by enough full-length reads. If you are interested in counting the transcripts, it is better not to select for full-length reads. I do not plan to implement the feature you described; however, you could figure out the strand from the pychopper scores output (-A).
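A rough sketch of how the strand could be inferred from a single primer hit, in Python. Everything here is an assumption rather than pychopper's actual output handling: the mapping follows what I understand to be the default primer configuration (`+:SSP,-VNP|-:VNP,-SSP`, where a leading `-` marks a reverse-complemented primer), and the two-column `read_id<TAB>primer` input is a hypothetical simplification that you would need to derive from the real -A file yourself. The names `STRAND_BY_PRIMER`, `infer_strand` and `strands_from_table` are made up for illustration.

```python
# Assumed mapping, based on the primer configuration "+:SSP,-VNP|-:VNP,-SSP".
# A leading "-" denotes a hit to the reverse complement of the primer.
STRAND_BY_PRIMER = {"SSP": "+", "-VNP": "+", "VNP": "-", "-SSP": "-"}


def infer_strand(primer_hits):
    """Return '+' or '-' if all recognised hits agree, otherwise None."""
    strands = {STRAND_BY_PRIMER[p] for p in primer_hits if p in STRAND_BY_PRIMER}
    return strands.pop() if len(strands) == 1 else None


def strands_from_table(lines):
    """Parse hypothetical 'read_id<TAB>primer' lines into {read_id: strand}."""
    hits = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        read_id, primer = line.split("\t")[:2]
        hits.setdefault(read_id, []).append(primer)
    return {read_id: infer_strand(primers) for read_id, primers in hits.items()}
```

Reads whose hits are missing or contradict each other come back as `None`, so they stay unclassified instead of being assigned a guessed strand.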

michieitel commented 5 years ago

Thanks!