Using cDNA reads to polish exons of a genome

nanoporetech / medaka

Sequence correction provided by ONT Research

Hi! I have accurate long cDNA reads that I want to use as input for genome polishing, expecting, of course, that they will only correct exons in the draft reference. Is medaka an appropriate tool for this task?

After reading the minimap2 documentation and looking at its output, I found that in spliced cDNA-to-reference alignments, regions skipped on the reference are marked with the "N" operation in the CIGAR string. For mRNA-to-genome alignments, an "N" operation is taken to represent an intron.
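To make the distinction concrete, here is a minimal sketch in plain Python of how the two kinds of reference gaps can be told apart from a CIGAR string alone. The function name `reference_gaps`, the example CIGAR, and the start coordinate are made up for illustration; this is not medaka's or minimap2's code, only a reading of the CIGAR operations ("N" = skipped region, "D" = deletion).

```python
import re

# One CIGAR operation: a length followed by an operation character.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_gaps(cigar, ref_start=0):
    """Return (introns, deletions) as lists of half-open (start, end)
    reference intervals. Hypothetical helper, for illustration only."""
    introns, deletions = [], []
    pos = ref_start
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            # Skipped region on the reference: an intron in a spliced alignment.
            introns.append((pos, pos + length))
        elif op == "D":
            # Bases present in the reference but absent from the read.
            deletions.append((pos, pos + length))
        if op in "MDN=X":
            # Only these operations consume reference positions.
            pos += length
    return introns, deletions


# Made-up alignment: a 2 bp deletion inside an exon, then a 1200 bp intron.
introns, deletions = reference_gaps("50M2D30M1200N80M", ref_start=100)
print("introns:", introns)
print("deletions:", deletions)
```

In this sketch the 1200 bp "N" block is an intron that should be left alone, while the 2 bp "D" block is the kind of candidate error in the draft that I would want polished.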

So I want to ask whether medaka takes this kind of alignment into account to avoid merging contiguous exons; in other words, whether it distinguishes between real introns, which should not be removed, and insertions in the draft's exons, which should be polished.

Using cDNA reads to polish exons of a genome #421