mcfrith / last-rna

MIT License

sam/bam files describing spliced alignments by N in cigar #3

yuifu closed 1 year ago

yuifu commented 6 years ago

Thank you for sharing the recipe https://github.com/mcfrith/last-rna/blob/master/last-long-reads.md.

I've aligned long RNA reads to a genome and used maf-convert to obtain sam/bam files. The sam/bam files contain each spliced segment of a read as a separate entry, but I want a bam file where each spliced alignment is described by 'N' in the cigar (one entry per spliced alignment). Is there any option (or future plan) to make maf-convert generate such sam/bam files?

mcfrith commented 6 years ago

Sorry for this slow reply.

I'm afraid maf-convert can't do that at present. Future plan: maybe. Let me mention a couple of points (excuses, really):

Sometimes, exons are not co-linear (e.g. gene fusions in cancer), which cannot be represented as you wish.

It's common that one exon is aligned with high confidence (because it's long and non-repetitive), while another exon of the same RNA is aligned with much lower confidence (e.g. if it aligns to a tandem repeat). By keeping the exons separate, it's easier for them to have separate confidence values, which I think is important.
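To make these two points concrete, here is a minimal Python sketch of what joining would involve. It is not how maf-convert works: `Segment`, `ref_span` and `join_colinear` are hypothetical names invented for illustration. It assumes the per-exon alignments of one read are given in read order (in SAM orientation), with clipping already stripped from each cigar. Segments on different chromosomes or strands, or out of order on the reference, cannot be joined, and the joined record is left with a single confidence value (here the minimum, which is just one possible choice).

```python
import re
from typing import NamedTuple, Optional, Sequence, Tuple


class Segment(NamedTuple):
    """One per-exon alignment in SAM conventions (hypothetical container)."""
    rname: str   # reference sequence name
    strand: str  # "+" or "-"
    pos: int     # 1-based leftmost reference position
    cigar: str   # cigar of this segment only, clipping removed (assumption)
    mapq: int    # mapping quality of this segment


_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
_REF_CONSUMING = set("MDN=X")  # operations that advance along the reference


def ref_span(cigar: str) -> int:
    """Number of reference bases covered by a cigar string."""
    return sum(int(n) for n, op in _CIGAR_OP.findall(cigar) if op in _REF_CONSUMING)


def join_colinear(segments: Sequence[Segment]) -> Optional[Tuple[int, str, int]]:
    """Join segments into one (pos, cigar, mapq), or return None if not co-linear."""
    first = segments[0]
    parts = [first.cigar]
    ref_end = first.pos + ref_span(first.cigar)  # position just past the segment
    for seg in segments[1:]:
        if seg.rname != first.rname or seg.strand != first.strand:
            return None  # e.g. a gene fusion between chromosomes or strands
        gap = seg.pos - ref_end
        if gap < 0:
            return None  # overlapping or out of order on the reference
        if gap > 0:
            parts.append(f"{gap}N")  # the skipped region (intron)
        parts.append(seg.cigar)
        ref_end = seg.pos + ref_span(seg.cigar)
    # One record has one MAPQ, so the per-exon confidences collapse to one value.
    return first.pos, "".join(parts), min(seg.mapq for seg in segments)


# A confident exon followed by a low-confidence one on the same strand.
exons = [Segment("chr1", "+", 100, "50M", 60), Segment("chr1", "+", 1150, "30M", 3)]
print(join_colinear(exons))  # (100, '50M1000N30M', 3)

# A fusion-like read whose second part aligns to another chromosome.
fusion = [Segment("chr1", "+", 100, "50M", 60), Segment("chr2", "+", 500, "30M", 60)]
print(join_colinear(fusion))  # None
```

In the first example the joined record reports a mapping quality of 3, even though the first exon on its own had 60. That lost information is the reason for keeping exons separate.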

mcfrith commented 1 year ago

This is fixed, at least partly, in LAST version 1447: use maf-convert -j.