mlbendall / telescope

Quantification of transposable element expression using RNA-seq
MIT License

Clarification of required BAM sorting method #22

Closed by rhughwhite 1 year ago

rhughwhite commented 3 years ago

Hi there,

Can I clarify this requirement from the README:

"The alignment file must be in SAM or BAM format must be collated so that all alignments for a read pair appear sequentially in the file."

Should BAM files be sorted by read name, i.e. with samtools sort -n?

Will using coordinate-sorted BAMs give incorrect results?

Thanks!

AssumeAssume commented 3 years ago

Hi rhughwhite, I couldn't agree with you more. I have the same question.

mlbendall commented 1 year ago

Updated the README (see 4cf1859); hopefully it is clearer now.

Alignments in the BAM file must be ordered so that all alignments for a read pair appear sequentially in the file - coordinate-sorted BAMs do not work. The default SAM/BAM output for many aligners is in the correct order, or BAM files can be sorted by read name (samtools sort -n). A faster alternative to a full read name sort is samtools collate.
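
For anyone who wants to check an existing file before running, here is a minimal sketch of the ordering requirement described above. It is not Telescope's own validation code; the function name is_collated and the toy records are made up for illustration. It assumes SAM text lines as input, where the read name is the first tab-separated field and header lines start with "@".

```python
def is_collated(sam_lines):
    """Return True if all alignments sharing a read name are contiguous.

    Hypothetical helper for illustration only; not part of Telescope.
    """
    finished = set()   # read names whose block of alignments has already ended
    current = None     # read name of the block currently being scanned
    for line in sam_lines:
        # Skip blank lines and SAM header lines (these start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        qname = line.split("\t", 1)[0]
        if qname == current:
            continue               # still inside the same block
        if qname in finished:
            return False           # this read name already appeared earlier
        if current is not None:
            finished.add(current)  # close the previous block
        current = qname
    return True


# Toy records (made-up values): read name, flag, reference, position.
name_ordered = [
    "@HD\tVN:1.6",
    "r1\t99\tchr1\t100",
    "r1\t147\tchr1\t300",
    "r2\t99\tchr1\t150",
    "r2\t147\tchr1\t350",
]
coordinate_ordered = [
    "@HD\tVN:1.6",
    "r1\t99\tchr1\t100",
    "r2\t99\tchr1\t150",
    "r1\t147\tchr1\t300",
    "r2\t147\tchr1\t350",
]

print(is_collated(name_ordered))        # True
print(is_collated(coordinate_ordered))  # False
```

To try this on a real BAM, one option is to pipe the output of samtools view -h into a script that passes sys.stdin to the function. Note that the sketch keeps every read name in memory, so it is only practical for modest file sizes or a subsample.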