igvteam / igv

Integrative Genomics Viewer. Fast, efficient, scalable visualization tool for genomics data and annotations
https://igv.org
MIT License
645 stars, 387 forks

Provide "Sort alignments by readname" #419

Closed mmokrejs closed 4 years ago

mmokrejs commented 7 years ago

I am comparing several BAM files produced by different aligners from the same data. Surprisingly, I cannot get all tracks sorted by read name. Each aligner ended up with a different number of aligned reads: some reads are missing, and reads that were aligned by multiple aligners still start or end their alignments at different positions. For this reason, I cannot sort using "Start position", etc. Sorting by read name seems best in this situation, ideally followed by an additional round of sorting by the start position of each read in track #1 (so reads in tracks after #1 would appear at the same row position, but most likely with an alignment starting later than in track #1).

jrobinso commented 4 years ago

I realize this ticket is more than 2 years old; apologies, it just got buried by more urgent issues. This is rather complex and esoteric, and I don't think it would work well in practice. Sorting just rearranges rows according to a value at a specific base pair location.

kelgalla commented 4 years ago

I would also find this helpful. I have a bam file that I ran through a realigner for better detection of indels, and now I want to compare the original bam vs. the realigned bam in IGV at certain locations. It is the exact same sample with the same read names. It would be great to have a sort by read name so you can actually see the same order of reads lined up in 2 different tracks, or even in a merged view where I color by original bam file vs. realigned bam file. I was mostly looking for this since we'd like to publish some pictures of the improved alignment (before vs. after) in IGV, but since the reads are not ordered the same way, it is impossible to really show this.

jrobinso commented 4 years ago

@kelgalla As mentioned in the comment above, sorting will only ensure that alignments that intersect the centerline of the view appear in the same order, or more precisely are ordered by readname. Alignments that don't intersect the center will be arbitrarily placed in rows. If you do a re-alignment, reads obviously move; some reads are no longer aligned at all, and vice versa. There is no way you are going to get reads packed in the same order everywhere, since by definition the alignments have changed. So with this in mind, is sorting reads that intersect a specific location still useful?
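To make the behavior described above concrete, here is a minimal Python sketch. It is not IGV's actual code: the `Alignment` class, the `sort_rows_by_read_name` function, and the half-open coordinate convention are all assumptions made for illustration. Rows whose alignment overlaps the center position are ordered by read name; all other rows simply keep whatever order they already had and end up below.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    # Hypothetical minimal record; coordinates are half-open [start, end).
    read_name: str
    start: int
    end: int


def sort_rows_by_read_name(rows, center):
    """Reorder packed rows by the read name of the alignment at `center`.

    `rows` is a list of rows, each row a list of Alignment objects.
    Rows with no alignment overlapping `center` are placed last and keep
    their existing relative order (sorted() is stable).
    """
    def key(row):
        for aln in row:
            if aln.start <= center < aln.end:
                return (0, aln.read_name)
        return (1, "")

    return sorted(rows, key=key)


rows = [
    [Alignment("r3", 0, 50)],
    [Alignment("r9", 200, 250)],                           # does not touch center
    [Alignment("r1", 10, 60), Alignment("r7", 100, 150)],
]
sorted_rows = sort_rows_by_read_name(rows, center=20)
```

Note that only the alignment crossing the center determines a row's position; `r7` travels with `r1` because they share a row, which is why the order away from the center is effectively arbitrary.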

kelgalla commented 4 years ago

Correct, I understand some reads will move around, and I don't want the order the same everywhere, just in that focal area where they intersect the centerline, as you mention. I understand this only makes sense in special circumstances, such as when you are trying to show how your aligner soft-clipped a bunch of areas where it should have called an indel, and then, when you run that bam through another program, it was able to fix all the soft clips and call them as indels. The top track shows reads in that area with indels plus many reads with soft clips; the bottom track shows reads in that area with only indels and no soft clips, but the reads are in a different order, so it is harder to visually see the difference/prove your point. Pic below, if it helps. Likely not all reads would be exactly the same in both views at that location, but it would probably do a decent job for those that are the same. Similarly, if I did a merged view of the two files, gave them different sample names and colors, and sorted by read name, I could see the reads next to each other that did still both map there.

[image: temp]

However, to have all the indels line up nicely, you would need a sort by base (as in the picture currently), then by read name...
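As a rough illustration of what "sort by base, then by read name" would mean, here is a small Python sketch. It is purely hypothetical: the function name and the `(base, read_name)` tuples are assumptions made for this example, not anything IGV provides. Each row is represented by the call it shows at the center position, with `"DEL"` standing in for a deletion.

```python
def sort_by_base_then_name(rows_at_center):
    """Order rows by the base call at the center, breaking ties by read name.

    `rows_at_center` is a list of (base, read_name) tuples, one per row.
    Python compares tuples element by element, so the read name only
    matters among rows that show the same base call.
    """
    return sorted(rows_at_center, key=lambda row: (row[0], row[1]))


rows = [("T", "read5"), ("DEL", "read2"), ("T", "read1"), ("DEL", "read9")]
ordered = sort_by_base_then_name(rows)
```

With this ordering, rows carrying the deletion group together and, within each group, the same read would land at the same relative position in both tracks, provided it is present in both.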

jrobinso commented 4 years ago

There's no support for sorting by 2 attributes yet, but I did push the sort-by-readname option. It will be in the next release, and is available earlier in the "nightly snapshot"; in fact, it should be included there now.

kelgalla commented 4 years ago

Sounds great, looking forward to trying it out!