igvteam / igv

Integrative Genomics Viewer. Fast, efficient, scalable visualization tool for genomics data and annotations
https://igv.org
MIT License
647 stars 386 forks

Show Only Reads Overlapping Regions #753

Open DarioS opened 4 years ago

DarioS commented 4 years ago

It would be powerful to restrict the alignments shown in the alignment track to only those overlapping the regions defined by the user in Region Navigator.

An example use case is two SNVs in a tumour suppressor gene 160 bases apart. Each SNV could be made into a single-base region with Region Navigator. With paired-end reads of 150 bases each, it's easy to visually phase these SNVs as being either on the same copy of the chromosome or on different copies. Not many read pairs hit both SNVs, so being able to show only those that do could make for a nice supplementary material plot, if it was easy to do. For the gene in my cancer patient, the two SNVs are mutually exclusive within a read pair, which supports the two-hit model of tumour suppressor gene inactivation.
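To make the request concrete, here is a minimal sketch of the filter described above: keep only read pairs in which every region of interest is covered by at least one mate. It is not IGV code. The `Read` tuple, the `pairs_covering_all` function, and the coordinates are all hypothetical and chosen for illustration only.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple


class Read(NamedTuple):
    """Hypothetical minimal alignment record (not an IGV or htsjdk type)."""
    name: str   # template name shared by both mates of a pair
    start: int  # 0-based inclusive alignment start
    end: int    # 0-based exclusive alignment end


def pairs_covering_all(reads: Iterable[Read], positions: Iterable[int]) -> List[str]:
    """Return names of read pairs in which every position is covered by some mate."""
    positions = list(positions)

    # Group mates by template name so a pair is judged as a unit.
    by_name = defaultdict(list)
    for read in reads:
        by_name[read.name].append(read)

    keep = []
    for name, mates in by_name.items():
        if all(any(m.start <= p < m.end for m in mates) for p in positions):
            keep.append(name)
    return keep


# Two single-base regions 160 bases apart, and three 150-base read pairs.
snvs = [1000, 1160]
reads = [
    Read("pairA", 900, 1050), Read("pairA", 1100, 1250),   # covers both SNVs
    Read("pairB", 950, 1100), Read("pairB", 1200, 1350),   # misses position 1160
    Read("pairC", 1020, 1170), Read("pairC", 1180, 1330),  # misses position 1000
]
print(pairs_covering_all(reads, snvs))
```

Only `pairA` survives the filter, which matches the observation that few read pairs hit both SNVs.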

jrobinso commented 4 years ago

@DarioS interesting idea. How do you imagine specifying this? It's a bit esoteric, so I probably wouldn't want to add it to the already very long alignment track menu.

DarioS commented 4 years ago

I imagine it would be a low priority, perhaps one or two after 3.0, but perhaps a bit higher for other people who have PacBio Circular Consensus Sequencing data and can phase across a much larger distance than the more popular Illumina TruSeq.

Because it's something you'd want to turn on and off for different regions rather than keep constant for an entire viewing session, I don't imagine it being in the Alignments tab of Preferences, but in the right-click context menu of a BAM track (since it would likely only be relevant to a very small proportion of samples rather than all of them). There's already Group Alignments By ... and Sort Alignments By ..., so perhaps there could be Filter Alignments By ... with a choice such as Overlapping User Regions (which would need to be able to be linked to others somehow) and, now that I think about it, Only Chimeric (STAR's user manual states that --chimOutType SeparateSAMold "will be deprecated in the future, and the --chimOutType WithinBAM is strongly recommended." which makes fusion visualisation hard and cluttered if the fusion is at low frequency compared to the non-chimeric arrangement).
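As a rough illustration of what an "Only Chimeric" filter might test, the sketch below flags a SAM record as chimeric if it has the supplementary flag bit (0x800) set or carries an `SA:Z:` tag, both of which are defined in the SAM specification. It assumes the aligner marks chimeric alignments this way when writing them into the main BAM. The function name `is_chimeric` and the example records are made up for illustration and are not part of IGV or STAR.

```python
SUPPLEMENTARY = 0x800  # SAM flag bit: supplementary alignment


def is_chimeric(sam_line: str) -> bool:
    """Heuristic: a record is chimeric if it is supplementary or has an SA tag."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # column 2 of a SAM record is the bitwise FLAG
    # Optional tags start after the 11 mandatory SAM columns.
    has_sa_tag = any(tag.startswith("SA:Z:") for tag in fields[11:])
    return bool(flag & SUPPLEMENTARY) or has_sa_tag


# Hypothetical records: an ordinary read, a representative chimeric read
# with an SA tag, and its supplementary partner alignment.
ordinary = "\t".join(
    ["r1", "99", "chr1", "1000", "60", "150M", "=", "1160", "310", "*", "*", "NH:i:1"])
representative = "\t".join(
    ["r2", "65", "chr1", "2000", "60", "90M60S", "chr2", "500", "0", "*", "*",
     "SA:Z:chr2,500,+,90S60M,60,0;"])
supplementary = "\t".join(
    ["r2", "2113", "chr2", "500", "60", "90H60M", "chr1", "2000", "0", "*", "*",
     "SA:Z:chr1,2000,+,90M60S,60,0;"])

for record in (ordinary, representative, supplementary):
    print(record.split("\t")[0], is_chimeric(record))
```

Hiding every record for which this returns `False` would leave only the chimeric arrangement on screen.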

jrobinso commented 4 years ago

@DarioS BTW have you noticed the "quick phasing" item in the menu? I thought not; like I said, the menu is too long. Anyway, it is there specifically for long read data. It uses a quick k-means style clustering with the distance metric being the number of SNPs in common. Or something like that, it's been a while since I added it. Anyway, if your example scenario is a real one you have data for, give it a try.
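For readers unfamiliar with the idea, the following is a minimal sketch of a k-means style two-cluster phasing on SNP calls. It is not IGV's implementation, whose details are only loosely recalled above. The read representation (a dict from position to base), the agree-minus-disagree score, and all function names are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List

ReadCalls = Dict[int, str]  # variant position -> base observed in this read


def similarity(read: ReadCalls, centroid: ReadCalls) -> int:
    """Agreeing minus disagreeing bases over positions both cover."""
    shared = read.keys() & centroid.keys()
    agree = sum(1 for pos in shared if read[pos] == centroid[pos])
    return 2 * agree - len(shared)


def consensus(members: List[ReadCalls]) -> ReadCalls:
    """Majority base per position among the reads in a cluster."""
    votes: Dict[int, Counter] = {}
    for read in members:
        for pos, base in read.items():
            votes.setdefault(pos, Counter())[base] += 1
    return {pos: counts.most_common(1)[0][0] for pos, counts in votes.items()}


def phase_two_clusters(reads: List[ReadCalls], max_iter: int = 20) -> List[int]:
    """Assign each read to cluster 0 or 1 by alternating assignment and consensus."""
    if not reads:
        return []
    # Seed with the first read and the read that agrees with it least.
    first = dict(reads[0])
    centroids = [first, dict(min(reads, key=lambda r: similarity(r, first)))]
    labels: List[int] = []
    for _ in range(max_iter):
        new_labels = [
            0 if similarity(r, centroids[0]) >= similarity(r, centroids[1]) else 1
            for r in reads
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for k in (0, 1):
            members = [r for r, lab in zip(reads, labels) if lab == k]
            if members:
                centroids[k] = consensus(members)
    return labels


# Hypothetical calls at two SNP sites; some reads cover only one site.
example_reads = [
    {100: "A", 260: "G"},
    {100: "A", 260: "G"},
    {100: "T", 260: "C"},
    {100: "T"},
    {260: "C"},
]
print(phase_two_clusters(example_reads))
```

Like the feature as described in this thread, the sketch considers substitutions only; insertions and deletions would need their own representation.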

DarioS commented 4 years ago

I looked carefully for it but I can't see it anywhere on the Alignments or Third Gen tabs.

jrobinso commented 4 years ago

@DarioS it's not a preference, it's a menu item (right-click popup). However, I just looked at it and there are some issues. I'll look into it.

[Image: Screen Shot 2020-03-07 at 1 03 29 PM]

jrobinso commented 4 years ago

@DarioS it's working as designed, but I just pushed a change with more user feedback when things go wrong. The clustering works strictly on SNPs at the moment; insertions and deletions are ignored. This will be fixed.

DarioS commented 4 years ago

Great, I'll leave this open until indels are incorporated into the phasing process.