igvteam / igv

Integrative Genomics Viewer. Fast, efficient, scalable visualization tool for genomics data and annotations
https://igv.org
MIT License

sort by aligned length in batch script #1223

Open anoronh4 opened 1 year ago

anoronh4 commented 1 year ago

I am trying to take IGV snapshots in a batch script to look at somatic insertions and deletions, but most of them are buried in high-coverage regions that are difficult to inspect even with a higher maxPanelHeight. Is there any way to sort the reads so that those insertions and deletions are placed towards the top? Or do I have to remove unwanted reads first?

jrobinso commented 1 year ago

Only if you know the base position of interest (that is, the position of the indel). In that case a standard sort-by-base should place indels and SNPs at the top for the position specified.
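For many known indel positions, the sort-by-base approach could be scripted rather than typed by hand. The following is a minimal Python sketch, not part of IGV, that writes out batch lines using only the commands and the `sort BASE <locus>` form that appear in the script later in this thread. The function name `igv_batch_lines`, the file name `sample.bam`, and the snapshot naming scheme are hypothetical choices made for illustration.

```python
def igv_batch_lines(sites, genome, bam, prefix="indel"):
    """Build IGV batch lines that sort by base at each known site.

    sites  -- iterable of (chromosome, 1-based position) tuples
    genome -- genome identifier passed to the `genome` command
    bam    -- path or URL passed to the `load` command
    prefix -- hypothetical prefix for the snapshot file names
    """
    lines = ["new", f"genome {genome}", f"load {bam}"]
    for index, (chrom, pos) in enumerate(sites, start=1):
        # Locus written with thousands separators, as in the thread's script.
        locus = f"{chrom}:{pos:,}"
        lines.append(f"goto {locus}")
        lines.append(f"sort BASE {locus}")
        lines.append(f"snapshot {prefix}_{index}_{chrom}_{pos}.png")
    return lines


if __name__ == "__main__":
    # Positions taken from the example script in this thread;
    # "sample.bam" is a placeholder, not a real file.
    sites = [("chr17", 7579801), ("chr17", 7571487)]
    print("\n".join(igv_batch_lines(sites, "hg19", "sample.bam")))
```

The printed text can be saved to a file and run as a batch script. Each site gets its own `goto`, `sort`, and `snapshot`, so the sort is always applied at the position being photographed.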

jrobinso commented 1 year ago

Aligned length might work, but a deletion will result in a larger aligned length and an insertion in a shorter one, so you couldn't do both at the same time. Also, there is currently no option to reverse the sort (ascending / descending). However, if you want to try this option, the sort key is ALIGNED_READ_LENGTH.
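To see why deletions and insertions move the aligned length in opposite directions, it can help to compute the reference span of a read from its CIGAR string. The sketch below follows the SAM convention that the M, D, N, = and X operations consume reference bases, while I, S, H and P do not. That IGV derives ALIGNED_READ_LENGTH in exactly this way is an assumption, and the CIGAR strings are made-up examples for a 100-base read.

```python
import re

# One CIGAR operation: a length followed by an operation code.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

# Operations that consume reference bases according to the SAM specification.
_REF_CONSUMING = frozenset("MDN=X")


def aligned_length(cigar):
    """Return the number of reference bases spanned by a CIGAR string."""
    return sum(
        int(length)
        for length, op in _CIGAR_OP.findall(cigar)
        if op in _REF_CONSUMING
    )


if __name__ == "__main__":
    # Hypothetical 100-base reads, chosen only to illustrate the effect.
    examples = [
        ("no indel", "100M"),
        ("5-base deletion", "50M5D50M"),
        ("5-base insertion", "50M5I45M"),
        ("10 soft-clipped bases", "10S90M"),
    ]
    for label, cigar in examples:
        print(f"{label:22s} {cigar:10s} {aligned_length(cigar)}")
```

With equal read lengths, the read carrying the deletion spans more reference bases than the plain match and the read carrying the insertion spans fewer, which is why a single sort direction cannot bring both to the top. Soft clipping also shortens the span, so a clipped read without any indel can sort alongside reads with insertions.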

jrobinso commented 1 year ago

I did some experimenting: sorting by ALIGNED_READ_LENGTH works reasonably well for data in which read sequence lengths are constant, or nearly constant. In the process I added a "reverse" option to invert the sort; it is not available now but will be in the next release. In the absence of "reverse", alignments with deletions should be near the top and alignments with insertions near the bottom. This is not perfect, since other factors such as soft clipping can affect aligned read length. It will work much better for Illumina data, where read sequence lengths tend to be constant or nearly so, than for 3rd gen data.

Again, "reverse" is not a valid option for the current release, so delete those lines if you try this script.

new
genome hg19
load gs://genomics-public-data/platinum-genomes/bam/NA12877_S1.bam
# Sort by base
goto chr17:7,579,801
sort BASE chr17:7,579,801
snapshot sortByBase.png
sort BASE chr17:7,579,801 reverse
snapshot sortByBaseReverse.png
sort BASE reverse
snapshot sortByBaseReverse2.png
# Sort by aligned length at a deletion
goto chr17:7,579,651
sort ALIGNED_READ_LENGTH
snapshot sortByAlignedLength.png
sort ALIGNED_READ_LENGTH reverse
snapshot sortByAlignedLengthReverse.png
# Sort by aligned length at an insertion
goto chr17:7,571,487
sort ALIGNED_READ_LENGTH
snapshot sortByAlignedLengthInsertion.png
sort ALIGNED_READ_LENGTH reverse
snapshot sortByAlignedLengthInsertionReverse.png

anoronh4 commented 1 year ago

Thanks, that is helpful and works pretty well in many cases! Looking forward to the new release.