igvteam / igv-webapp

IGV Web App
https://igv.org/app
MIT License

Extracting insertion/deletion data from aligned region of interest #281

Open jsromanowski opened 6 months ago

jsromanowski commented 6 months ago

Hello IGV team,

I am currently analyzing amplicons from an insect genomic DNA sample edited with CRISPR/Cas9. I am aware IGV offers insertion and deletion data by selecting a specific base pair of your aligned read(s). Is it possible to access and export this data for a highlighted region of interest from IGV?

Side note: I am aware this data is contained within .bam files, however extracting the information in a clean way is not so simple and can get a bit messy (e.g. with mpileup offered by bcftools). I would also like to add that I do not intend to use this data in a clinical or therapeutic context, but instead am using it to analyze gene editing in insects - so although I'm aware this web app is not particularly intended for this purpose, the ability to export this data from IGV would be extremely useful and sufficient for my application.

Thanks,

Joe

jrobinso commented 6 months ago

Extracting the data with samtools is pretty trivial if you know the region:

samtools view <in.bam> chr1:100-200

What information are you looking for, and in what format?

jsromanowski commented 6 months ago

With samtools view, I can see the alignment results (mismatches, insertions, deletions) and count any or all of these over a given range (e.g. chr1:100-200). However, I would lose base-pair resolution: if my region is chr1:100-200, I cannot see indel counts for chr1:100, chr1:101, chr1:102, and so on. By manually clicking a position on my aligned reads in the IGV web app, I can obtain this information, but that does not scale to many alignment files. Is there a faster, higher-throughput way to do this via IGV or some package offered by IGV?

The output format is not particularly important, but something simple such as columns for reference position, insertions, and deletions would be ideal.

Thanks for the quick response!
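For illustration, the per-position table described above can be sketched in plain Python by walking the CIGAR string of each SAM record. This is a minimal sketch, not an IGV feature: the function name `count_indels` is hypothetical, the input is assumed to be SAM text lines already restricted to a single reference sequence (for example, the output of `samtools view` on one region), and an insertion is attributed to the reference base immediately to its left, which is an arbitrary choice.

```python
import re
from collections import defaultdict

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def count_indels(sam_lines, start, end):
    """Count insertions and deletions per 1-based reference position.

    Returns {position: [insertions, deletions]} for positions in [start, end].
    Assumes all records align to the same reference sequence.
    """
    counts = defaultdict(lambda: [0, 0])
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag, pos, cigar = int(fields[1]), int(fields[3]), fields[5]
        if flag & 0x4 or cigar == "*":    # skip unmapped reads
            continue
        ref = pos                         # next reference base to be consumed
        for length, op in CIGAR_OP.findall(cigar):
            n = int(length)
            if op == "I":
                # Assumption: anchor the insertion on the base to its left.
                anchor = ref - 1
                if start <= anchor <= end:
                    counts[anchor][0] += 1
            elif op == "D":
                # A deletion is counted at every reference base it spans.
                for p in range(ref, ref + n):
                    if start <= p <= end:
                        counts[p][1] += 1
                ref += n
            elif op in "MN=X":
                ref += n
            # S, H and P do not consume reference bases.
    return counts


# Two made-up records, for demonstration only.
example = [
    "r1\t0\tchr1\t100\t60\t5M2I3M1D4M\t*\t0\t0\t*\t*",
    "r2\t0\tchr1\t102\t60\t3M1D6M\t*\t0\t0\t*\t*",
]
table = count_indels(example, 100, 200)
print("position\tinsertions\tdeletions")
for position in sorted(table):
    ins, dels = table[position]
    print(position, ins, dels, sep="\t")
```

Reading lines from standard input instead of the `example` list would let the same function consume `samtools view` output directly.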

jrobinso commented 6 months ago

I would find it really surprising if there isn't a simple tool to do this, but I assume you have looked. This isn't the sort of thing we design IGV for, which is a visual tool, but leave this open and we will consider it for the future.

jrobinso commented 6 months ago

BTW, on a tangent here: where would you expect an insertion between positions 10 and 11 to be counted?
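To make the ambiguity concrete: in a SAM record an insertion consumes no reference bases, so it has no reference coordinate of its own, only a left and a right flanking base. The sketch below (the helper name `insertion_sites` is hypothetical) reports both flanks and leaves the choice to the caller. VCF, for comparison, anchors an insertion on the preceding base, which would be position 10 here. Nothing in this sketch reflects how IGV itself counts insertions.

```python
import re


def insertion_sites(pos, cigar):
    """Yield (left_base, right_base) 1-based reference positions
    flanking each insertion in a read aligned at `pos`."""
    ref = pos  # next reference base to be consumed
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        if op == "I":
            yield (ref - 1, ref)
        elif op in "MDN=X":
            ref += int(length)
        # S, H and P do not consume reference bases.


# A read starting at position 1 with 10 matches, then a 3-base insertion.
print(list(insertion_sites(1, "10M3I5M")))  # [(10, 11)]
```

Whichever flank is chosen, applying it consistently across all samples matters more than the choice itself, since per-position counts are only comparable under the same convention.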

jsromanowski commented 6 months ago

Great question - I am exploring variant calling programs to count these. Out of curiosity, how does IGV do this?