Spoke w/ @timodonnell via email. Asked to post an issue
varlens-reads can be used to subset an input .bam containing only those reads that overlap specific coordinate flags (including coordinates provided by .vcf files). A newly created .bam can be refined to include only those reads that overlap a locus, and meet certain additional filters such as strandedness or MapQ scores
I'm wondering whether it would be possible to also add the capability to varlens (or Isovar for that matter) to subset an input .bam to only those reads overlapping a specified locus, and containing a specified base at said locus. For example
Output result.bam would consist of all reads that overlap Chr22:46932040-46932041 (irrespective of start/end sites), and contain the specified base 'A' at this position. If possible, the process could also accept coordinates and bases supplied by a.vcf. However, in when a .vcf is suppled the input result.bam contains those only reads that contain the ALT base, at loci specified by the .vcf
The use-case I have in mind is to generate allele-specific coverage tracks for a particular locus (or set of loci) to compare across treatment conditions. The purpose being facile comparison of--for example--the locus of a putative neoantigen locus pre-/post-treatement showing an immuno-editing event.
The closest example of what I'm thinking of would be the typical CHIP-Seq peak data visualized using GenomeBrowser (e.g. see below), where comparison of multiple pileup "tracks" each corresponding to treatment groups, where a gain/loss of a peak in one/more tracks is indicative of some biologic process.
Spoke w/ @timodonnell via email. Asked to post an issue
varlens-reads
can be used to subset an input.bam
containing only those reads that overlap specific coordinate flags (including coordinates provided by.vcf
files). A newly created.bam
can be refined to include only those reads that overlap a locus, and meet certain additional filters such as strandedness or MapQ scoresPer the documentation:
I'm wondering whether it would be possible to also add the capability to varlens (or Isovar for that matter) to subset an input
.bam
to only those reads overlapping a specified locus, and containing a specified base at said locus. For exampleOutput
result.bam
would consist of all reads that overlap Chr22:46932040-46932041 (irrespective of start/end sites), and contain the specified base 'A' at this position. If possible, the process could also accept coordinates and bases supplied by a.vcf
. However, in when a.vcf
is suppled the inputresult.bam
contains those only reads that contain the ALT base, at loci specified by the.vcf
The use-case I have in mind is to generate allele-specific coverage tracks for a particular locus (or set of loci) to compare across treatment conditions. The purpose being facile comparison of--for example--the locus of a putative neoantigen locus pre-/post-treatement showing an immuno-editing event.
The closest example of what I'm thinking of would be the typical CHIP-Seq peak data visualized using GenomeBrowser (e.g. see below), where comparison of multiple pileup "tracks" each corresponding to treatment groups, where a gain/loss of a peak in one/more tracks is indicative of some biologic process.
http://www.bloodjournal.org/content/bloodjournal/127/24/2991/F2.large.jpg?width=800&height=600&carousel=1