Open cmdcolin opened 2 years ago
could also think about multibigwig/strandedplot https://github.com/bhofmei/jbplugin-strandedplot style results but this request is more about automatically calculating from the reads themselves
example image from twitter thread
For showing the overall coverage, I expect this would involve adding an additional fwd/reverse feature filter:
My guess:
const feats = [...features.values()]
const coverage_fwd = feats.filter( f => (f.get('type') !== 'skip') && (f.get('strand') == 1) )
const coverage_rev = feats.filter( f => (f.get('type') !== 'skip') && (f.get('strand') == -1) )
const skips_fwd = feats.filter( (f => f.get('type') === 'skip') && (f.get('strand') == 1) )
const skips_rev = feats.filter( (f => f.get('type') === 'skip') && (f.get('strand') == -1) )
// Use two pass rendering, which helps in visualizing the SNPs at higher
// bpPerPx First pass: draw the gray background
ctx.fillStyle = colorForBase.total
// forward reads (shown above the Y axis)
for (let i = 0; i < coverage_fwd.length; i++) {
const feature = coverage_fwd[i]
const [leftPx, rightPx] = featureSpanPx(feature, region, bpPerPx)
const w = rightPx - leftPx + 0.3
const score = feature.get('score') as number
ctx.fillRect(leftPx, toY(score), w, toHeight(score))
}
// reverse reads (shown below the Y axis)
for (let i = 0; i < coverage_rev.length; i++) {
const feature = coverage_rev[i]
const [leftPx, rightPx] = featureSpanPx(feature, region, bpPerPx)
const w = rightPx - leftPx + 0.3
const score = feature.get('score') as number
ctx.fillRect(leftPx, toY(score), w, toY - toHeight(score))
}
[if the filtered coverage tracks can be assumed to be the same length, then these rectangles could be done within the same loop]
Here is a reference assembly and BAM file that can be used for testing (the same one that I used in the JBrowse2 picture). This BAM file is useful because it shows stranded long reads with a very high dynamic range in coverage, so the log representation is necessary to fully appreciate the expression pattern:
GRCm39.primary_assembly.chrM.fa.gz
mm2_GCFeb22_S14_called_BC07_vs_chrM.bam.gz
edit: updated BAM file
Should look something like this:
I needed to dig down to the SNPCoverageAdapter bin function to create additional scores for forward and reverse strand, in order to get this working. It's also a bastardisation of the scale (as you may be able to tell by looking at the Y axis).
... although it does seem to work well on log scale as well:
great initiative diving into the codebase:)! could possibly use negative values but those will not log transform well...will probably need a real notion of having multiple signals in a single track a la multibigwig and https://github.com/bhofmei/jbplugin-strandedplot
I've attached a diff (based on commit 85973e53). This creates a new StrandedRelCoverage
Display Type and associated adapters (based on LinearSNPCoverage
), which has a stranded coverage track and stranded variant display.
I suppose this would be better designed as a modification of the existing LinearSNPCoverage
track, but I didn't want to break existing functionality when creating something different.
patch-stranded-snp-coverage.diff.gz
Note that in the highlighted column, the 'C' variants are broken by an 'A' variant in the middle on the opposite strand. This could be improved (e.g. by sorting variants based on their frequency), but I think this is at least good enough for others to understand what I'd like to achieve.
Another +1
for this from @haessar today. He said, I believe, that he uses differences in coverage between strands to help with annotating UTRs. Here's a screenshot of this feature Artemis:
definitely a good use case there. could possibly leverage multiwiggle display for this purpose (as they are conceptually two different signals)
from thread here https://twitter.com/gringene_bio/status/1505780926734774275
would create separate coverage for positive and negative strand reads