hammerlab / cycledash

Variant Caller Analysis Dashboard and Data Management System

RNASeq expression worker #853

Open ihodes opened 8 years ago

ihodes commented 8 years ago

@iskandr:

PGV relies on allele-specific quantification of expression from RNAseq. How can that data be shown in Cycledash?

from https://github.com/hammerlab/cycledash/issues/852

arahuja commented 8 years ago

Some near-term ways to address this might be:

ihodes commented 8 years ago

cf. https://github.com/hammerlab/cycledash/issues/790 for multiple BAMs in pileup

allow an RNA BAM to be specified on run submission

Right now you can submit as many BAMs as you'd like; would you want a tag on a BAM designating it as RNA data?

arahuja commented 8 years ago

Yes, you can submit any number of BAMs, but you can't submit a run with multiple tumor BAMs:

image

But assuming the "multiple BAMs in a pileup" feature allows you to choose any other BAM from the project, I guess it doesn't matter too much. Still, I could also see you wanting to tie a run to a specific RNA BAM to get the allele-specific stats. Tagging a BAM as RNA also makes sense.

ihodes commented 8 years ago

That makes sense; thanks for the explanation!

armish commented 8 years ago

@iskandr: what are the summary statistics that you would like to show once the RNA BAM feature is implemented? Some measure of within-sample normalized expression for a variant? Or simply the number of reads supporting that particular variant?

Let me know if this needs an offline conversation for a thorough discussion.

iskandr commented 8 years ago

I think that (1) the number of reads that contain a variant, (2) the number of reads overlapping the locus, and (3) the normalized expression value for that variant would all be good to see. What do you think?
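To make the three quantities concrete, here is a minimal sketch, not Cycledash code. It assumes reads are already parsed into a hypothetical `Read` record that aligns without gaps, takes "locus" to mean the single variant position, and uses variant reads per million mapped reads as a stand-in normalization. The thread does not settle on definitions for either of the last two, so treat both as assumptions.

```python
from dataclasses import dataclass


@dataclass
class Read:
    """Hypothetical, simplified read: ungapped alignment only."""
    start: int      # 0-based reference position of the first aligned base
    sequence: str   # read bases


def variant_read_stats(reads, position, alt_base, total_mapped_reads):
    """Return (alt_count, depth, alt_reads_per_million) for one position.

    alt_count: reads overlapping `position` that carry `alt_base` there
    depth:     all reads overlapping `position`
    alt_rpm:   alt_count scaled per million mapped reads (assumed normalization)
    """
    depth = 0
    alt_count = 0
    for read in reads:
        offset = position - read.start
        if 0 <= offset < len(read.sequence):
            depth += 1
            if read.sequence[offset] == alt_base:
                alt_count += 1
    alt_rpm = alt_count * 1_000_000 / total_mapped_reads
    return alt_count, depth, alt_rpm


# Toy data: three reads overlap position 103, two of them carry a T there.
reads = [
    Read(start=100, sequence="ACGTA"),
    Read(start=102, sequence="TTACG"),
    Read(start=103, sequence="CCCGA"),
    Read(start=110, sequence="GGGGG"),
]
alt_count, depth, alt_rpm = variant_read_stats(
    reads, position=103, alt_base="T", total_mapped_reads=2_000_000
)
print(alt_count, depth, alt_rpm)
```

A real worker would read alignments from the RNA BAM and would need to handle gapped and spliced alignments, which this sketch ignores.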

armish commented 8 years ago

1) Definitely a go-go; 2) We have to define what a locus is for our purposes. Do you mean counting all reads falling into the gene locus for the corresponding transcript? 3) Would be great to have that, but I first have to read up on normalizing expression within a single sample. Did your manual PGV pipeline have this already?

armish commented 8 years ago

@ihodes: OK for me to start working on this? I am planning to add an RNASeq BAM file field to runs as a first step just to quickly get it done. I think we can ideally have a more flexible BAM submission for users where they can attach any number of BAMs with a controlled label (e.g. Normal, Tumor, RNASeq, etc.) but that can be our next goal. What do you think?
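As a rough illustration of the "controlled label" idea, the sketch below attaches any number of BAMs to a run, each under a label drawn from a fixed vocabulary. Everything here is hypothetical: `BamLabel`, `Run`, `attach_bam`, and the file names are invented for the example and do not reflect Cycledash's actual schema or API.

```python
from dataclasses import dataclass, field
from enum import Enum


class BamLabel(Enum):
    """Hypothetical controlled vocabulary for BAM labels."""
    NORMAL = "Normal"
    TUMOR = "Tumor"
    RNASEQ = "RNASeq"


@dataclass
class Run:
    """Hypothetical run record holding labeled BAM URIs."""
    vcf_uri: str
    bams: dict = field(default_factory=dict)  # BamLabel -> list of BAM URIs

    def attach_bam(self, label, uri):
        # BamLabel(...) raises ValueError for labels outside the vocabulary.
        key = BamLabel(label)
        self.bams.setdefault(key, []).append(uri)

    def bams_with_label(self, label):
        return list(self.bams.get(BamLabel(label), []))


run = Run(vcf_uri="example.vcf")
run.attach_bam("Tumor", "tumor.bam")
run.attach_bam("RNASeq", "rna.bam")
print(run.bams_with_label("RNASeq"))
```

Keeping a list per label leaves room for several BAMs of the same kind, while the enum rejects free-form labels at submission time.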

ihodes commented 8 years ago

@armish I'd defer to @iskandr as to what he needs—I had the impression we're waiting on a new tool he's developing next week, and that existing tools wouldn't get us what we need.