lwaldron closed this issue 6 years ago
Levi, I'll try to function as a scribe.
On Sun, Oct 22, 2017 at 12:26 AM, Levi Waldron notifications@github.com wrote:
This SIG will discuss recent and needed Bioconductor data classes. Some recent or in-testing data classes to discuss are:
- MultiAssayExperiment (for "gluing" different types of assays together)
- RaggedExperiment (for copy number, mutations, or other data represented by different genomic ranges for each sample)
- restfulSE::RESTfulSummarizedExperiment, restfulSE::BQSummarizedExperiment (for remote storage + local interactive analysis of very large datasets)
One presently identified need is a Bioconductor class for representing the drug sensitivity data from pharmacogenomics studies. Studies such as the Cancer Cell Line Encyclopedia (CCLE) and NCI-60 perform standard -omics assays, but also dose-response experiments where cell lines are subjected to varying doses of each of numerous compounds, and the responses are measured as cell viability. The resulting dose-response curves are then summarized using measures such as LC-50. The PharmacoGx https://bioconductor.org/packages/PharmacoGx/ Bioconductor package from the @bhaibeka https://github.com/bhaibeka lab provides numerous curated pharmacogenomics datasets as rich PharmacoSet objects, but these lack the flexibility and novel data storage models that would be available using a SummarizedExperiment-derived object for sensitivity data contained along with -omics assays within a MultiAssayExperiment. Therefore a desired outcome from this SIG is a draft class definition for cell line drug sensitivity data extending from SummarizedExperiment. This would yield both a needed new data class and experience for participants in extending existing core data structures to novel data types.
Topic leader: Levi Waldron @lwaldron https://github.com/lwaldron Scribe: Vincent Carey @vjcitn https://github.com/vjcitn (Vince can I volunteer you?)
Any interested participants are invited to use the issue to ask questions, suggest other relevant topics for discussion, and/or express their interest in participating.
I strongly support this initiative of course. Many of these datasets are now available (see picture) and although PharmacoGx::PharmacoSet objects do their job, they do not deal efficiently with data access and storage.
@p-smirnov has deep experience with these pharmacogenomics datasets and would be interested in contributing.
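Available datasets: [image: screen shot 2017-10-23 at 7 39 43 am] https://user-images.githubusercontent.com/594954/31887189-65623074-b7c5-11e7-95fb-f34814f0035c.png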
Hi Ben -- where is that image from? Public domain? I am working on a proposal that might benefit from the elegance. Thanks, Vince
I drew the picture from scratch, feel free to reuse. For more, you can borrow any slides from here: https://www.pmgenomics.ca/bhklab/research/presentations
I would like to attend, just waiting for confirmation from the conference about registration. It would be great for PharmacoGx to leverage the MultiAssayExperiment class for data storage, and it would better integrate our package into Bioconductor.
@p-smirnov haven't you received your invitation email yet?
@lgatto I searched through my email and found it from last Friday. It was sorted out of my inbox so I missed seeing it.
Great you can come @p-smirnov, I'm really looking forward to it!
Initial agenda. I understand now from Laurent's comment below that we have four hours, 1-5pm. So here is a tentative schedule. I've scheduled more time for the pharmacogenomics component only because I know the measurable outcome that will hopefully come from it, but I certainly don't mind rebalancing if the VariantExperiment discussion needs more time.
SummarizedExperiment. Want 1+ assays for summary measures like LC-50 (rows = compounds, columns = cell lines), but also want to store the complete dose-response data (for example, one assay with rows = compounds, columns = cell lines, 3rd dimension = dose, and a second assay with the third dimension = response?). What additional requirements would be added to SummarizedExperiment?
PharmacoSet in PharmacoGx
Outcomes:
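For anyone wanting to experiment with the layout floated in the agenda, here is a minimal sketch in R. It is not a proposed class definition: the compound and cell line names, the number of doses, and the random viability values are all made up, and mean viability merely stands in for a real summary measure such as LC-50. The final step assumes that SummarizedExperiment accepts arrays with more than two dimensions as assays, provided the first two dimensions agree across assays; it is skipped if the package is not installed.

```r
# Toy dimensions; every name and value here is invented for illustration.
compounds  <- c("drugA", "drugB")
cell_lines <- c("line1", "line2", "line3")
n_doses    <- 4

# Assay 1: the dose applied, indexed compound x cell line x dose step.
dose <- array(
  rep(c(0.01, 0.1, 1, 10), each = length(compounds) * length(cell_lines)),
  dim = c(length(compounds), length(cell_lines), n_doses),
  dimnames = list(compounds, cell_lines, paste0("dose", seq_len(n_doses)))
)

# Assay 2: the response (viability, percent of control) at the matching dose.
set.seed(1)
viability <- array(
  runif(length(dose), min = 0, max = 100),
  dim = dim(dose),
  dimnames = dimnames(dose)
)

# Summary assay: one value per compound x cell line. Mean viability across
# doses is only a placeholder for a measure such as LC-50.
mean_viability <- apply(viability, c(1, 2), mean)

# Optional: wrap everything in a SummarizedExperiment if it is available.
if (requireNamespace("SummarizedExperiment", quietly = TRUE)) {
  se <- SummarizedExperiment::SummarizedExperiment(
    assays = list(
      mean_viability = mean_viability,
      dose           = dose,
      viability      = viability
    )
  )
  print(dim(se))  # rows = compounds, columns = cell lines
}
```

Keeping dose and response as two parallel arrays with identical dimnames means that subsetting by compound or cell line stays aligned across all assays. One limitation of this layout is that every compound and cell line pair must share the same number of dose steps, which real screens may not satisfy.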
From how I read the schedule, I think we have two hours? Or can we extend this to use two sessions?
Yes, it's meant to run from 1pm to 5pm. We will be serving coffee at 3pm, but people are free to grab a cup and continue as they see fit.
I feel bad that I can't make it to this SIG. I guess it's not feasible for me to attend remotely? Looking forward to the minutes.
@lawremi nothing stops you from using Hangouts and a Google Doc/Etherpad for remote participation.
@lawremi if you're willing to attend any of it between 1-5pm UK time (5-9am west coast time?), we'd certainly appreciate your presence.
Unfortunately I'll be in Australia and I think that's 12-4 AM so probably not. I'll at least be trying to sleep ;)
A gist providing some dose-viability data to play with. PDF output
source("https://gist.githubusercontent.com/lwaldron/ab3e6ab3ddc8815a01e3c46969aad130/raw/b85ab86c5b9ec3de1ec07d9dd33b1d01400edc29/FIMMdose-viability.R")
pset2se(fimm)
And some slides for pharmacogenomics and for on-disk data structures
Here's the link for the benchmarking work by Mike Smith we touched upon: http://www.msmith.de/2017/11/17/10x-1/
I had volunteered to be a scribe for this meeting. Very rudimentary notes are at
https://docs.google.com/document/d/15FWsVlQEGUTn5ys0GRL56ixOHzG04J7kq1IPMiRQyKM/edit?usp=sharing
@bhaibeka @p-smirnov @vjcitn want to continue this BOF at Bioc2018 in July?
This issue was moved to Bioconductor/BioC2018#8
This SIG will discuss recent and needed Bioconductor data classes. Some recent or in-testing data classes to discuss are:
- MultiAssayExperiment (for "gluing" different types of assays together)
- RaggedExperiment (for copy number, mutations, or other data represented by different genomic ranges for each sample)
- restfulSE::RESTfulSummarizedExperiment, restfulSE::BQSummarizedExperiment (for remote storage + local interactive analysis of very large datasets)
One presently identified need is a Bioconductor class for representing the drug sensitivity data from pharmacogenomics studies such as the Cancer Cell Line Encyclopedia (CCLE) and NCI-60. These studies perform standard -omics assays, but also dose-response experiments where cell lines are subjected to varying doses of each of numerous compounds. Responses are measured as cell viability, and the resulting dose-response curves are summarized using measures such as LC-50. The full dose-response data are a 3-D array (dose x time x cell line), which should be stored in addition to summary measure matrices (e.g. LC-50 concentration x cell line). The PharmacoGx Bioconductor package from the @bhaibeka lab provides numerous curated pharmacogenomics datasets as rich PharmacoSet objects, but these lack the flexibility and novel data storage models that would be available using a SummarizedExperiment-derived object for sensitivity data contained along with -omics assays within a MultiAssayExperiment. Therefore a desired outcome from this SIG is a draft class definition for cell line drug sensitivity data extending from SummarizedExperiment. This would yield both a needed new data class and experience for participants in extending existing core data structures to novel data types.
Topic leader: Levi Waldron @lwaldron Scribe: Vincent Carey @vjcitn (Vince can I volunteer you?)
Any interested participants are invited to use the issue to ask questions, suggest other relevant topics for discussion, and/or express their interest in participating.
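To make the "curve summarized by a single measure" step concrete, here is a small base R sketch. The helper name `dose_at_viability` is hypothetical, the dose and viability values are invented, and linear interpolation on the log10 dose scale is just one simple choice; it is not the curve-fitting method used by PharmacoGx or by any of the studies mentioned. It also assumes viability changes monotonically with dose.

```r
# Hypothetical helper: estimate the dose at which viability crosses a
# threshold (default 50%) by interpolating log10(dose) against viability.
dose_at_viability <- function(dose, viability, threshold = 50) {
  stopifnot(length(dose) == length(viability), all(dose > 0))
  # approx() yields NA when the threshold lies outside the observed
  # viability range, i.e. the curve never crosses it at the tested doses.
  log_dose <- approx(x = viability, y = log10(dose), xout = threshold)$y
  10^log_dose
}

# Invented example curve for one compound / cell line pair.
dose      <- c(0.01, 0.1, 1, 10)
viability <- c(95, 80, 40, 10)

half_dose <- dose_at_viability(dose, viability)
print(half_dose)

# A curve that never drops to 50% gives NA rather than an extrapolation.
never_crosses <- dose_at_viability(dose, c(99, 95, 90, 85))
print(never_crosses)
```

Applying such a function over each compound and cell line pair would produce the kind of summary matrix described above, with NA marking pairs where the tested doses did not reach the threshold.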