Request: Add sample weighting

AMChalkie commented 9 years ago

It would be awesome to have sample weighting as an option on the fly.

genomematt commented 9 years ago

I second that. Weighting should be considered an essential step.

drpowell commented 9 years ago

Using manually specified weights? Or, something else?

AMChalkie commented 9 years ago

I'm thinking:

Liu R, Holik AZ, Su S, Jansz N, Chen K, Leong HS, Blewitt ME, Asselin-Labat ML, Smyth GK, Ritchie ME. Why weight? Modelling sample and observational level variability improves power in RNA-seq analyses. Nucleic Acids Res. 2015 Apr. pii: gkv412. [Epub ahead of print] http://www.ncbi.nlm.nih.gov/pubmed/25925576

Abstract

Variations in sample quality are frequently encountered in small RNA-sequencing experiments, and pose a major challenge in a differential expression analysis. Removal of high variation samples reduces noise, but at a cost of reducing power, thus limiting our ability to detect biologically meaningful changes. Similarly, retaining these samples in the analysis may not reveal any statistically significant changes due to the higher noise level. A compromise is to use all available data, but to down-weight the observations from more variable samples. We describe a statistical approach that facilitates this by modelling heterogeneity at both the sample and observational levels as part of the differential expression analysis. At the sample level this is achieved by fitting a log-linear variance model that includes common sample-specific or group-specific parameters that are shared between genes. The estimated sample variance factors are then converted to weights and combined with observational level weights obtained from the mean-variance relationship of the log-counts-per-million using 'voom'. A comprehensive analysis involving both simulations and experimental RNA-sequencing data demonstrates that this strategy leads to a universally more powerful analysis and fewer false discoveries when compared to conventional approaches. This methodology has wide application and is implemented in the open-source 'limma' package.
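
To make the weighting idea in the abstract concrete, here is a rough toy sketch in Python. It is not limma's implementation: the function names `sample_weights` and `combine_weights` are hypothetical, the per-sample variance factor is approximated by the mean squared residual across genes rather than by fitting the log-linear variance model the paper describes, and combining the two kinds of weights by multiplication is an assumption of this sketch.

```python
import math


def sample_weights(residuals):
    """Crude per-sample quality weights from a genes x samples residual table.

    Assumption: each sample's variance factor is approximated by its mean
    squared residual across genes, on a log scale centred at zero. This is
    a stand-in for the log-linear variance model in the paper, not a copy
    of it. Every sample must have at least one non-zero residual.
    """
    n_genes = len(residuals)
    n_samples = len(residuals[0])
    # Mean squared residual per sample, pooled over all genes.
    msr = [sum(row[j] ** 2 for row in residuals) / n_genes
           for j in range(n_samples)]
    # Log variance factors, centred so they sum to zero across samples.
    log_var = [math.log(m) for m in msr]
    mean_log = sum(log_var) / n_samples
    gamma = [lv - mean_log for lv in log_var]
    # A weight is the inverse of the variance factor: noisy samples get less.
    return [math.exp(-g) for g in gamma]


def combine_weights(obs_weights, samp_weights):
    """Scale each observation-level weight by its sample's weight."""
    return [[w * s for w, s in zip(row, samp_weights)]
            for row in obs_weights]


# Toy data: three genes, three samples; the third sample is much noisier.
residuals = [
    [0.1, -0.1, 1.0],
    [-0.2, 0.2, -2.0],
    [0.1, -0.1, 1.0],
]
weights = sample_weights(residuals)
combined = combine_weights([[1.0, 2.0, 3.0]], weights)
```

In this toy data the third sample ends up with a much smaller weight than the other two, so it is down-weighted rather than dropped, which is the compromise the abstract argues for.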

drpowell commented 9 years ago

Ahh, right - I recall Matt's talk at ABiC last year.

genomematt commented 9 years ago

You also missed my VLSCI 'how my collaborators got it wrong due to one outlier' talk.

drpowell commented 9 years ago

My invite must have got lost in the mail ;)

drpowell commented 9 years ago

Added in 0a811f0fac5ec5edca0ed13a4f1e5e4d6b62c595. Still need to add a bar graph of the inferred sample weights.

Victorian-Bioinformatics-Consortium / degust

Request: Add sample weighting #25