Closed ChristopherEeles closed 2 years ago
Dear Christopher,
Thank you very much for your very detailed bug report. The problem was in the normalization step, since right before the cbind
call always returns a matrix, and that normalization step was dropping the dimensions when there was only a single column (sample) as in your example. So, just adding drop=FALSE
in the associated subsetting operation fixed the problem, which has been pushed to the 1.45.1 devel version and 1.44.1 release version and should become available via BiocManager::install()
in the next 24/48 hours. Please let me know if this fixes completely your use case or additional fixes are needed.
Hi @rcastelo,
Thanks for the prompt fix. Everything is working as expected now with 1.45.1.
Best, Chris
Hi @rcastelo,
Trying to use the
GSVA
package to put univariate drug response expression biomarkers into the pathway context for use in molecular tumour boards. Our use case allows a researcher or clinician to upload the expression profile (logTPM or RMA normalized microarray) of a patient and use the data to query PharmacoDB.ca for a list of biomarkers associated with response to a given drug therapy from in vitro high throughput drug screening studies in cancer cell lines.We want to allow clinicians to take these univariate biomarkers and put them into a pathway context to see if the gene associated with drug response is in a pathway known to be involved in their patient's cancer or cancer more generally. To this end, we are trying to use the ssGSEA method via GSVA to identify relevant gene sets from a list we compiled via
msigdbr
.I have found a bug, in the sense that I can reproducibly cause an error in the package by passing a single column matrix into the GSVA method. However, it is not clear to me from the package vignette or a brief review of Barbie _et _al.__ (2009) if the method is actually intended to work with a literal single patient's gene expression profile.
Minimal Reprex
The actual bug is caused because the matrix gets simplified to a vector before trying to assign rownames on line 885 of gsva.R, inside the
ssgsea
function, and could be fixed by convertinges
, which is a vector when the bug occurs, into a single column matrix with something likeif (n == 1 && !is.matrix(es)) es <- matrix(es, ncol=1, dimnames=list(names(es), colnames(X))
.It is worth noting that the bug doesn't occur when
length(geneSets) == 1
, since the condition is already convertinges
back to being a matrix.I wanted to confirm that the method is able to handle enrichment analysis using only one patient profile before I open a PR with a fix.
If the method does support the single patient case and you would like a PR please let me know. If it does not and/or you have suggestions for an alternative approach it would be appreciated.
We actually already have a "reference" gene expression population that the user uploads to identify genes of interest, so maybe we could add the mean of that population as a reference population.
R Environment
Best, Christopher Eeles Software Developer Haibe-Kains Lab | PM-Research University Health Network