mixOmicsTeam / mixOmics

Development repository for the Bioconductor package 'mixOmics'
http://mixomics.org/

block.methods consensus plot #57

Closed: mixOmicsTeam closed this issue 4 years ago

mixOmicsTeam commented 4 years ago

Could we add the following option to the multi-omics integration methods (2+ datasets)?

rep.space = 'consensus'

This would be the average of the components across all data sets.
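A rough sketch of what such an averaged representation could look like, using made-up component scores rather than real mixOmics output. The `variates` list and its values are assumptions for illustration only:

```r
# Toy component scores for two blocks: the same 4 samples, 2 components each.
# These matrices are invented for illustration; they are not mixOmics output.
variates <- list(
  mrna  = matrix(c(1, 2, 3, 4,  0, 1, 0, 1), ncol = 2,
                 dimnames = list(paste0("s", 1:4), c("comp1", "comp2"))),
  mirna = matrix(c(3, 2, 1, 0,  2, 1, 2, 1), ncol = 2,
                 dimnames = list(paste0("s", 1:4), c("comp1", "comp2")))
)

# Element-wise mean across blocks: sum the matrices, divide by the block count.
consensus <- Reduce(`+`, variates) / length(variates)
print(consensus)
```

Each row of `consensus` is then one sample's position in the shared space, which is what a 'consensus' plot would display.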

Thanks Al :)

aljabadi commented 4 years ago

Consensus and Weighted Consensus plots added in https://github.com/mixOmicsTeam/mixOmics/commit/d1eb37052df5744fad5e0984a6bc1e0d686ef625

aljabadi commented 4 years ago

The following are now supported:

library(mixOmics)
data("breast.TCGA")
data = list(mrna = breast.TCGA$data.train$mrna, mirna = breast.TCGA$data.train$mirna,
            protein = breast.TCGA$data.train$protein)
design = matrix(1, ncol = length(data), nrow = length(data),
                dimnames = list(names(data), names(data)))
list.keepX = list(mrna = rep(20, 2), mirna = rep(10, 2), protein = rep(10, 2))
TCGA.block.splsda = block.splsda(X = data, Y = breast.TCGA$data.train$subtype,
                                 ncomp = 2, keepX = list.keepX, design = design)

## consensus plot only
plotIndiv(TCGA.block.splsda, ind.names = FALSE, blocks = "consensus", ellipse = TRUE)
## consensus and weighted consensus side by side
plotIndiv(TCGA.block.splsda, ind.names = FALSE,
          blocks = c("consensus", "weighted.consensus"), ellipse = TRUE)
## individual blocks together with both consensus plots
plotIndiv(TCGA.block.splsda, ind.names = FALSE,
          blocks = c(names(data), "consensus", "weighted.consensus"))

Example to demonstrate the effects of weights on consensus plots:

data("breast.TCGA")
data = list(mrna = breast.TCGA$data.train$mrna, mirna = breast.TCGA$data.train$mirna)
design = matrix(0, ncol = length(data), nrow = length(data),
                dimnames = list(names(data), names(data)))
list.keepX = lapply(data, function(x) c(2, 2))

## replace the second dataset (mirna) with random noise so the effect of the weights can be benchmarked
data[2] <- lapply(data[2], FUN = function(x){
    matrix(rnorm(n = prod(dim(x))), nrow = nrow(x), dimnames = dimnames(x))
})

TCGA.block.splsda = block.splsda(X = data, Y = breast.TCGA$data.train$subtype,
                                 ncomp = 2, keepX = list.keepX, design = design)

## function to calculate the average silhouette width for each class
## from plotIndiv()$df (requires the 'cluster' package)
consensus_silhouette <- function(diablo_plot) {
    class_silhouette <- cluster::silhouette(x = as.integer(as.factor(diablo_plot$df$group)),
                                            dist = dist(diablo_plot$df[, c("x", "y")]))
    summary(class_silhouette)$clus.avg.widths
}

## do the silhouette widths improve with weighted consensus?
## (exact values vary between runs because the noise block is random)
consensus_silhouette(plotIndiv(TCGA.block.splsda, blocks = "consensus"))
#         1         2         3
# 0.2052980 0.1453035 0.2658106
consensus_silhouette(plotIndiv(TCGA.block.splsda, blocks = "weighted.consensus"))
#         1         2         3
# 0.3695039 0.2766906 0.3446154
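
To illustrate why weighting can help when one block is mostly noise, here is a minimal sketch of a weighted average of component scores. It is not the package's implementation: the `weighted_consensus()` helper, the toy `variates` and the `weights` matrix (one row per block, one column per component) are all hypothetical.

```r
# Made-up scores: one informative block and one noisy block (4 samples, 2 components).
variates <- list(
  signal = matrix(c(-2, -2, 2, 2,  1, -1, 1, -1), ncol = 2),
  noise  = matrix(c( 2, -2, 2, -2, 0,  0, 0,  0), ncol = 2)
)
# Hypothetical weights: rows are blocks, columns are components.
weights <- rbind(signal = c(0.9, 0.8), noise = c(0.1, 0.2))

# Hypothetical helper: weighted mean of the blocks' component scores.
weighted_consensus <- function(variates, weights) {
  # Normalise each component's weights so they sum to 1 across blocks.
  w <- sweep(weights, 2, colSums(weights), "/")
  # Scale every block's columns by that block's weights, then sum over blocks.
  scaled <- lapply(names(variates), function(b) {
    sweep(variates[[b]], 2, w[b, ], "*")
  })
  Reduce(`+`, scaled)
}

wc    <- weighted_consensus(variates, weights)
plain <- Reduce(`+`, variates) / length(variates)  # unweighted mean for comparison
print(wc)
print(plain)
```

In this toy case the unweighted mean lets the noisy block cancel part of the first component of the informative block, while the weighted version stays close to the informative block's scores.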