upsetjs / upsetjs_r

😠 htmlwidget R bindings for UpSet.js for rendering UpSet plots, Euler, and Venn Diagrams
https://upset.js.org/integrations/r

Accept binary matrix input or create helper function for conversion #10

Closed moldach closed 4 years ago

moldach commented 4 years ago

A number of tools, like SURVIVOR, create a binary matrix. Tools like ComplexHeatmap's UpSet plot can accept input either from A) lists of sets (like upsetjs_r) or B) a binary matrix.

It would be nice if upsetjs_r could either accept input from binary matrices or contain a helper function to convert binary matrix data.
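A helper of the kind requested could be sketched in base R as follows. This is only an illustration: `binary_to_sets` is a hypothetical name, not a function from upsetjs, and the sketch assumes rows are elements, columns are sets, and a value of 1 marks membership.

```r
# Hypothetical helper (not part of upsetjs): turn a 0/1 membership matrix
# into a named list of sets, one entry per column.
binary_to_sets <- function(mat) {
  mat <- as.matrix(mat)
  ids <- rownames(mat)
  # Fall back to row numbers when the input has no row names
  if (is.null(ids)) ids <- as.character(seq_len(nrow(mat)))
  sets <- lapply(seq_len(ncol(mat)), function(j) ids[mat[, j] == 1])
  names(sets) <- colnames(mat)
  sets
}

mat <- matrix(c(1, 1, 0,
                0, 0, 1,
                0, 1, 1),
              ncol = 3, byrow = TRUE,
              dimnames = list(NULL, c("set1", "set2", "set3")))
sets <- binary_to_sets(mat)
print(sets)
```

The resulting named list has the same shape as the list-of-sets input that upsetjs_r already accepts.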

I don't believe ComplexHeatmap (or any other package I know of) has a function to do this at the moment: https://support.bioconductor.org/p/131616/

sgratzl commented 4 years ago

can you provide a sample dataset, please?

moldach commented 4 years ago

Sure:

library(tibble)
mat <- tribble(
  ~set1, ~set2, ~set3,
   1,   1,   0,
   0,   0,   1,
   0,   1,   1,
   0,   0,   1,
   0,   0,   1,
   0,   1,   1,
   1,   0,   1,
   0,   1,   1,
   0,   0,   1,
   0,   0,   1,
   1,   1,   1,
   1,   0,   0,
   0,   0,   1,
   0,   1,   0,
   1,   1,   1,
   0,   1,   0,
   0,   1,   1,
   0,   1,   0,
   0,   0,   1,
   0,   0,   1
)

So either accept that type of data or a helper function to convert it into a format that upsetjs accepts, e.g.:

library(upsetjs)
set.seed(123)
listInput <- list(a = sample(letters, 5),
                  b = sample(letters, 10),
                  c = sample(letters, 15))
upsetjs() %>% fromList(listInput) %>% interactiveChart()

sgratzl commented 4 years ago

https://upset.js.org/integrations/r/articles/basic.html#data-frame-input doesn't work?

moldach commented 4 years ago

Sorry, I never saw that documentation. I still cannot find a link to it in the README of the GitHub repo.

One more thing, I've noticed the following discrepancy.

library(tibble)
t <- tribble(
  ~set1, ~set2, ~set3,
   1,   1,   0,
   0,   0,   1,
   0,   1,   1,
   0,   0,   1,
   0,   0,   1,
   0,   1,   1,
   1,   0,   1,
   0,   1,   1,
   0,   0,   1,
   0,   0,   1,
   1,   1,   1,
   1,   0,   0,
   0,   0,   1,
   0,   1,   0,
   1,   1,   1,
   0,   1,   0,
   0,   1,   1,
   0,   1,   0,
   0,   0,   1,
   0,   0,   1
)

Both VennDiagram and ComplexHeatmap produce the same result, but upsetjs produces a different image.

VennDiagram

library(VennDiagram)
venn.diagram(list(Set1=which(t[,1]==1), 
                  Set2=which(t[,2]==1), 
                  Set3=which(t[,3]==1)), 
             fill = c("#DDAA33", "#BB5566" ,"#004488"), 
             alpha = c(0.5, 0.5, 0.5), cex = 2, lty =2, 
             filename = "VennDiagram.tiff")

[Image: VennDiagram output]

ComplexHeatmap

library(ComplexHeatmap)
m = make_comb_mat(t)
ss = set_size(m)
UpSet(m, set_order = order(ss), comb_order = order(-comb_size(m)))

[Image: ComplexHeatmap output]

UpSetJS

upsetjs() %>% fromDataFrame(t)

[Image: UpSetJS output]

According to ComplexHeatmap's documentation:

intersect mode: 1 means in that set and 0 is not taken into account, then, 1 1 0 means a set of elements in set A and B, and they can also in C or not in C (intersect(A, B)). Under this mode, the seven combination sets can overlap.

so,

m = make_comb_mat(t, mode="intersect")
ss = set_size(m)
UpSet(m, 
           set_order = order(ss),
           #comb_order = order(comb_degree(m), -cs),
           comb_order = order(-comb_size(m)))

[Image: ComplexHeatmap output, intersect mode]

So for ComplexHeatmap the default mode is distinct:

distinct mode: 1 means in that set and 0 means not in that set, then 1 1 0 means a set of elements both in set A and B, while not in C (setdiff(intersect(A, B), C)). Under this mode, the seven combination sets are the seven partitions in the Venn diagram and they are mutually exclusive.
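To make the difference between the two modes concrete, here is a minimal base R sketch that counts a combination directly from the sample matrix above. `comb_count` is a hypothetical helper written for this illustration. It is not how ComplexHeatmap or upsetjs compute their values.

```r
# Sample membership matrix from this thread (rows = elements, columns = sets)
m <- matrix(c(
  1,1,0,  0,0,1,  0,1,1,  0,0,1,  0,0,1,
  0,1,1,  1,0,1,  0,1,1,  0,0,1,  0,0,1,
  1,1,1,  1,0,0,  0,0,1,  0,1,0,  1,1,1,
  0,1,0,  0,1,1,  0,1,0,  0,0,1,  0,0,1
), ncol = 3, byrow = TRUE,
   dimnames = list(NULL, c("set1", "set2", "set3")))

# Hypothetical helper: size of one combination pattern in either mode.
# pattern is a 0/1 vector with one entry per set (column).
comb_count <- function(m, pattern, mode = c("distinct", "intersect")) {
  mode <- match.arg(mode)
  # Rows that belong to every set marked 1 in the pattern
  in_all <- rowSums(m[, pattern == 1, drop = FALSE] == 1) == sum(pattern)
  if (mode == "intersect") return(sum(in_all))
  # Distinct mode additionally requires absence from every set marked 0
  in_none <- rowSums(m[, pattern == 0, drop = FALSE] == 1) == 0
  sum(in_all & in_none)
}

d <- comb_count(m, c(1, 1, 0), "distinct")   # elements in set1 and set2 only
i <- comb_count(m, c(1, 1, 0), "intersect")  # elements in set1 and set2, set3 ignored
print(c(distinct = d, intersect = i))
```

For the pattern 1 1 0, the distinct count is 1 and the intersect count is 3, because two of the three rows in both set1 and set2 are also in set3.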

How can you mimic the distinct output produced by VennDiagram and ComplexHeatmap with UpSetJS?

sgratzl commented 4 years ago

so to understand it correctly

m = make_comb_mat(t)
ss = set_size(m)
UpSet(m, set_order = order(ss), comb_order = order(-comb_size(m)))

if you label the vertical bars in this chart it would read the following:

right?

jokergoo commented 4 years ago

E.g., in ComplexHeatmap, for a pattern

A B C
x x -

the "distinct" mode corresponds to setdiff(intersect(A, B), C) and the "intersect" mode corresponds to intersect(A, B),

and similar for the pattern:

A B C
x - -

the "distinct" mode corresponds to setdiff(setdiff(A, B), C), or setdiff(A, union(B, C)), the "intersect" mode corresponds to intersect(A) which is A.
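The same expressions can be tried on three small example sets in base R. The sets below are made up purely for illustration.

```r
# Made-up example sets
A <- c("a", "b", "c", "d")
B <- c("b", "c", "e")
C <- c("c", "f")

# Pattern "x x -"
ab_intersect <- intersect(A, B)              # "intersect" mode
ab_distinct  <- setdiff(intersect(A, B), C)  # "distinct" mode

# Pattern "x - -": the two "distinct" formulations are equivalent
a_distinct_nested <- setdiff(setdiff(A, B), C)
a_distinct_union  <- setdiff(A, union(B, C))

print(ab_intersect)
print(ab_distinct)
print(a_distinct_nested)
```

Here intersect(A, B) is b, c, while the distinct version drops c because it is also in C. For the single-set pattern, both distinct formulations give a, d.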

You can find the graphical explanations here:

[Image: graphical explanation of the modes]

moldach commented 4 years ago

Thank you for the clarification, @jokergoo.

sgratzl commented 4 years ago

thx. I just implemented a generateDistinctIntersections() function.

can you verify that https://upset.js.org/next/integrations/r/articles/combinationModes.html is now showing the values you would expect in the different modes?

moldach commented 4 years ago

That's awesome @sgratzl , thanks for being so quick with this enhancement!

The function may not have been included in the most recent build though.

I deleted the folder from R/win-library/4.0 and tried to reinstall via:

devtools::install_url("https://github.com/upsetjs/upsetjs_r/releases/latest/download/upsetjs.tar.gz")

but I get an error:

t <- read.csv("data/survivor_comparison_matrix.csv")
library(upsetjs)
upsetjs() %>% 
  fromDataFrame(t) %>% 
  generateDistinctIntersections() %>%   
  interactiveChart()

Error in generateDistinctIntersections(.) : could not find function "generateDistinctIntersections"

sgratzl commented 4 years ago

yeah it is not released yet but in the develop branch. You can download a current build e.g. from https://github.com/upsetjs/upsetjs.github.io/raw/master/next/integrations/r/upsetjs.tar.gz

sgratzl commented 4 years ago

version 1.1.0 has been released containing the generateDistinctIntersections function
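A quick way to check whether an installed build is new enough before calling the function might look like the sketch below. `has_distinct_mode` is a hypothetical helper name. The sketch only compares version strings against 1.1.0 and skips the check when upsetjs is not installed.

```r
# Hypothetical helper: TRUE if the given version is at least 1.1.0,
# the release stated above to contain generateDistinctIntersections
has_distinct_mode <- function(version) {
  utils::compareVersion(as.character(version), "1.1.0") >= 0
}

# Only query the installed version if the package is actually available
if (requireNamespace("upsetjs", quietly = TRUE)) {
  print(has_distinct_mode(utils::packageVersion("upsetjs")))
}
```

If this prints FALSE, reinstalling from the latest release should make the function available.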