hms-dbmi / UpSetR

An R implementation of the UpSet set visualization technique published by Lex, Gehlenborg, et al.
https://cran.rstudio.com/web/packages/UpSetR

17 sets, how to display all intersections? #170

Open johnsolk opened 5 years ago

johnsolk commented 5 years ago

Hello, thank you for this useful package! I like the ability to condense many Venn diagrams into one visualization. I'm having a problem displaying all intersections between 17 sets. Here is my presence/absence matrix, 'pa'.

The example with 4 sets works, nicely showing a bar crossing all 4 sets:

upset(pa, sets = c(c("A_xenica","F_catanatus"),c("F_chrysotus","F_diaphanus")),mb.ratio = c(0.55, 0.45),keep.order = F)

[Image: 4upsetr]

But as soon as I move to 6, I am confused - there is no longer a bar crossing all 6 sets:

upset(pa, sets = c(c("A_xenica","F_catanatus","F_notatus"), c("F_chrysotus","F_diaphanus","F_nottii")),mb.ratio = c(0.55, 0.45),nsets = 6,keep.order = F)

[Image: 6upsetr]

17 sets does not work either:

upset(pa, sets = c(c("A_xenica","F_catanatus"),c("F_chrysotus","F_diaphanus"),c("F_grandis","F_heteroclitusMDPL"),c("F_heteroclitusMDPP","F_notatus"),c("F_nottii","F_olivaceous"),c("F_parvapinis","F_rathbuni"),c("F_sciadicus","F_similis"),c("F_zebrinus","L_goodei"),c("L_parva")),mb.ratio = c(0.55, 0.45),keep.order = F)

[Image: 17_upsetr]

The numbers do not match up. There should be a large group of genes in common between all 17 sets, as I have calculated and (badly) visualized with separate Venn diagrams.

Can you please tell me how to modify my code so that all 17 intersections are considered? (Should have a bar with dots crossing each set.)

Thank you.

JakeConway commented 5 years ago

There could be some intersections of size 0. To show these as well, use empty.intersections = TRUE

johnsolk commented 5 years ago

Thank you for your reply. It is true that some intersections are empty, although I'm not much interested in those. I am expecting large intersections (in the thousands of genes) between all 17, 16, 15, etc. groups, none of which I am seeing in this plot.

Any insights into why all intersections are not shown, or ideas how to fix this in my upset command?

if(!file.exists('presence_absence.csv')){download.file('https://raw.githubusercontent.com/johnsolk/RNAseq_15killifish/master/evaluation/presence_absence.csv', 'presence_absence.csv')}
pa <- read.csv("presence_absence.csv")
pa <- pa[,-c(1)]
rownames(pa) <- pa$Ensembl
pa <- pa[,-c(1)]
test<- rowSums(pa)
test<- as.data.frame(test)
colnames(test) <- c("sum")
sum(test$sum == 17)
sum(test$sum == 16)
sum(test$sum == 15)
sum(test$sum == 14)

Output:

> sum(test$sum == 17)
[1] 10897
> sum(test$sum == 16)
[1] 2979
> sum(test$sum == 15)
[1] 1585
> sum(test$sum == 14)
[1] 1247

SergeyBaikal commented 3 years ago

I faced this problem too. I have not found a solution.

SergeyBaikal commented 3 years ago

And why are you using a command with nested c() calls like this? upset(pa, sets = c(c("A_xenica","F_catanatus"),c("F_chrysotus","F_diaphanus")),mb.ratio = c(0.55, 0.45),keep.order = F)

Why not list all the set names in a single vector? upset(pa, sets = c("A_xenica","F_catanatus", "F_chrysotus", "F_diaphanus"),mb.ratio = c(0.55, 0.45),keep.order = F)
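
For what it's worth, the two ways of writing the sets argument should be equivalent, because c() flattens nested character vectors. A quick check in base R, using set names from the original post:

```r
# Nested c() calls, as in the original command
nested <- c(c("A_xenica", "F_catanatus"), c("F_chrysotus", "F_diaphanus"))

# All set names in a single vector
flat <- c("A_xenica", "F_catanatus", "F_chrysotus", "F_diaphanus")

# c() concatenates its arguments into one flat vector,
# so both forms pass the same value to `sets`
identical(nested, flat)
```

So the nesting is only a matter of style and is unlikely to be the cause of the missing bars.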

SergeyBaikal commented 3 years ago

I think I found the answer. See the "Mode" section of the ComplexHeatmap UpSet plot documentation: https://jokergoo.github.io/ComplexHeatmap-reference/book/upset-plot.html

cmonat commented 2 years ago

Hello,

Sorry for jumping into this conversation, but I think I face a similar problem.

First, empty.intersections=TRUE is not working for me: the plot doesn't change and no other intersections are added. I've tried empty.intersection = "on", empty.intersection = T, and empty.intersection = TRUE, and none of them works.

Second, I saw the 'Mode' section of the ComplexHeatmap reference, but I cannot work out where to change the mode to set it to distinct. Any idea how to do so?

Thank you in advance. Regards C.
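
To make the "mode" question concrete, here is a minimal base-R sketch of the difference between the distinct and intersect modes described in the ComplexHeatmap reference. The toy table and its column names are made up for illustration. As far as I can tell, UpSetR has no mode argument and its bars correspond to the distinct counts; in ComplexHeatmap the mode is chosen through the mode argument of make_comb_mat().

```r
# Hypothetical toy presence/absence table: 4 features (rows), 3 sets (columns)
toy <- data.frame(
  A = c(1, 1, 0, 1),
  B = c(1, 1, 1, 0),
  C = c(1, 0, 1, 0)
)
m <- as.matrix(toy) == 1

# "distinct" mode: features in A and B and in no other set
distinct_AB <- sum(m[, "A"] & m[, "B"] & !m[, "C"])

# "intersect" mode: features in A and B, regardless of C
intersect_AB <- sum(m[, "A"] & m[, "B"])

distinct_AB   # 1
intersect_AB  # 2
```

For the combination of all sets the two modes give the same count, which is why a check like sum(rowSums(pa) == 17) can be compared directly with the bar that spans all 17 sets.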

fdchevalier commented 1 year ago

Hello everyone,

To solve this issue, set nintersects = NA. This overrides the default value of 40 (@johnsolk, you can see that there are only 40 intersections on your last two graphs).

@cmonat, this is very late, but you were missing the s at the end of empty.intersections in your last 3 attempts. In addition, it looks like empty.intersections does not accept logical values.

Hope this helps.
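
A minimal sketch of what that call can look like, using a made-up toy data frame in place of 'pa' (the toy data and its column names are assumptions for illustration). The order.by = "freq" argument is an optional addition, not part of the fix itself; it puts the largest intersections first, which may help when there are many bars.

```r
# Hypothetical stand-in for 'pa': 200 features (rows), 6 sets (columns), values 0/1
set.seed(1)
toy <- as.data.frame(matrix(rbinom(200 * 6, 1, 0.7), ncol = 6))
colnames(toy) <- paste0("set", 1:6)

# Only draw the plot if UpSetR is installed
if (requireNamespace("UpSetR", quietly = TRUE)) {
  UpSetR::upset(
    toy,
    sets = colnames(toy),
    nintersects = NA,          # plot all intersections instead of the default 40
    order.by = "freq",         # optional: largest intersections first
    mb.ratio = c(0.55, 0.45)
  )
}
```

With 17 sets the number of non-empty intersections can be large, so the resulting plot may need a wide graphics device to stay readable.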

Burak-Progenis commented 5 months ago

thanks @fdchevalier ! nintersects = NA solved the problem!

Do you also know how to extract list of intersecting genes? Thanks!

fdchevalier commented 5 months ago

Glad this was useful @Burak-Progenis.

I don't think there is a way to extract intersecting sets from the UpSetR object. I usually do that manually, using the same dataset I provide for the upset plot. I usually have my features in rows and samples in columns, and I search for features that are present in the samples of interest.

Hope this helps.
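
A minimal base-R sketch of that manual approach, assuming a 0/1 table with features as row names and samples as columns. The toy table and the helper name members_of are hypothetical and not part of UpSetR.

```r
# Hypothetical toy presence/absence table: genes in rows, samples in columns
toy <- data.frame(
  A = c(1, 1, 0, 1),
  B = c(1, 1, 1, 0),
  C = c(1, 0, 1, 0),
  row.names = c("gene1", "gene2", "gene3", "gene4")
)

# Hypothetical helper: features present in every column of `in_sets`
# and absent from all other columns (one bar of the upset plot)
members_of <- function(df, in_sets) {
  m <- as.matrix(df) == 1
  out_sets <- setdiff(colnames(m), in_sets)
  present <- rowSums(m[, in_sets, drop = FALSE]) == length(in_sets)
  absent <- rowSums(m[, out_sets, drop = FALSE]) == 0
  rownames(m)[present & absent]
}

members_of(toy, c("A", "B"))       # "gene2"
members_of(toy, c("A", "B", "C"))  # "gene1"
```

Dropping the absent condition would instead return every feature shared by the chosen samples, whether or not it also occurs elsewhere.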