hms-dbmi / UpSetR

An R implementation of the UpSet set visualization technique published by Lex, Gehlenborg, et al.
https://cran.rstudio.com/web/packages/UpSetR

17 sets, how to display all intersections? #170

Open johnsolk opened 4 years ago

johnsolk commented 4 years ago

Hello, thank you for this useful package! I like the ability to condense many Venn diagrams down into one visualization. I'm having a problem displaying all intersections between 17 sets. Here is my presence/absence matrix, 'pa'.

The example with 4 sets works, nicely showing a bar crossing all 4 sets:

upset(pa, sets = c(c("A_xenica","F_catanatus"),c("F_chrysotus","F_diaphanus")),mb.ratio = c(0.55, 0.45),keep.order = F)

[Image: 4upsetr]

But as soon as I move to 6, I am confused - there is no longer a bar crossing all 6 sets:

upset(pa, sets = c(c("A_xenica","F_catanatus","F_notatus"), c("F_chrysotus","F_diaphanus","F_nottii")),mb.ratio = c(0.55, 0.45),nsets = 6,keep.order = F)

[Image: 6upsetr]

17 sets does not work either:

upset(pa, sets = c(c("A_xenica","F_catanatus"),c("F_chrysotus","F_diaphanus"),c("F_grandis","F_heteroclitusMDPL"),c("F_heteroclitusMDPP","F_notatus"),c("F_nottii","F_olivaceous"),c("F_parvapinis","F_rathbuni"),c("F_sciadicus","F_similis"),c("F_zebrinus","L_goodei"),c("L_parva")),mb.ratio = c(0.55, 0.45),keep.order = F)

[Image: 17_upsetr]

The numbers do not match up. There should be a large group of genes in common between all 17 sets, as I have calculated and (badly) visualized with separate Venn diagrams.

Can you please tell me how to modify my code so that all 17 intersections are considered? (Should have a bar with dots crossing each set.)

Thank you.

JakeConway commented 4 years ago

There could be some intersections that are size 0. To also show these use empty.intersections = TRUE

johnsolk commented 4 years ago

Thank you for your reply. It is true that some intersections are empty, but I'm not much interested in those. I am expecting large intersections (in the thousands) between all 17, 16, 15, etc. groups, none of which I am seeing in this plot.

Any insights into why all intersections are not shown, or ideas how to fix this in my upset command?

if(!file.exists('presence_absence.csv')){download.file('https://raw.githubusercontent.com/johnsolk/RNAseq_15killifish/master/evaluation/presence_absence.csv', 'presence_absence.csv')}
pa <- read.csv("presence_absence.csv")
pa <- pa[,-c(1)]
rownames(pa) <- pa$Ensembl
pa <- pa[,-c(1)]
test<- rowSums(pa)
test<- as.data.frame(test)
colnames(test) <- c("sum")
sum(test$sum == 17)
sum(test$sum == 16)
sum(test$sum == 15)
sum(test$sum == 14)

Output:

> sum(test$sum == 17)
[1] 10897
> sum(test$sum == 16)
[1] 2979
> sum(test$sum == 15)
[1] 1585
> sum(test$sum == 14)
[1] 1247

SergeyBaikal commented 3 years ago

I faced this problem too. I have not found a solution.

SergeyBaikal commented 3 years ago

And why are you using a command like this? upset(pa, sets = c(c("A_xenica","F_catanatus"),c("F_chrysotus","F_diaphanus")),mb.ratio = c(0.55, 0.45),keep.order = F)

Why not list all the set names in a single vector? upset(pa, sets = c("A_xenica","F_catanatus", "F_chrysotus", "F_diaphanus"),mb.ratio = c(0.55, 0.45),keep.order = F)
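
For what it's worth, the two spellings in the commands above should give upset() the same `sets` argument, because c() flattens nested vectors. A quick base R check, using the set names from this thread:

```r
# c() flattens its arguments, so nesting c() inside c() changes nothing.
nested <- c(c("A_xenica", "F_catanatus"), c("F_chrysotus", "F_diaphanus"))
flat   <- c("A_xenica", "F_catanatus", "F_chrysotus", "F_diaphanus")

identical(nested, flat)  # TRUE
```

So the nested form is only redundant; it is unlikely to be the cause of the missing bars.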

SergeyBaikal commented 3 years ago

I think I found the answer. See the description of "mode" in the ComplexHeatmap UpSet chapter: https://jokergoo.github.io/ComplexHeatmap-reference/book/upset-plot.html

cmonat commented 1 year ago

Hello,

sorry for jumping into this conversation but I face a similar problem I think.

But first, on one hand, empty.intersections=TRUE is not working for me: the plot doesn't change and no other intersections are added. I've tried empty.intersection = "on", empty.intersection = T, and empty.intersection = TRUE; none of them works.

On the other hand, I saw the 'Mode' from ComplexHeatmap-reference but I cannot understand where to change the mode to put it on distinct, any idea how to do so?
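
A note on "mode": as far as I can tell, that setting belongs to ComplexHeatmap's make_comb_mat() and not to UpSetR's upset(), whose bars appear to count each row under its exact membership pattern (what ComplexHeatmap calls "distinct"). The base R sketch below shows the difference between the two ways of counting. The `toy` table is made up for illustration and is not the killifish data.

```r
# Hypothetical presence/absence table: genes in rows, sets in columns.
toy <- data.frame(
  A = c(1, 1, 1, 0, 1),
  B = c(1, 1, 0, 1, 1),
  C = c(1, 0, 0, 1, 1),
  row.names = paste0("gene", 1:5)
)

# "Distinct" counting: every gene falls into exactly one membership pattern,
# written here as a string such as "110" (in A and B, not in C).
pattern <- apply(toy, 1, paste, collapse = "")
distinct_counts <- table(pattern)

# "Intersect" counting: genes in both A and B, whatever their status in C.
intersect_AB <- sum(toy$A == 1 & toy$B == 1)

distinct_counts["110"]  # genes in A and B only
intersect_AB            # genes in A and B, possibly also in C
```

Under distinct counting a gene present in all sets is counted only in the all-sets bar, so the bars for smaller combinations can look much shorter than a Venn-style overlap would suggest.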

Thank you in advance. Regards C.

fdchevalier commented 10 months ago

Hello everyone,

To solve this issue, set nintersects = NA. This overrides the default value of 40 (@johnsolk, you can see that there are only 40 intersections on your last two graphs).
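
A minimal sketch of this fix on made-up data (the `toy` table and its `set1` to `set6` columns are hypothetical stand-ins for `pa`). It first counts how many non-empty membership patterns the data contain, which is the number of bars to expect once the cap is lifted, and then plots only if UpSetR is installed:

```r
# Hypothetical 0/1 table with 6 sets; random, so many patterns occur.
set.seed(1)
toy <- as.data.frame(matrix(rbinom(200 * 6, 1, 0.5), ncol = 6))
colnames(toy) <- paste0("set", 1:6)

# Number of distinct non-empty membership patterns present in the data.
pattern <- apply(toy, 1, paste, collapse = "")
n_patterns <- length(unique(pattern[rowSums(toy) > 0]))
n_patterns

# nintersects = NA asks upset() to draw every intersection instead of the
# first 40; the call is skipped when the package is not available.
if (requireNamespace("UpSetR", quietly = TRUE)) {
  UpSetR::upset(toy, sets = colnames(toy), nintersects = NA, order.by = "freq")
}
```

With 17 sets the number of patterns can be very large, so the resulting plot may be wide; sorting with order.by = "freq" at least puts the biggest intersections first.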

@cmonat, this is very late but you had a missing s at the end of empty.intersections on your last 3 attempts. In addition, it looks like empty.intersections does not accept logical values.

Hope this helps.

Burak-Progenis commented 4 weeks ago

thanks @fdchevalier ! nintersects = NA solved the problem!

Do you also know how to extract the list of intersecting genes? Thanks!

fdchevalier commented 4 weeks ago

Glad this was useful @Burak-Progenis.

I don't think there is a way to extract intersecting sets from the UpSetR object. I usually do that manually, using the same dataset I provide for the upset plot. I usually have my features in rows and samples in columns, and I search for features that are present in the samples of interest.
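
A rough base R sketch of that manual approach, assuming a 0/1 table with features in rows and samples in columns. The `toy` table, the sample names and the gene names are invented for illustration:

```r
# Hypothetical presence/absence table: features in rows, samples in columns.
toy <- data.frame(
  s1 = c(1, 1, 0, 1),
  s2 = c(1, 1, 1, 0),
  s3 = c(1, 0, 1, 0),
  row.names = c("geneA", "geneB", "geneC", "geneD")
)

of_interest <- c("s1", "s2")
others <- setdiff(colnames(toy), of_interest)

# TRUE for features present in every sample of interest.
present <- rowSums(toy[, of_interest, drop = FALSE]) == length(of_interest)

# Features shared by the samples of interest, whatever the other samples say.
in_all <- rownames(toy)[present]

# Features shared by the samples of interest and absent from all others,
# which should correspond to a single bar of the upset plot.
exclusive <- rownames(toy)[present & rowSums(toy[, others, drop = FALSE]) == 0]

in_all     # "geneA" "geneB"
exclusive  # "geneB"
```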

Hope this helps.