dusadrian / venn

Draw Venn Diagrams

Access intersection data of Venn diagram #24

Closed chrissy005 closed 1 year ago

chrissy005 commented 1 year ago

I have used the venn package to create a Venn diagram with 6 sets.

However, I am unable to access the intersection data between the sets.

The following is the code that I have used.

# Create Venn diagram with dotted lines
venn.95 <- venn(core.genus.all.clusters.95,
                zcolor = c("slateblue", "chocolate1", "lawngreen", "palevioletred3", "gold", "turquoise3"),
                ggplot = TRUE,
                linetype = "dotted")

Could you advise on how I may access the intersection data in the Venn diagram?

mmahmoudian commented 1 year ago

Can you elaborate and explain in more detail what you mean by accessing the intersections?

chrissy005 commented 1 year ago

venn.core.genus.95.pdf

Attached is the Venn diagram created. I would like to know the specific elements that are shared between the 6 groups, apart from just getting the counts.

dusadrian commented 1 year ago

Dear @chrissy005, we need a (minimal) replicable example to figure out exactly what you use as an input, and only you know what the object "core.genus.all.clusters.95" contains.

To show it is indeed possible to get what (I think) you are looking for, in the examples of function venn(), you can find one that uses a list as an input:

set.seed(12345)
x <- list(First = 1:20, Second = 10:30, Third = sample(25:50, 15))

x
# $First
#  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
# 
# $Second
#  [1] 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
# 
# $Third
#  [1] 38 43 40 35 26 47 30 31 34 41 48 32 49 44 25

There are elements specific to only one set, while other elements are common between at least two sets etc. The following code returns the intersections:

y <- venn(x, ilabels = "counts")

y
#                    First Second Third counts
#                        0      0     0      0
# Third                  0      0     1     12
# Second                 0      1     0      7
# Second:Third           0      1     1      3
# First                  1      0     0      9
# First:Third            1      0     1      0
# First:Second           1      1     0     11
# First:Second:Third     1      1     1      0

For instance there are three elements that are common between the second and the third sets, as per:

intersect(x[[2]], x[[3]])
# [1] 25 26 30
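
Note that intersect() on two sets returns everything the two sets share, whether or not those elements also occur in a third set, while each row of the table above counts the elements belonging to exactly that combination of sets. A minimal base-R sketch of the distinction, with the values of Third hard-coded from the printout above (an assumption made so the result does not depend on the random number generator):

```r
# Toy sets from the example above; Third is hard-coded from the printed output
x <- list(
  First  = 1:20,
  Second = 10:30,
  Third  = c(38, 43, 40, 35, 26, 47, 30, 31, 34, 41, 48, 32, 49, 44, 25)
)

# Everything shared by Second and Third, regardless of First
shared_2_3 <- intersect(x$Second, x$Third)

# Only the elements in Second and Third that are absent from First,
# i.e. the exclusive "Second:Third" zone counted in the table
zone_2_3 <- setdiff(shared_2_3, x$First)

# Same idea for the exclusive "First:Second" zone
zone_1_2 <- setdiff(intersect(x$First, x$Second), x$Third)

length(zone_2_3)
length(zone_1_2)
```

Here both lengths agree with the counts column (3 and 11) because no element belongs to all three sets; when a triple overlap exists, the plain intersect() result and the exclusive zone would differ.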

Is this what you are looking for?

Adrian

mmahmoudian commented 1 year ago

@dusadrian Are you interested in having an additional function in the venn package to produce a list of all intersections? I have already written a function that just needs some cleaning up. I can create a PR if you are interested.

dusadrian commented 1 year ago

Hello @mmahmoudian, please find the latest commit with a new function extractInfo(). Hope this solves both remaining issues.

chrissy005 commented 1 year ago

Dear @dusadrian and @mmahmoudian,

I apologize for not providing enough information on the issue.

The object used to construct the venn diagram (core.genus.all.clusters.95) is a list of character vectors (Bacterial genera) as follows:

$North.Thailand [1] "AAP99" "Ahniella"
[3] "Calenema" "Calothrix PCC-6303"
[5] "Candidatus Chloroploca" "Chloracidobacterium"
[7] "Chloroflexus" "Chthonomonas"
[9] "Desulfomicrobium" "Elioraea"
[11] "Fimbriiglobus" "GBChlB"
[13] "Meiothermus" "Raineya"
[15] "Roseiflexus" "Rubritepida"
[17] "Thermus" "Unassigned A4b (Family)"
[19] "Unassigned Armatimonadota (Phylum)" "Unassigned Bacteria (Kingdom)"
[21] "Unassigned Bacteroidia (Class)" "Unassigned Cyanobacteriales (Order)"
[23] "Unassigned Gemmataceae (Family)" "Unassigned Hydrogenophilaceae (Family)"
[25] "Unassigned Phormidiaceae (Family)" "Unassigned Pseudanabaenaceae (Family)"
[27] "Unassigned RBG-13-54-9 (Order)" "Unassigned RD017 (Order)"
[29] "Unassigned Saprospiraceae (Family)" "Unassigned SBR1031 (Order)"
[31] "Unassigned WD2101 soil group (Family)" "Venenivibrio"

$North.Malaysia [1] "Ahniella" "Candidatus Chloroploca"
[3] "Candidatus Gloeomargarita" "Candidatus Xiphinematobacter"
[5] "Chloracidobacterium" "Chloroflexus"
[7] "Chthonomonas" "Curvibacter"
[9] "Elioraea" "Exilispira"
[11] "GBChlB" "Gemmata"
[13] "Ignavibacterium" "IheB3-7"
[15] "Meiothermus" "Methylothermus"
[17] "Raineya" "Rhodovarius"
[19] "Roseiflexus" "Rubritepida"
[21] "Sandaracinobacter" "Telmatocola"
[23] "Tepidimonas" "Unassigned A4b (Family)"
[25] "Unassigned Acetothermiia (Class)" "Unassigned Acidobacteriae (Class)"
[27] "Unassigned Anaerolineaceae (Family)" "Unassigned Armatimonadota (Phylum)"
[29] "Unassigned Bacteria (Kingdom)" "Unassigned Bacteroidetes VC2.1 Bac22 (Order)"
[31] "Unassigned Bacteroidia (Class)" "Unassigned Burkholderiales (Order)"
[33] "Unassigned Chitinophagales (Order)" "Unassigned Eurycoccales (Order)"
[35] "Unassigned Gemmataceae (Family)" "Unassigned Hydrogenophilaceae (Family)"
[37] "Unassigned Kapabacteriales (Order)" "Unassigned Microscillaceae (Family)"
[39] "Unassigned Paludibaculum (Order)" "Unassigned Phormidiaceae (Family)"
[41] "Unassigned Planctomycetes (Class)" "Unassigned Pseudanabaenaceae (Family)"
[43] "Unassigned RBG-13-54-9 (Order)" "Unassigned RD017 (Order)"
[45] "Unassigned Saprospiraceae (Family)" "Unassigned SBR1031 (Order)"
[47] "Unassigned Sphingobacteriales (Order)" "Unassigned Sva0485 (Phylum)"
[49] "Unassigned vadinHA49 (Class)" "Unassigned WD2101 soil group (Family)"

$South.Malaysia [1] "AAP99" "Ahniella"
[3] "Calenema" "Calothrix PCC-6303"
[5] "Candidatus Chloroploca" "Candidatus Gloeomargarita"
[7] "Chloracidobacterium" "Chthonomonas"
[9] "Cytophaga" "DSSF69"
[11] "Elioraea" "GBChlB"
[13] "Gemmata" "Ignavibacterium"
[15] "Leptolyngbya ANT.L52.2" "Meiothermus"
[17] "Phaselicystis" "Raineya"
[19] "Rhodovarius" "Rivibacter"
[21] "Roseiflexus" "Rubritepida"
[23] "Sandaracinobacter" "Telmatocola"
[25] "Tepidimonas" "Thermoflexibacter"
[27] "Turneriella" "Unassigned A4b (Family)"
[29] "Unassigned Acidobacteriae (Class)" "Unassigned Anaerolineaceae (Family)"
[31] "Unassigned Armatimonadota (Phylum)" "Unassigned Bacteria (Kingdom)"
[33] "Unassigned Bacteroidia (Class)" "Unassigned Chitinophagales (Order)"
[35] "Unassigned Cytophagales (Order)" "Unassigned Flavobacteriales (Order)"
[37] "Unassigned Gemmataceae (Family)" "Unassigned Kapabacteriales (Order)"
[39] "Unassigned mle1-27 (Order)" "Unassigned Phormidiaceae (Family)"
[41] "Unassigned Planctomycetes (Class)" "Unassigned Pseudanabaenaceae (Family)"
[43] "Unassigned RBG-13-54-9 (Order)" "Unassigned RD017 (Order)"
[45] "Unassigned Rhodanobacteraceae (Family)" "Unassigned S-BQ2-57 soil group (Order)"
[47] "Unassigned Saprospiraceae (Family)" "Unassigned SJA-28 (Order)"
[49] "Unassigned SM1A07 (Order)" "Unassigned Sutterellaceae (Family)"
[51] "Unassigned WD2101 soil group (Family)"

$Singapore [1] "Ahniella" "Caldimonas"
[3] "Candidatus Alysiosphaera" "Candidatus Chloroploca"
[5] "Chloroflexus" "Curvibacter"
[7] "Cytophaga" "Defluviicoccus"
[9] "DSSD61" "Elioraea"
[11] "Fischerella PCC-9339" "Geitlerinema PCC-8501"
[13] "Gemmata" "Ignavibacterium"
[15] "Leptolyngbya ANT.L52.2" "Meiothermus"
[17] "Methylothermus" "Microvirga"
[19] "MTP1" "Raineya"
[21] "Rivibacter" "Roseiflexus"
[23] "Sandaracinobacter" "SM1A02"
[25] "Streptococcus" "Telmatocola"
[27] "Tepidimonas" "Thermodesulfovibrio"
[29] "Thermoflexibacter" "Thermosynechococcus BP-1"
[31] "Turneriella" "Unassigned 11-24 (Order)"
[33] "Unassigned A4b (Family)" "Unassigned Acetothermiia (Class)"
[35] "Unassigned Acidobacteriae (Class)" "Unassigned Alphaproteobacteria (Class)"
[37] "Unassigned Anaerolineaceae (Family)" "Unassigned Armatimonadota (Phylum)"
[39] "Unassigned Bacteria (Kingdom)" "Unassigned Bacteroidia (Class)"
[41] "Unassigned BSV26 (Family)" "Unassigned Chloroflexaceae (Family)"
[43] "Unassigned Cytophagales (Order)" "Unassigned Gammaproteobacteria (Class)"
[45] "Unassigned Gemmataceae (Family)" "Unassigned Gemmatimonadaceae (Family)"
[47] "Unassigned Hydrogenophilaceae (Family)" "Unassigned Kapabacteriales (Order)"
[49] "Unassigned Methylacidiphilaceae (Family)" "Unassigned Microscillaceae (Family)"
[51] "Unassigned mle1-27 (Order)" "Unassigned Myxococcaceae (Family)"
[53] "Unassigned Paludibaculum (Order)" "Unassigned RBG-13-54-9 (Order)"
[55] "Unassigned Saprospiraceae (Family)" "Unassigned SJA-15 (Order)"
[57] "Unassigned Sphingobacteriales (Order)" "Unassigned Sutterellaceae (Family)"
[59] "Unassigned Thermodesulfovibrionia (Class)" "Unassigned WD2101 soil group (Family)"
[61] "Unassigned Woesearchaeales (Order)" "Venenivibrio"

$South.Thailand [1] "AAP99" "Ahniella"
[3] "Calenema" "Calothrix PCC-6303"
[5] "Candidatus Alysiosphaera" "Candidatus Chloroploca"
[7] "Candidatus Gloeomargarita" "Chloracidobacterium"
[9] "Chloroflexus" "Chthonomonas"
[11] "Cytophaga" "DSSF69"
[13] "Elioraea" "GBChlB"
[15] "Geitlerinema PCC-8501" "Gemmata"
[17] "Leptolyngbya ANT.L52.2" "Meiothermus"
[19] "Pseudomonas" "Roseiflexus"
[21] "Rubritepida" "Sandaracinobacter"
[23] "Thermosynechococcus BP-1" "Turneriella"
[25] "Unassigned A4b (Family)" "Unassigned Acidobacteriae (Class)"
[27] "Unassigned Anaerolineaceae (Family)" "Unassigned Armatimonadota (Phylum)"
[29] "Unassigned Bacteroidia (Class)" "Unassigned Eurycoccales (Order)"
[31] "Unassigned Gemmataceae (Family)" "Unassigned Kapabacteriales (Order)"
[33] "Unassigned Leptolyngbyaceae (Family)" "Unassigned Methylacidiphilaceae (Family)"
[35] "Unassigned Microscillaceae (Family)" "Unassigned mle1-27 (Order)"
[37] "Unassigned Planctomycetota (Phylum)" "Unassigned Pseudanabaenaceae (Family)"
[39] "Unassigned RBG-13-54-9 (Order)" "Unassigned Saprospiraceae (Family)"
[41] "Unassigned Sutterellaceae (Family)" "Unassigned WD2101 soil group (Family)"

$Central.Thailand [1] "Chloracidobacterium" "DSSF69"
[3] "GBChlB" "Pseudomonas"
[5] "Raineya" "Roseiflexus"
[7] "Rubritepida" "Sandaracinobacter"
[9] "Thermoflexibacter" "Turneriella"
[11] "Unassigned A4b (Family)" "Unassigned Armatimonadota (Phylum)"
[13] "Unassigned Bacteria (Kingdom)" "Unassigned Flammeovirgaceae (Family)"
[15] "Unassigned Gemmataceae (Family)" "Unassigned Kapabacteriales (Order)"
[17] "Unassigned Leptolyngbyaceae (Family)" "Unassigned mle1-27 (Order)"
[19] "Unassigned Pseudanabaenaceae (Family)" "Unassigned RBG-13-54-9 (Order)"
[21] "Unassigned Saprospiraceae (Family)"

I would like to know which genera are unique to each region, as well as which genera fall in each of the intersections (that is, which genera are common between the regions).

When I use intersect(x[[2]], x[[3]]), this only provides the genera that are common to 2 regions. It does not work if I need to know which genera are common to all 6 regions.

Also, the extractInfo() function is not recognized, unless I am using it incorrectly. It gives me the following error:

extractInfo(venn.95)
Error in extractInfo(venn.95) : could not find function "extractInfo"

mmahmoudian commented 1 year ago

@chrissy005 The function extractInfo() is in the Git repository and not yet published on CRAN. Therefore, you need to install the venn package from GitHub to get this new function.

But if you want to know the intersection of all 6 sets, you can chain 5 intersect() calls, or you can simply use the following to run intersect() across all the elements:

Reduce(f = "intersect",
       x = core.genus.all.clusters.95,
       accumulate = FALSE)
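
Reduce() folds intersect() pairwise over the list, so the result contains only the elements present in every set. A small sketch on made-up data (the list `sets` and its contents are hypothetical, not taken from this thread):

```r
# Hypothetical toy data, for illustration only
sets <- list(
  A = c("a", "b", "c"),
  B = c("b", "c", "d"),
  C = c("c", "d", "e")
)

# Equivalent to intersect(intersect(sets$A, sets$B), sets$C)
common_to_all <- Reduce(intersect, sets)

# accumulate = TRUE keeps the intermediate results:
# A, then A and B, then A and B and C
steps <- Reduce(intersect, sets, accumulate = TRUE)

print(common_to_all)
print(steps)
```

With accumulate = FALSE (the default) only the final fold is returned, which is why this approach cannot by itself list the elements shared by a subset of the sets.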

dusadrian commented 1 year ago

@chrissy005

It is possible to install the latest development version using this command (in a fresh instance of R):

install.packages("venn", repos = "https://dusadrian.r-universe.dev")

The r-universe website takes about one hour to build from the latest sources, but alternatively you can install the absolute freshest sources from GitHub, in which case you need to install package "remotes", then:

library(remotes)
install_github("dusadrian/venn")

Since your object contains rather long set names (countries), I've also added an argument called "use.names". If not activated, the result will display the set numbers instead of their names. Something like:

library(venn)
extractInfo(core.genus.all.clusters.95, what = "intersections")

will give you all elements from all existing intersections.
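
For readers who cannot install the development version, the idea of listing the elements of every non-empty zone can be approximated in base R. The helper `zone_elements()` below is a hypothetical name and a rough sketch, not the package's implementation of extractInfo(); it labels each element with the combination of sets that contain it and then groups the elements by that label. The toy data are made up.

```r
# Hypothetical helper: group elements by the exact combination of sets containing them
zone_elements <- function(sets) {
  elements <- unique(unlist(sets, use.names = FALSE))
  # Membership matrix: one row per element, one column per set
  member <- do.call(cbind, lapply(sets, function(s) elements %in% s))
  # Label each element with the names of the sets it belongs to, e.g. "A:B"
  label <- apply(member, 1, function(row) paste(colnames(member)[row], collapse = ":"))
  split(elements, label)
}

# Made-up data, for illustration only
sets <- list(
  A = c("a", "b", "c"),
  B = c("b", "c", "d"),
  C = c("c", "d", "e")
)

zones <- zone_elements(sets)
print(zones)
```

Applied to a named list of character vectors such as core.genus.all.clusters.95, a helper like this would return one character vector per non-empty zone, with combinations that contain no elements simply absent from the result.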

chrissy005 commented 1 year ago

@mmahmoudian, thank you for that. I did run something similar to what you suggested: Reduce(intersect, core.genus.all.clusters.95). However, this only gives me the 6 elements common to all 6 sets. I would also like to be able to access the other intersections, those shared by any 2, 3, 4 or 5 of the sets.
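
To see which elements occur in a given number of sets, rather than in all of them, one option is to count how many sets each element appears in, or to intersect every pair of sets. A base-R sketch on made-up data (the names `sets`, `n_sets` and `pairwise` are illustrative, not from the venn package):

```r
# Made-up data, for illustration only
sets <- list(
  A = c("a", "b", "c"),
  B = c("b", "c", "d"),
  C = c("c", "d", "e")
)

# Count, for every element, the number of sets that contain it
# (unique() guards against duplicates within a single set)
n_sets <- table(unlist(lapply(sets, unique), use.names = FALSE))

in_exactly_two  <- names(n_sets)[n_sets == 2]
in_at_least_two <- names(n_sets)[n_sets >= 2]

# Intersections for every pair of sets, named like "A:B"
pairs <- combn(names(sets), 2, simplify = FALSE)
pairwise <- lapply(pairs, function(p) intersect(sets[[p[1]]], sets[[p[2]]]))
names(pairwise) <- vapply(pairs, paste, character(1), collapse = ":")

print(in_exactly_two)
print(pairwise)
```

The pairwise results include elements that also belong to further sets, so they are not the same as the exclusive zones drawn in the diagram.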

dusadrian commented 1 year ago

Please do try the extractInfo() function shown above; it should give you all intersections.

chrissy005 commented 1 year ago

@dusadrian, thank you so much for writing the code for what was requested; it is extremely useful.

This works wonderfully now. I now have all that I need.

The output looks as follows:

extractInfo(core.genus.all.clusters.95, what = "intersections", use.names = TRUE)

$Central.Thailand [1] "Unassigned Flammeovirgaceae (Family)"

$South.Thailand [1] "Unassigned Planctomycetota (Phylum)"

$South.Thailand:Central.Thailand [1] "Pseudomonas" "Unassigned Leptolyngbyaceae (Family)"

$Singapore [1] "Caldimonas" "Defluviicoccus"
[3] "DSSD61" "Fischerella PCC-9339"
[5] "Microvirga" "MTP1"
[7] "SM1A02" "Streptococcus"
[9] "Thermodesulfovibrio" "Unassigned 11-24 (Order)"
[11] "Unassigned Alphaproteobacteria (Class)" "Unassigned BSV26 (Family)"
[13] "Unassigned Chloroflexaceae (Family)" "Unassigned Gammaproteobacteria (Class)"
[15] "Unassigned Gemmatimonadaceae (Family)" "Unassigned Myxococcaceae (Family)"
[17] "Unassigned SJA-15 (Order)" "Unassigned Thermodesulfovibrionia (Class)"
[19] "Unassigned Woesearchaeales (Order)"

$Singapore:South.Thailand [1] "Candidatus Alysiosphaera" "Geitlerinema PCC-8501"
[3] "Thermosynechococcus BP-1" "Unassigned Methylacidiphilaceae (Family)"

$South.Malaysia [1] "Phaselicystis" "Unassigned Flavobacteriales (Order)"
[3] "Unassigned Rhodanobacteraceae (Family)" "Unassigned S-BQ2-57 soil group (Order)"
[5] "Unassigned SJA-28 (Order)" "Unassigned SM1A07 (Order)"

$South.Malaysia:South.Thailand:Central.Thailand [1] "DSSF69"

$South.Malaysia:Singapore [1] "Rivibacter" "Unassigned Cytophagales (Order)"

$South.Malaysia:Singapore:Central.Thailand [1] "Thermoflexibacter"

$South.Malaysia:Singapore:South.Thailand [1] "Cytophaga" "Leptolyngbya ANT.L52.2"
[3] "Unassigned Sutterellaceae (Family)"

$South.Malaysia:Singapore:South.Thailand:Central.Thailand [1] "Turneriella" "Unassigned mle1-27 (Order)"

$North.Malaysia [1] "Candidatus Xiphinematobacter" "Exilispira"
[3] "IheB3-7" "Unassigned Bacteroidetes VC2.1 Bac22 (Order)"
[5] "Unassigned Burkholderiales (Order)" "Unassigned Sva0485 (Phylum)"
[7] "Unassigned vadinHA49 (Class)"

$North.Malaysia:South.Thailand [1] "Unassigned Eurycoccales (Order)"

$North.Malaysia:Singapore [1] "Curvibacter" "Methylothermus"
[3] "Unassigned Acetothermiia (Class)" "Unassigned Paludibaculum (Order)"
[5] "Unassigned Sphingobacteriales (Order)"

$North.Malaysia:Singapore:South.Thailand [1] "Unassigned Microscillaceae (Family)"

$North.Malaysia:South.Malaysia [1] "Rhodovarius" "Unassigned Chitinophagales (Order)"
[3] "Unassigned Planctomycetes (Class)"

$North.Malaysia:South.Malaysia:South.Thailand [1] "Candidatus Gloeomargarita"

$North.Malaysia:South.Malaysia:Singapore [1] "Ignavibacterium" "Telmatocola" "Tepidimonas"

$North.Malaysia:South.Malaysia:Singapore:South.Thailand [1] "Gemmata" "Unassigned Acidobacteriae (Class)"
[3] "Unassigned Anaerolineaceae (Family)"

$North.Malaysia:South.Malaysia:Singapore:South.Thailand:Central.Thailand [1] "Sandaracinobacter" "Unassigned Kapabacteriales (Order)"

$North.Thailand [1] "Desulfomicrobium" "Fimbriiglobus"
[3] "Thermus" "Unassigned Cyanobacteriales (Order)"

$North.Thailand:Singapore [1] "Venenivibrio"

$North.Thailand:South.Malaysia:South.Thailand [1] "AAP99" "Calenema" "Calothrix PCC-6303"

$North.Thailand:North.Malaysia [1] "Unassigned SBR1031 (Order)"

$North.Thailand:North.Malaysia:Singapore [1] "Unassigned Hydrogenophilaceae (Family)"

$North.Thailand:North.Malaysia:Singapore:South.Thailand [1] "Chloroflexus"

$North.Thailand:North.Malaysia:South.Malaysia [1] "Unassigned Phormidiaceae (Family)" "Unassigned RD017 (Order)"

$North.Thailand:North.Malaysia:South.Malaysia:South.Thailand [1] "Chthonomonas"

$North.Thailand:North.Malaysia:South.Malaysia:South.Thailand:Central.Thailand [1] "Chloracidobacterium" "GBChlB"
[3] "Rubritepida" "Unassigned Pseudanabaenaceae (Family)"

$North.Thailand:North.Malaysia:South.Malaysia:Singapore:Central.Thailand [1] "Raineya" "Unassigned Bacteria (Kingdom)"

$North.Thailand:North.Malaysia:South.Malaysia:Singapore:South.Thailand [1] "Ahniella" "Candidatus Chloroploca"
[3] "Elioraea" "Meiothermus"
[5] "Unassigned Bacteroidia (Class)" "Unassigned WD2101 soil group (Family)"

$North.Thailand:North.Malaysia:South.Malaysia:Singapore:South.Thailand:Central.Thailand [1] "Roseiflexus" "Unassigned A4b (Family)"
[3] "Unassigned Armatimonadota (Phylum)" "Unassigned Gemmataceae (Family)"
[5] "Unassigned RBG-13-54-9 (Order)" "Unassigned Saprospiraceae (Family)"

dusadrian commented 1 year ago

Glad it helps, it means this issue can be closed now.