gaospecial / ggVennDiagram

A 'ggplot2' implement of Venn Diagram.
https://gaospecial.github.io/ggVennDiagram/
GNU General Public License v3.0

Scale region labels size with count #53

Closed MartinWitt closed 7 months ago

MartinWitt commented 1 year ago

Hey, is it possible to scale the label size of each region with its count? My current code is below, followed by the plot it produces:

library(rjson)
library(ggVennDiagram)
library(ggplot2)
library(dplyr)
# Read the JSON files from ./resultCalc; each subfolder is a project
# 1. Read the JSON files
files <- list.files(path = "sbom2023_plot/resultCalc", full.names = TRUE)
for (project in files) {
  venn_data <- list() # Initialize an empty list to store Venn diagram data
  producers <- list.files(path = project, pattern = "\\.json$", full.names = TRUE)
  producer_genes <- list() # Initialize an empty list to store genes for each producer

  for (producer in producers) {
    content <- fromJSON(file = producer)
    # get the array with the key "truePositive"
    truePositives <- content$truePositive
    # convert each JSON object to a string $groupId:$artifactId:$version
    list_of_strings <- list()
    for (truePositive in truePositives) {
      # get the groupID
      groupID <- truePositive$groupId
      # get the artifactID
      artifactID <- truePositive$artifactId
      # get the version
      version <- truePositive$version
      # concatenate the strings
      string <- paste(groupID, artifactID, version, sep = ":")
      # add the string to the list
      list_of_strings <- append(list_of_strings, string)
    }
    producer_genes[[tools::file_path_sans_ext(basename(producer))]] <- unlist(list_of_strings)
  }

  # Generate the Venn diagram
  venn <- Venn(producer_genes)
  plot <- ggVennDiagram(producer_genes, label = "count", edge_size = 2) +
    scale_color_brewer(palette = "Paired") +
    theme(
      plot.background = element_rect(fill = "white"),
    )
  data <- process_data(venn)
  # Save the Venn diagram with high dpi
  ggsave(
    filename = paste("./venns", paste(basename(project), "pdf", sep = "."), sep = "/"),
    plot = plot,
    width = 13, height = 13, units = "in",
    dpi = 1200
  )
}
[image: the resulting Venn diagram]

I also took a look at #50, but couldn't work out how to apply it to my diagram. I am a complete R beginner, so any help is appreciated.

gaospecial commented 1 year ago

Building on the example in #50, we can do this with a few extra settings:

genes <- paste0("gene",1:20)
set.seed(20230406)
gene_list <- list(A = sample(genes,5),
                  B = sample(genes,5),
                  C = sample(genes,5),
                  D = sample(genes,5),
                  E = sample(genes,5),
                  F = sample(genes,5),
                  G = sample(genes,5))

library(ggVennDiagram)
library(ggplot2)
library(dplyr)
venn <- Venn(gene_list)
data <- process_data(venn)
ggplot() +
  # 1. region count layer
  geom_sf(aes(fill = count), data = venn_region(data)) +
  # 2. set edge layer
  geom_sf(aes(color = id), data = venn_setedge(data), show.legend = FALSE) +
  # 3. set label layer
  geom_sf_text(aes(label = name), data = venn_setlabel(data)) +
  # 4. region label layer
  geom_sf_label(aes(label = count), data = venn_region(data) %>% filter(count != 0), alpha = 0.5) +
  theme_void()

The region labels are drawn by geom_sf_label(). To map label size to the number of items in each region, add size = count to that layer's aes() call, i.e. aes(label = count, size = count). Then control the range of label sizes with a scale_size_* function such as scale_size_continuous().
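A minimal sketch of that change, assuming the sf-based helpers used in the example above (venn_region(), venn_setedge(), venn_setlabel()); newer releases of ggVennDiagram may expose different helpers. The three-set gene_list, the name region_labels, and the range = c(3, 10) values are illustrative choices, not something the package requires.

```r
library(ggVennDiagram)
library(ggplot2)
library(dplyr)

# Illustrative input: three small random gene sets
genes <- paste0("gene", 1:20)
set.seed(20230406)
gene_list <- list(A = sample(genes, 5),
                  B = sample(genes, 5),
                  C = sample(genes, 5))

venn <- Venn(gene_list)
data <- process_data(venn)

# Keep only non-empty regions so that empty regions get no label
region_labels <- venn_region(data) %>% filter(count != 0)

p <- ggplot() +
  # region fill, set edges, and set labels as in the example above
  geom_sf(aes(fill = count), data = venn_region(data)) +
  geom_sf(aes(color = id), data = venn_setedge(data), show.legend = FALSE) +
  geom_sf_text(aes(label = name), data = venn_setlabel(data)) +
  # region labels: size is now mapped to count
  geom_sf_label(aes(label = count, size = count), data = region_labels,
                alpha = 0.5, show.legend = FALSE) +
  # range = c(min, max) label size; the values here are an assumption to tune
  scale_size_continuous(range = c(3, 10)) +
  theme_void()

p
```

For the diagram in the question, pass producer_genes to Venn() instead of gene_list. The range argument sets the smallest and largest label size, so raise the lower bound if labels for small counts become hard to read.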

Give it a try and let me know how it goes.

gaospecial commented 1 year ago

By the way, exporting the figure with export::graph2ppt() is also an option: you can then edit individual items in PowerPoint much more easily, without needing less common ggplot2 functions.
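A hedged sketch of that export route, assuming the CRAN package export (which provides graph2ppt()) is installed. The stand-in plot p and the output name venn_plot are hypothetical; in practice p would be the Venn diagram object.

```r
library(ggplot2)

# Stand-in plot; replace with the Venn diagram object
p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

# Write the plot to a PowerPoint slide as editable vector graphics.
# The base name "venn_plot" is hypothetical; graph2ppt() is expected to
# append the .pptx extension itself.
out_base <- file.path(tempdir(), "venn_plot")
export::graph2ppt(x = p, file = out_base, width = 7, height = 7)
```

Once the slide is open in PowerPoint, the labels can be ungrouped and resized by hand.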