noriakis / ggkegg

Analyzing and visualizing KEGG information using the grammar of graphics
https://noriakis.github.io/software/ggkegg
MIT License
210 stars 15 forks source link

Improving Resolution #17

Closed Ramirj closed 8 months ago

Ramirj commented 8 months ago

Hi @noriakis,

Is there any way to improve the resolution of a graph made using the overlay_raw_map function? I made a pathway diagram using some of your code from the documentation, but the resolution is not high enough to be published. Thank you for your help.

noriakis commented 8 months ago

Hi @Ramirj,

Thanks for raising this important point. What code did you use to save the plot?

The plot can be saved via ggsave the same as the formal ggplot object. However, as we use annotation_raster for overlay_raw_map, the resolution would be unclear in some environments.

Also, if you would prefer the sharp image, you could disable interpolation of annotation_raster by interpolate=FALSE in overlay_raw_map (please reinstall the devel branch using devtools::install_github).

I will try to implement some functions to save the plot by using the dimensions of the raw image anyway (ggkeggsave, presumably).

noriakis commented 8 months ago

Hi, ggkeggsave is implemented in the devel branch of the repository. Please try the function and see if it helps.

Ramirj commented 8 months ago

Hi @noriakis, I'm not sure if I implemented your code into my code correctly, but the resolution hasn't seemed to improved still. Also, for some reason it creates two plots instead of one.

Load libraries

library(ggkegg) library(BiocFileCache) library(viridis) library(magick) # Added magick library

Define overlay_raw_map function

overlay_raw_map <- function(pid = NULL, directory = NULL, transparent_colors = c("#FFFFFF", "#BFBFFF", "#BFFFBF", "#7F7F7F", "#808080"), adjust = TRUE, adjust_manual_x = NULL, adjust_manual_y = NULL, clip = FALSE, use_cache = TRUE, interpolate = TRUE) { structure(list(pid = pid, transparent_colors = transparent_colors, adjust = adjust, clip = clip, adjust_manual_x = adjust_manual_x, adjust_manual_y = adjust_manual_y, directory = directory, use_cache = use_cache, interpolate = interpolate), class = "overlay_raw_map") }

Define ggplot_add_overlay_raw_map function

ggplot_add_overlay_raw_map <- function(object, plot, object_name) { if (is.null(object$pid)) { infer <- plot$data$pathway_id |> unique() object$pid <- infer[!is.na(infer)] } if (!grepl("[[:digit:]]", object$pid)) { warning("Looks like not KEGG ID for pathway") return(1) }

Construct the image URL, download, and cache

url <- paste0(as.character(pathway(object$pid, use_cache = object$use_cache, directory = object$directory, return_image = TRUE)))

if (object$use_cache) { bfc <- BiocFileCache() path <- bfcrpath(bfc, url)
} else { path <- paste0(object$pid, ".png") if (!is.null(object$directory)) { path <- paste0(object$directory, "/", path) } download.file(url = url, destfile = path, mode = 'wb') }

Load, transparent, and rasterize the image

magick_image <- image_read(path) img_info <- image_info(magick_image) w <- img_info$width h <- img_info$height

for (col in object$transparent_colors) { magick_image <- magick_image |> image_transparent(col) }

ras <- as.raster(magick_image)

xmin <- 0 xmax <- w ymin <- -1 * h ymax <- 0

if (object$clip) { ras <- ras[seq_len(nrow(ras) - 1), seq_len(ncol(ras) - 1)] }

if (!is.null(object$adjust_manual_x)) { object$adjust <- FALSE xmin <- xmin + object$adjust_manual_x xmax <- xmax + object$adjust_manual_x }

if (!is.null(object$adjust_manual_y)) { object$adjust <- FALSE ymin <- ymin + object$adjust_manual_y ymax <- ymax + object$adjust_manual_y }

if (object$adjust) { xmin <- xmin - 0.5 xmax <- xmax - 0.5 }

p <- plot + annotation_raster(ras, xmin = xmin, ymin = ymin, xmax = xmax, ymax = ymax, interpolate = object$interpolate) + coord_fixed(xlim = c(xmin, xmax), ylim = c(ymin, ymax))

attr(p, "original_width") <- w attr(p, "original_height") <- h return(p) }

Fetch and cache RN to EC map

url <- "https://rest.kegg.jp/link/reaction/ec" bfc <- BiocFileCache() path <- bfcrpath(bfc, url) convert <- data.frame(data.table::fread(path, header = FALSE, sep = "\t")) rntoec <- convert$V1 |> strsplit(":") |> vapply("[", 2, FUN.VALUE = "a") |> setNames(convert$V2)

Define the pathway ID

pathway_id <- "dosa00010" # Change depending on the pathway you want to map

Create a graph of the pathway

g <- pathway(pathway_id) |> mutate(ec = rntoec[reaction])

Define a vector of node names to highlight

nodes_to_highlight <- reaction

Add log fold changes (LFCs) to the graph

LFC_vector <- LFCs # Make sure LFCs is a vector with values for each corresponding reaction number

Ensure that LFC column is created in the graph

g <- g %>% mutate(LFC = ifelse(reaction %in% nodes_to_highlight, LFC_vector, NA))

Create a pathway diagram with gradient colors to visualize LFCs

gg <- g |> filter(type %in% c("compound", "gene")) |> ggraph(layout = "manual", x = x, y = y) + geom_node_rect(aes(fill = LFC), size = 5) + scale_fill_gradient2(low = "red", high = "green", breaks = c(-8, -4, -1, 1, 4, 8), limits = c(-8, 8))

Add the overlay_raw_map and ggplot_add_overlay_raw_map to the pipeline

gg <- gg + overlay_raw_map() + ggplot_add_overlay_raw_map(object = overlay_raw_map(), plot = gg, object_name = "overlay_raw_map") + theme_minimal() + theme_void() + ggtitle("Induced (Green) and Repressed (Red) Genes in Glycolysis Pathway - 8 hr Timepoint")

gg

ggkeggsave <- function(filename, plot, dpi=300, wscale=90, hscale=90) { ggsave(filename, plot, dpi=dpi, width=attr(plot, "original_width")/wscale, height=attr(plot, "original_height")/hscale, units="in") }

ggkeggsave("pathway_plot.png", gg)

Screenshot 2024-02-01 at 6 47 31 PM
noriakis commented 8 months ago

Hi, could you reinstall the library on the devel branch of the repository, and try again? Also, lowering the wscale and hscale parameters could help, but the resolution would not be better than the original PNG from the KEGG. KEGG provides the 2x resolution image per pathway, so supporting this will be an important option (I'll be working on this!).

noriakis commented 8 months ago

Hi @Ramirj, for this issue, if you have still problems, please consider checking #19.

Ramirj commented 8 months ago

Hi @noriakis,

The quality looks great in the examples you provided. I tried to implement the "output_overlay_image" code you provided to my own code and received this error message:

Error in base::nchar(wide_chars$test, type = "width") : lazy-load database '/Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library/cli/R/sysdata.rdb' is corrupt In addition: Warning messages: 1: In base::nchar(wide_chars$test, type = "width") : restarting interrupted promise evaluation 2: In base::nchar(wide_chars$test, type = "width") : internal error -3 in R_decompress1

Do you know what could be causing this error? Here is my code:

######### ###########

Load libraries

library(ggkegg) library(BiocFileCache) library(viridis)

Fetch and cache RN to EC map

url <- "https://rest.kegg.jp/link/reaction/ec" bfc <- BiocFileCache() path <- bfcrpath(bfc, url) convert <- data.frame(data.table::fread(path, header = FALSE, sep = "\t")) rntoec <- convert$V1 |> strsplit(":") |> vapply("[", 2, FUN.VALUE = "a") |> setNames(convert$V2)

Define the pathway ID

pathway_id <- "dosa00270" # Change depending on the pathway you want to map

Create a graph of the pathway

g <- pathway(pathway_id) |> mutate(ec = rntoec[reaction])

Define a vector of node names to highlight

nodes_to_highlight <- reaction

Add log fold changes (LFCs) to the graph

LFC_vector <- LFCs # Make sure LFCs is a vector with values for each corresponding reaction number

Ensure that LFC column is created in the graph

g <- g %>% mutate(LFC = ifelse(reaction %in% nodes_to_highlight, LFC_vector, NA))

Create a pathway diagram with gradient colors to visualize LFCs

gg <- g |> filter(type %in% c("compound", "gene")) |> ggraph(g, layout = "manual", x = x, y = y) + geom_node_rect(aes(fill = LFC), size = 5) + geom_node_rect(aes(fill1=x, xmin=xmin2, xmax=xmax2, ymin=ymin2, ymax=ymax2, filter=type=="gene"))+ geom_node_rect(aes(fill2=y, xmin=xmin2+width, xmax=xmax2, ymin=ymin2, ymax=ymax2, filter=type=="gene"))+ scale_fill_gradient2(low = "red", high = "green", breaks = c(-8, -4, -1, 1, 4, 8), limits = c(-8, 8)) + theme_minimal() + theme_void() + ggtitle("Induced (Green) and Repressed (Red) Genes in Cysteine and Methionine Met Pathway - 8 hr Timepoint")

output_overlay_image(gg, high_res=TRUE, use_cache=TRUE, with_legend_image=TRUE, res=100, legend_space=100, out="test4.png")

> [1] "test4.png"

knitr::include_graphics("test4.png")

gg

############ #########

Thank you for taking the time to add this feature by the way.

Ramirj commented 8 months ago

I was able to fix what was causing the error message, but when I ran the code I got this result:

Screenshot 2024-02-05 at 12 08 35 PM