noriakis / ggkegg

Analyzing and visualizing KEGG information using the grammar of graphics
https://noriakis.github.io/software/ggkegg
MIT License

highlight_entities function may cause elements to be omitted and element misalignment #33

Closed BioLaoXu closed 3 months ago

BioLaoXu commented 4 months ago

Dear ggkegg team, thank you for ggkegg, which provides a more flexible scheme for visualizing KEGG pathways. While using it I ran into some minor problems, which may be caused by my own misunderstanding.

image

hsn=c("K01568","D18Wsu181e")
g <- pathway("mmu00010",group_rect_nudge=0)
g <- g |> mutate(gene1=highlight_set_nodes(hsn,sep=" ",how="any",name="name"),
                 gene2=highlight_set_nodes(hsn,sep=", ",how="any",name="graphics_name"))
gg=ggraph(g, layout="manual", x=x, y=y)
gg+
  overlay_raw_map()+
  geom_node_rect(fill="red",aes(filter=gene1), color="black")+
  geom_node_text(aes(
    label=graphics_name %>% strsplit(",") %>% sapply("[", 1) %>% strsplit("\\.") %>% sapply("[", 1),
    filter=gene1),size=2)+
  geom_node_rect(fill="red",aes(filter=gene2), color="black")+
  geom_node_text(aes(
    label=graphics_name %>% strsplit(",") %>% sapply("[", 1) %>% strsplit("\\.") %>% sapply("[", 1),
    filter=gene2),size=2)+
  theme_void()

#ggkegg::highlight_entities(pathway = "mmu00010",set = hsn,how = "any",name ="graphics_name" ) ## does not work

image

In this example, the D18Wsu181e gene will be omitted.

pid="mmu00010"
showText="showtext"
colorBy="lfc"

# highlight_entities_dt is my own table (not shown here); it is joined by "symbol"
# and provides the showText and colorBy columns
g <- pathway(pid,group_rect_nudge=0,directory = getBFCOption("CACHE"))
gg=ggraph(g, layout="manual", x=x, y=y)
gt=gg$data

tmp=gt%>%mutate(idx = row_number())
# one row per symbol, with the trailing "..." stripped
gt1=tmp%>%separate_rows(.,graphics_name, convert = TRUE,sep = ", ")%>%
  mutate(symbol=gsub(" ","",graphics_name)%>%gsub("\\.\\.\\.$","",.))
gt2=tmp%>%separate_rows(.,name, convert = TRUE,sep = " ")%>%
  mutate(symbol=gsub(" ","",name)%>%gsub("\\.\\.\\.$","",.))
gt1=left_join(gt1,highlight_entities_dt,by="symbol")%>%
  mutate(sn=get(showText)%>%lapply(.,function(x){
    if(x%in%highlight_entities_dt[[showText]]){
      return(x)
    }
    return(NA)
  })%>%unlist())%>%
  .[with(., order(idx,sn,symbol,decreasing = F)),]
gt2=left_join(gt2,highlight_entities_dt,by="symbol")%>%
  mutate(sn=get(showText)%>%lapply(.,function(x){
    if(x%in%highlight_entities_dt[[showText]]){
      return(x)
    }
    return(NA)
  })%>%unlist())%>%
  .[with(., order(idx,sn,symbol,decreasing = F)),]
# keep one row per original node, then restore the original name columns
gt12=rbind(gt1,gt2)%>%
  distinct(.,idx,.keep_all = T)
gt12$name=gt$name
gt12$graphics_name=gt$graphics_name
gt=gt12

g=pathway(pid,group_rect_nudge=0,directory = getBFCOption("CACHE")) %>%
  mutate(sn = gt[[showText]],!!colorBy:=gt[[colorBy]])

Method 1, more tidy graph:

ggraph(g, layout = "manual",x=x,y=y)+
  overlay_raw_map(high_res=F, transparent_color="#FFFFFF")+
  geom_node_rect(aes(filter=!is.na(sn),fill=.data[[colorBy]]), color="black")+
  scale_fill_gradientn(name=colorBy,
    colours = scales::alpha(c("#3288BD","#D53E4F"),alpha = .99),
    space = "lab",
    breaks=ceiling(seq(min(gt[[colorBy]],na.rm = T),max(gt[[colorBy]],na.rm = T),
      (max(gt[[colorBy]],na.rm = T)-min(gt[[colorBy]],na.rm = T))/4)),
    guide = guide_colorbar(order = 3))+
  geom_node_text(aes(filter=!is.na(sn),label=sn), size=2)+
  theme_void()

Method 2:

ggraph(g, layout =gt,x=x,y=y)+
  overlay_raw_map(high_res=F, transparent_color="#FFFFFF")+
  geom_node_rect(aes(filter=!is.na(sn),fill=.data[[colorBy]]), color="black")+
  scale_fill_gradientn(name=colorBy,
    colours = scales::alpha(c("#3288BD","#D53E4F"),alpha = .99),
    space = "lab",
    breaks=ceiling(seq(min(gt[[colorBy]],na.rm = T),max(gt[[colorBy]],na.rm = T),
      (max(gt[[colorBy]],na.rm = T)-min(gt[[colorBy]],na.rm = T))/4)),
    guide = guide_colorbar(order = 3))+
  geom_node_text(aes(filter=!is.na(sn),label=sn), size=2)+
  theme_void()

![image](https://github.com/noriakis/ggkegg/assets/13416926/0a1d238d-a38e-4b60-9dc5-18bcbbbb602c)

With the exception of the misalignment of the elements, the results were as expected

sessionInfo()
R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] BiocFileCache_2.4.0 dbplyr_2.2.1 tidyr_1.3.0 ggkegg_1.3.1 XML_3.99-0.14 ggraph_2.1.0
[7] ggplot2_3.4.2 igraph_2.0.3 dplyr_1.1.2 tidygraph_1.2.2



I look forward to hearing from you. Thanks.

noriakis commented 4 months ago

Thank you very much for raising this important point. For point 1, I have implemented the remove_dot option in highlight_entities and highlight_set_nodes now and it should handle the gene names with dots after them (devel and main branches). I will go through point 2 next but thanks again for using the package.
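
To illustrate the trailing-dot case outside the package, here is a minimal base R sketch. It is not the ggkegg implementation: the helper name `match_any`, its arguments, and the example labels are hypothetical, and it assumes that long node labels end in "..." as in the code above.

```r
# Hypothetical node labels, modelled on a comma-separated graphics_name column.
# Assumption: truncated labels end in a literal "...".
graphics_name <- c("Hk1, Hk2, Hk3...", "D18Wsu181e...", "Gpi1")
hsn <- c("K01568", "D18Wsu181e")

# Hypothetical helper: for each label, TRUE if any of its symbols is in `query`.
# With remove_dot = TRUE, a trailing "..." is stripped from each symbol first.
match_any <- function(labels, query, sep = ", ", remove_dot = TRUE) {
  vapply(strsplit(labels, sep, fixed = TRUE), function(symbols) {
    if (remove_dot) symbols <- sub("\\.\\.\\.$", "", symbols)
    any(symbols %in% query)
  }, logical(1))
}

without_strip <- match_any(graphics_name, hsn, remove_dot = FALSE)
with_strip <- match_any(graphics_name, hsn, remove_dot = TRUE)
print(without_strip)
print(with_strip)
```

Without stripping, "D18Wsu181e..." is compared literally and never equals "D18Wsu181e", which is consistent with the node being omitted in the report above.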

BioLaoXu commented 3 months ago

I'm sorry for the late reply. I found that the code does not show the element misalignment in an R 4.4.1 environment (ggplot2 3.5.1, Windows 10), but the same code does show it in an R 4.2.1 environment (ggplot2 3.5.1, Ubuntu). Maybe it really is an R version issue.

So I suggest using a newer version of R (>= 4.3.0), and I'll close this discussion.
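
A quick way to check a session against that suggestion is sketched below. The helper name `meets_min_r` is hypothetical, and the 4.3.0 threshold is only the version suggested in this thread, not a documented requirement of ggkegg.

```r
# Hypothetical helper: TRUE if an R version string is at least `minimum`.
# Defaults to the version of the running session.
meets_min_r <- function(version = as.character(getRversion()), minimum = "4.3.0") {
  package_version(version) >= package_version(minimum)
}

old_ok <- meets_min_r("4.2.1")  # the Ubuntu environment reported above
new_ok <- meets_min_r("4.4.1")  # the Windows environment reported above

# Report on the current session without stopping the script.
if (!meets_min_r()) {
  message("R ", getRversion(), " is older than 4.3.0; elements may be misaligned.")
}
```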

noriakis commented 3 months ago

Thanks for the information!