thomasp85 / ggforce

Accelerating ggplot2
https://ggforce.data-imaginist.com

geom_mark_hull error when one group contains two or less points #320

Open PeSteff opened 6 months ago

PeSteff commented 6 months ago

Hello,

EDIT: The first comment contains a better reproducible example.

I am using ggplot2 together with ggforce (geom_mark_hull) and ggrepel (geom_text_repel) to plot the results of my NMDS analysis (vegan). 1) With one grouping variable I get a nice plot. 2) When I base the hulls on a different variable, the plot fails with an error that is cryptic to me. I have checked the class of both variables and it is the same (character). 3) I played around with many options of geom_mark_hull, but nothing helped. I created a small subset of my data (provided below) and still get the same error. 4) I also created mock data and tried to reproduce the error, but I could not; the code works well there. So I could not find out what is wrong with my data or what to do to resolve the error.

1) nice plot

df <- structure(list(depth_ges = c(17, 17, 32, 32, 50, 50, 67, 67, 
83, 83, 98, 98, 112, 112, 126, 126, 139, 139), lith_unit = c(1, 
1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3), NMDS1 = c(-2.42, 
-2, -1.97, -2.01, -1.66, -1.7, -1.51, -1.44, -1.21, -1.19, -1.08, 
-0.84, -0.7, -0.71, -0.81, -0.84, -0.67, -0.59), NMDS2 = c(0, 
0.67, 0.42, 0.07, 0.28, 0.5, 0.14, 0.62, 0.15, 0.47, -0.27, 0.14, 
-0.74, -0.54, -0.26, -0.11, -0.31, -0.03), cl_nr = c("surface", 
"surface", "surface", "surface", "surface", "surface", "surface", 
"surface", "surface", "surface", "subsurface", "subsurface", 
"subsurface", "subsurface", "subsurface", "subsurface", "subsurface", 
"subsurface"), lit = c("unit_1", "unit_1", "unit_2", "unit_2", 
"unit_2", "unit_2", "unit_2", "unit_2", "unit_2", "unit_2", "unit_2", 
"unit_2", "unit_2", "unit_2", "unit_3", "unit_3", "unit_3", "unit_3"
), lit_rev = c("unit_3", "unit_3", "unit_2", "unit_2", "unit_2", 
"unit_2", "unit_2", "unit_2", "unit_2", "unit_2", "unit_2", "unit_2", 
"unit_2", "unit_2", "unit_1", "unit_1", "unit_1", "unit_1")), class = "data.frame", row.names = c(NA, 
18L))

str(df)
'data.frame':   18 obs. of  7 variables:
 $ depth_ges: num  17 17 32 32 50 50 67 67 83 83 ...
 $ lith_unit: num  1 1 2 2 2 2 2 2 2 2 ...
 $ NMDS1    : num  -2.42 -2 -1.97 -2.01 -1.66 -1.7 -1.51 -1.44 -1.21 -1.19 ...
 $ NMDS2    : num  0 0.67 0.42 0.07 0.28 0.5 0.14 0.62 0.15 0.47 ...
 $ cl_nr    : chr  "surface" "surface" "surface" "surface" ...
 $ lit      : chr  "unit_1" "unit_1" "unit_2" "unit_2" ...
 $ lit_rev  : chr  "unit_3" "unit_3" "unit_2" "unit_2" ...

R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

library(ggplot2) # 3.4.4
library(ggforce) # 0.4.2
library(ggrepel) # 0.9.5

cl_nr_plot <- ggplot(data = df, aes(x = NMDS1, y = NMDS2)) + #
  geom_point(data = df, aes(x = NMDS1, y = NMDS2, color = depth_ges), size = 1) + #
  ggtitle("NMDS ordination based on bray-curtis distance.")+
  scale_colour_gradientn(colours = c("chartreuse3", "darkgoldenrod1", "brown2", "blue")) +
  geom_text_repel(data = df,aes(x = NMDS1, y = NMDS2, label = depth_ges, color = depth_ges), size = 3) + 
  geom_mark_hull(data = df, aes(x = NMDS1, y = NMDS2, fill = cl_nr, label = cl_nr), label.fontsize = 8, 
                 concavity = 4, expand = unit(2.5, "mm"), con.cap = 0, inherit.aes = FALSE) + 
  theme_bw() +
  theme(title = element_text(size = 8)) +
  guides(color = guide_colorbar(title = "depth [cm]"))
cl_nr_plot

2) different variable used for the hulls

lit_plot <- ggplot(data = df, aes(x = NMDS1, y = NMDS2)) + #
  geom_point(data = df, aes(x = NMDS1, y = NMDS2, color = depth_ges), size = 1) + #
  ggtitle("NMDS ordination based on bray-curtis distance.")+
  scale_colour_gradientn(colours = c("chartreuse3", "darkgoldenrod1", "brown2", "blue")) +
  geom_text_repel(data = df,aes(x = NMDS1, y = NMDS2, label = depth_ges, color = depth_ges), size = 3) + 
  geom_mark_hull(data = df, aes(x = NMDS1, y = NMDS2, fill = lit, label = lit), label.fontsize = 8, 
                 concavity = 4, expand = unit(2.5, "mm"), con.cap = 0, inherit.aes = FALSE) + 
  theme_bw() +
  theme(title = element_text(size = 8)) +
  guides(color = guide_colorbar(title = "depth [cm]"))
lit_plot

When I run the code above, the error appears as soon as the last line prints the plot: "Error in anchors[[i]] : subscript out of bounds"

3) playing around

The only way I found to get a plot at all (and without any error message) was to set label = NULL in the aes of geom_mark_hull. But this produced a plot in which only the first of the three hulls was drawn. When I reversed the order of my groups, all hulls but the first were drawn.

lit_plot <- ggplot(data = df, aes(x = NMDS1, y = NMDS2)) + #
  geom_point(data = df, aes(x = NMDS1, y = NMDS2, color = depth_ges), size = 1) + #
  ggtitle("NMDS ordination based on bray-curtis distance.")+
  scale_colour_gradientn(colours = c("chartreuse3", "darkgoldenrod1", "brown2", "blue")) +
  geom_text_repel(data = df,aes(x = NMDS1, y = NMDS2, label = depth_ges, color = depth_ges), size = 3) + 
  geom_mark_hull(data = df, aes(x = NMDS1, y = NMDS2, fill = lit, label = NULL), label.fontsize = 8,  # changed: label = NULL
                 concavity = 4, expand = unit(2.5, "mm"), con.cap = 0, inherit.aes = FALSE) + 
  theme_bw() +
  theme(title = element_text(size = 8)) +
  guides(color = guide_colorbar(title = "depth [cm]"))
lit_plot

lit_plot <- ggplot(data = df, aes(x = NMDS1, y = NMDS2)) + #
  geom_point(data = df, aes(x = NMDS1, y = NMDS2, color = depth_ges), size = 1) + #
  ggtitle("NMDS ordination based on bray-curtis distance.")+
  scale_colour_gradientn(colours = c("chartreuse3", "darkgoldenrod1", "brown2", "blue")) +
  geom_text_repel(data = df,aes(x = NMDS1, y = NMDS2, label = depth_ges, color = depth_ges), size = 3) + 
  geom_mark_hull(data = df, aes(x = NMDS1, y = NMDS2, fill = lit_rev, label = NULL), label.fontsize = 8,  # changed: fill = lit_rev
                 concavity = 4, expand = unit(2.5, "mm"), con.cap = 0, inherit.aes = FALSE) + 
  theme_bw() +
  theme(title = element_text(size = 8)) +
  guides(color = guide_colorbar(title = "depth [cm]"))
lit_plot

4) creating mock data and test

gr <- c("g1","g2","g3","g4","g5")
mock_dat <- data.frame(NMDS1 = sort(rep(1:5,3)) - rep(c(0,0.2,0.3),5),
                       NMDS2 = c(2,2,2,1,1,1,1,1,1,2,2,2,2,2,2) - c(rep(c(-0.1,0,0.2),2),rep(c(0,-0.25,0.1),3)),
                       group5 = sort(rep(gr,3)),
                       group3 = sort(rep(gr[1:3],5)),
                       depth = c(0.5,rep(seq(from = 0.2, by = 0.1, length.out = 7),2)))

mock_plot <- ggplot(data = mock_dat, aes(x = NMDS1, y = NMDS2)) + #
  geom_point(data = mock_dat, aes(x = NMDS1, y = NMDS2, color = depth), size = 1) + #
  ggtitle("MOCK data plot")+
  scale_colour_gradientn(colours = c("chartreuse3", "darkgoldenrod1", "brown2", "blue")) +
  geom_text_repel(data = mock_dat,aes(x = NMDS1, y = NMDS2, label = depth, color = depth), size = 3) + 
  geom_mark_hull(data = mock_dat, aes(x = NMDS1, y = NMDS2, fill = group5, label = group5), label.fontsize = 8, 
                 concavity = 4, expand = unit(2.5, "mm"), con.cap = 0, inherit.aes = FALSE) + 
  theme_bw() +
  theme(title = element_text(size = 8)) +
  guides(color = guide_colorbar(title = "depth [cm]"))
mock_plot

Recreating such a plot with made-up data worked fine as well. I am puzzled about what this error message means. I tried to make sense of it by looking at the source file "mark_hull.R" here on GitHub, but this is beyond my level of understanding and I could not find what the error message points to.

I would appreciate any help to figure this out.

PeSteff commented 6 months ago

After playing with the data some more, I think I have narrowed down the problem and can reproduce it with dummy data. Whenever a group contains two or fewer data points, the error (Error in anchors[[i]] : subscript out of bounds) occurs. I think it has something to do with label placement, because without labels a hull was drawn around the group with two points. The graph still remains incomplete, though, as the code below demonstrates.
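A quick way to check for this condition is to count the points per group. The sketch below uses only base R. The two vectors reproduce the lit and cl_nr columns of the data frame from the first post, and min_points is a name introduced here for illustration; the threshold of 3 is an assumption based on the observation above, not something taken from the ggforce source.

```r
# Group labels reproduced from the data frame `df` in the first post
lit   <- c(rep("unit_1", 2), rep("unit_2", 12), rep("unit_3", 4))
cl_nr <- c(rep("surface", 10), rep("subsurface", 8))

# Assumed threshold: groups with two or fewer points triggered the error
min_points <- 3

# Number of points in each group
lit_sizes   <- table(lit)
cl_nr_sizes <- table(cl_nr)

# Names of the groups that fall below the threshold
small_lit   <- names(lit_sizes)[lit_sizes < min_points]
small_cl_nr <- names(cl_nr_sizes)[cl_nr_sizes < min_points]

print(lit_sizes)
print(small_lit)    # "unit_1"
print(small_cl_nr)  # character(0)
```

This is consistent with the behaviour in the first post: cl_nr has no group below the threshold and plots fine, while lit contains a group of only two points (unit_1).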

Is there a way to fix this or work around it? I do not need the labels in the plot, as a nice legend is produced either way. But I have not found a way to simply switch off the labelling and still have all the hulls drawn.

# dummy data
gr <- c("A","B","C","D","E")
mock_dat <- data.frame(NMDS1 = sort(rep(1:5,3)) - rep(c(0,0.2,0.3),5),
                       NMDS2 = c(2,2,2,1,1,1,1,1,1,2,2,2,2,2,2) - c(rep(c(-0.1,0,0.2),2),rep(c(0,-0.25,0.1),3)),
                       group3 = sort(rep(gr[1:3],5)),
                       group_with2 = c(rep("A",2),rep("B",9),rep("C",4)),
                       group_with2_rev = c(rep("A",4),rep("B",9),rep("C",2)))

# trying to create a plot with hulls when one group contains only 2 data points
p1<-ggplot(data=mock_dat,aes(x=NMDS1,y=NMDS2))+
  geom_point()+
  geom_mark_hull(aes(fill=group_with2, label=group_with2))
p1

# setting label = NULL at least draws the first hull, with the group of two
p2<-ggplot(data=mock_dat,aes(x=NMDS1,y=NMDS2))+
  geom_point()+
  geom_mark_hull(aes(fill=group_with2, label=NULL))
p2

# when I reverse the order of the groups, so that the group with 2 points is last (C in A, B, C), all other hulls are drawn
p3 <-ggplot(data=mock_dat,aes(x=NMDS1,y=NMDS2))+
  geom_point()+
  geom_mark_hull(aes(fill=group_with2_rev, label=NULL))
p3

# simply removing the "label" argument doesn't fix the problem as still not all hulls are created
p4 <-ggplot(data=mock_dat,aes(x=NMDS1,y=NMDS2))+
  geom_point()+
  geom_mark_hull(aes(fill=group_with2))
p4

# the plot is produced when all groups contain at least 3 data points
p5 <-ggplot(data=mock_dat,aes(x=NMDS1,y=NMDS2))+
  geom_point()+
  geom_mark_hull(aes(fill=group3, label=group3))
p5
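
One possible workaround, given that the plot is produced when every group has at least 3 points, is to pass geom_mark_hull only the rows that belong to sufficiently large groups, while still plotting all points. This is a sketch under that assumption, not a fix of the underlying error: keep_large_groups is a hypothetical helper (not part of ggforce), the coordinates in the small data frame are made up, and the group with too few points simply gets no hull.

```r
# Hypothetical helper (not part of ggforce): keep only the rows whose
# group contains at least `min_n` points.
keep_large_groups <- function(data, group_col, min_n = 3) {
  # ave() returns, for every row, the size of the group that row belongs to
  n_in_group <- ave(seq_len(nrow(data)), data[[group_col]], FUN = length)
  data[n_in_group >= min_n, , drop = FALSE]
}

# Same group sizes as `group_with2` in the dummy data: A = 2, B = 9, C = 4.
# The coordinates are arbitrary illustration values.
mock_groups <- data.frame(
  NMDS1 = seq_len(15),
  NMDS2 = rep(c(1, 2, 1.5), 5),
  group_with2 = c(rep("A", 2), rep("B", 9), rep("C", 4))
)

hull_dat <- keep_large_groups(mock_groups, "group_with2")
print(table(hull_dat$group_with2))  # only B and C remain

# Build the plot only if both packages are installed
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("ggforce", quietly = TRUE)) {
  library(ggplot2)
  library(ggforce)
  p <- ggplot(mock_groups, aes(x = NMDS1, y = NMDS2)) +
    geom_point() +  # all points, including group A
    geom_mark_hull(data = hull_dat, aes(fill = group_with2))  # hulls for B and C only
}
```

Filtering the data for the hull layer only keeps the small group visible as points; whether that is acceptable depends on how important the missing hull is for the figure.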

thomasp85 commented 6 months ago

I think this is fixed in the development version. Can I get you to test that out?

PeSteff commented 6 months ago

Thanks for the recommendation. I will test it. At the moment I get an error when trying to install the development version using

devtools::install_github("thomasp85/ggforce")

The output includes many warnings of the form "warning: 'if constexpr' only available with '-std=c++17' or '-std=gnu++17'" and several errors from concaveman (see attached file).

I will update R and then try again.

error_developmental_version_ggforce.txt

PeSteff commented 6 months ago

I tried the very same code with the development version and the newest R version (4.3.3) and it worked without any problems!

Thank you so much!