abiyug closed 5 years ago
For a dataset of this size, please make a reprex with computationally generated data rather than a gigantic dput dump.
I can't reproduce your crash with a similarly sized data set, but it's clear that you're misusing the ggplot2 API and that causes the slow rendering. You're drawing 10000 arrows and 10000 text labels on top of each other. Once that is fixed, rendering is reasonably fast.
library(ggplot2)
df <- data.frame(trans = 10 * rnorm(10000))
ggplot(df, aes(x = trans)) +
  geom_density() +
  geom_vline(xintercept = mean(df$trans), col = "green", size = 2) +
  geom_curve(data = data.frame(x = 1), aes(x = 10, y = .15, xend = 4.4, yend = .18),
             colour = "#555555", size = 0.5, curvature = 0.3,
             arrow = arrow(length = unit(0.03, "npc"))) +
  geom_text(data = data.frame(x = 1), aes(x = 9, y = .15, label = paste0("Average number of items\n per transaction is: ",
            round(mean(mtcars$mpg), 2)), colour = "blue", family = "Times New Roman"))
Created on 2019-08-27 by the reprex package (v0.3.0)
If you cut and paste the dput dump, you will get the dataset and should be able to reproduce the error.
Not sure what you mean by 'abuse the api', but I'd appreciate it if you or someone else would try to test this, because the issue is reproducible.
FYI - dput dump is reprex.
Not abuse, just misuse. Do you understand the difference between your version of the code and Claus's? I guess annotate() is what you want.
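A minimal sketch of the annotate() approach, assuming the same kind of simulated data as in Claus's example. annotate() takes constant positions directly instead of mapping them through aes(), so each call adds exactly one curve or one label no matter how many rows the plot data has. The seed, the use of mean(df$trans) in the label, and the omission of the font family (which depends on the graphics device) are assumptions made for this illustration, not part of the original code.

```r
library(ggplot2)

set.seed(1)  # assumption: seed added only so the sketch is repeatable
df <- data.frame(trans = 10 * rnorm(10000))

p <- ggplot(df, aes(x = trans)) +
  geom_density() +
  geom_vline(xintercept = mean(df$trans), colour = "green") +
  # annotate() builds its own one-row data, so a single curve is drawn
  annotate("curve", x = 10, y = .15, xend = 4.4, yend = .18,
           colour = "#555555", curvature = 0.3,
           arrow = arrow(length = unit(0.03, "npc"))) +
  # likewise a single text label; colour is set as a constant, not mapped
  annotate("text", x = 9, y = .15, colour = "blue",
           label = paste0("Average number of items\n per transaction is: ",
                          round(mean(df$trans), 2)))
p
```

Because the annotation layers do not inherit df, the amount of drawing work for them stays constant as the dataset grows.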
FYI - dput dump is reprex.
It can be a reprex, but please try to provide a minimal reprex. (I couldn't copy and paste your dump, BTW...)
FWIW, I also wasn't able to copy and paste the dput() dump.
Ok, I understand. Here is reprex data, and I hit the same issue again!
df <- data.frame(var_name = paste0("V", 1:9825),
                 trans = sample(1:32, 9825, replace = TRUE))
As stated in my original filing, I do not experience the same issue when the data size is small. I used the same script on mtcars and had no problem.
I cannot reproduce your issue with a more recent version of R. You may have to update.
And regardless, please fix your code so you don't draw thousands of annotations.
library(ggplot2)
df <- data.frame(var_name = paste0("V", 1:9825),
                 trans = sample(1:32, 9825, replace = TRUE))
# incorrect code, draws 9825 annotations
ggplot(df, aes(x = trans)) +
  geom_density() +
  geom_vline(xintercept = mean(df$trans), col = "green", size = 2) +
  geom_curve(aes(x = 10, y = .15, xend = 4.4, yend = .18),
             colour = "#555555", size = 0.5, curvature = 0.3,
             arrow = arrow(length = unit(0.03, "npc"))) +
  geom_text(aes(x = 9, y = .15, label = paste0("Average number of items\n per transaction is: ",
            round(mean(mtcars$mpg), 2)), colour = "blue", family = "Times New Roman"))
# correct code, draws 1 annotation
ggplot(df, aes(x = trans)) +
  geom_density() +
  geom_vline(xintercept = mean(df$trans), col = "green", size = 2) +
  geom_curve(data = data.frame(x = 1), aes(x = 10, y = .15, xend = 4.4, yend = .18),
             colour = "#555555", size = 0.5, curvature = 0.3,
             arrow = arrow(length = unit(0.03, "npc"))) +
  geom_text(data = data.frame(x = 1), aes(x = 9, y = .15, label = paste0("Average number of items\n per transaction is: ",
            round(mean(mtcars$mpg), 2)), colour = "blue", family = "Times New Roman"))
Created on 2019-08-28 by the reprex package (v0.3.0)
The R version may be the problem. There is no problem with the code. It would be helpful if you pointed out which part of my code is the issue. Thanks.
Please carefully compare the two code versions I have posted in my previous comment.
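One way to see the difference between the two versions is to count how many rows each geom_curve layer actually receives. This is a sketch using ggplot2's layer_data() helper on a stripped-down plot; the object names (base, p_many, p_one, n_many, n_one) are made up for this illustration. A layer without its own data argument inherits df, so the constant aesthetics are recycled once per row, whereas a layer given a one-row data frame draws once.

```r
library(ggplot2)

df <- data.frame(var_name = paste0("V", 1:9825),
                 trans = sample(1:32, 9825, replace = TRUE))

base <- ggplot(df, aes(x = trans)) + geom_density()

# Inherits df: the constant aesthetics are repeated for every row of df
# (recent ggplot2 versions may also emit a warning suggesting annotate())
p_many <- base +
  geom_curve(aes(x = 10, y = .15, xend = 4.4, yend = .18))

# Own one-row data frame: the layer holds a single curve
p_one <- base +
  geom_curve(data = data.frame(x = 1),
             aes(x = 10, y = .15, xend = 4.4, yend = .18))

# Layer 2 is geom_curve in both plots; compare the row counts
n_many <- nrow(layer_data(p_many, 2))
n_one  <- nrow(layer_data(p_one, 2))
c(many = n_many, one = n_one)
```

The first count matches the number of rows in df, which is why rendering slows down as the dataset grows.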
This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/
I am working with data that has 10 thousand observations and 2 variables. When I attempt to add a geom_curve layer, rendering becomes extremely slow, to the point where it distorts the plot and eventually crashes, generating the output below.