Open DorisAmoakohene opened 1 month ago
please use only one of these.
the first one (fread) is the best, because there is one method (utils::read.csv) which has a larger slope than others.
please remove all but 3 algos
please increase limit line text size, and direct label text size, so that both are about the same size as axes/tick size.
For direct labels use method=list(cex=1.2, "top.polygons")
where the cex number controls text size
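For reference, a minimal sketch of that direct label call on toy data. The `toy` data frame and its values are made up for illustration and are not from the benchmark; only the `method` list comes from the comment above.

```r
library(ggplot2)
# Hypothetical toy data: two timing curves on a log-log scale.
toy <- data.frame(
  N = rep(10^(1:4), 2),
  seconds = c(10^(1:4) * 1e-4, 10^(1:4) * 1e-3),
  expr.name = rep(c("fast", "slow"), each = 4))
gg <- ggplot() +
  geom_line(aes(N, seconds, color = expr.name), data = toy) +
  scale_x_log10() +
  scale_y_log10()
# cex scales the label text; "top.polygons" is the positioning method.
gg.labeled <- directlabels::direct.label(gg, list(cex = 1.2, "top.polygons"))
```

Increasing cex (for example to 2) should enlarge the labels, provided there is enough room for them on the panel.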
great improvement but please make sure text size is similar for all text on figure (seconds=1 is too small)
you increased the time limit to 1.5 seconds but the size of the text is still the same. To increase text size, use geom_text(size=5) etc
I have added geom_text(size=5)
seconds=1.5 is still too small; please revise and/or share code so I can see what is wrong.
library(ggplot2)
# One color per benchmarked expression.
read.colors <- c(
  "readr::read_csv\n(lazy=TRUE)"="#9970AB",
  "data.table::fread"="#D6604D",
  "utils::read.csv" = "deepskyblue")
n.rows <- 100
seconds.limit <- 5
# Time each reader on CSV files with n.rows rows and N columns.
atime.read.vary.cols <- atime::atime(
  N=as.integer(10^seq(2, 6, by=0.5)),
  setup={
    set.seed(1)
    input.vec <- rnorm(n.rows*N)
    input.mat <- matrix(input.vec, n.rows, N)
    input.df <- data.frame(input.mat)
    input.csv <- tempfile()
    data.table::fwrite(input.df, input.csv)
  },
  seconds.limit = seconds.limit,
  "data.table::fread"={
    data.table::fread(input.csv, showProgress = FALSE)
  },
  "readr::read_csv\n(lazy=TRUE)"={
    readr::read_csv(input.csv, progress = FALSE, show_col_types = FALSE, lazy=TRUE)
  },
  "utils::read.csv"=utils::read.csv(input.csv))
refs.read.vary.cols <- atime::references_best(atime.read.vary.cols)
pred.read.vary.cols <- predict(refs.read.vary.cols)
png("gg.read.3.png", res = 600, width = 18, height = 12, units = "in")
gg.read.3 <- plot(pred.read.vary.cols)+
  geom_text(text = 5)+
  theme(
    text=element_text(size=35),
    axis.text = element_text(size = 20),
    axis.title = element_text(size = 20)
  )+
  scale_x_log10("N = number of columns to read")+
  scale_y_log10("Computation time (seconds)\nmedian line, min/max band\nover 10 timings")+
  facet_null()+
  scale_fill_manual(values=read.colors)+
  scale_color_manual(values=read.colors)
directlabels::direct.label(gg.read.3, list(cex = 1.2, "top.polygons"))
dev.off()
your problem is here
gg.read.3 <- plot(pred.read.vary.cols)+
geom_text(text = 5)+
geom_text(text=5) does not draw anything if you do not give it any data set to draw, so the size=5 argument does nothing here.
you need to write your own ggplot code, instead of using plot(pred.read.vary.cols)
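A rough sketch of what "your own ggplot code" with a labeled limit line could look like. The `meas` data frame is a made-up stand-in for atime.read.vary.cols$measurements, and `limit.df` is a hypothetical one-row data set whose only job is to give geom_text something to draw.

```r
library(ggplot2)
seconds.limit <- 5
# Hypothetical stand-in for atime.read.vary.cols$measurements.
meas <- data.frame(
  N = rep(10^(2:5), 2),
  median = c(10^(2:5) * 1e-5, 10^(2:5) * 1e-4),
  expr.name = rep(c("data.table::fread", "utils::read.csv"), each = 4))
# One row of data for the limit line and its text label.
limit.df <- data.frame(
  N = min(meas$N),
  seconds = seconds.limit,
  label = paste0("seconds=", seconds.limit))
gg <- ggplot() +
  geom_hline(aes(yintercept = seconds), data = limit.df, color = "grey50") +
  # size is only applied because this layer has data to draw.
  geom_text(aes(N, seconds, label = label), data = limit.df,
            size = 5, hjust = 0, vjust = -0.5) +
  geom_line(aes(N, median, color = expr.name), data = meas) +
  scale_x_log10("N = number of columns to read") +
  scale_y_log10("Computation time (seconds)")
```

Note that size in geom_text is in millimeters, while element_text(size=...) in theme() is in points, so the two numbers are not directly comparable.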
does this give the plot you want?
This is the ggplot code I am using:
png("gg.read.3.png", res = 600, width = 15, height = 10, units = "in")
gg.read.3 <- ggplot()+
  geom_line(data = atime.read.vary.cols$measurements, aes(x = N, y = median, color = expr.name, group = expr.name, fill = expr.name)) +
  geom_ribbon(aes(x=N, ymin = min, ymax = max, fill = expr.name), data = atime.read.vary.cols$measurements, alpha = 0.5)+
  theme(
    text = element_text(size = 35),
    axis.text = element_text(size = 20),
    axis.title = element_text(size = 20)
  ) +
  scale_x_log10("N = number of columns to read")+
  scale_y_log10("Computation time (seconds)") +
  scale_fill_manual(values = read.colors) +
  scale_color_manual(values = read.colors)
directlabels::direct.label(gg.read.3, list(cex = 2, "top.polygons"))
dev.off()
the original issue was that some text on the figure (seconds=5) was smaller than other text, making it difficult to read. that issue persists. for example, the readr direct label is much smaller than the others. please fix by either removing the (lazy=TRUE) or giving more vertical space, so that label is not reduced in size relative to the others.
also, currently width=15 and height=10, which will probably result in text that is too small to read in the context of a paper. please fix by reducing the overall figure size, which results in text that looks bigger when the figure is scaled to page width.
also in the new figure seconds=5 is gone, and so are the N= numbers in the direct labels, so I wonder if you need those to prove the point you are trying to make with this figure?
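To make the figure size point concrete, here is a rough estimate of how text shrinks when a figure is scaled down to page width. The 6.5 inch page width is an assumption (roughly a letter page with one inch margins), and `effective.pt` is a hypothetical helper, not part of any package.

```r
# Approximate printed font size after a figure is scaled to page width.
# page.width.in = 6.5 is an assumed text width, not taken from the paper.
effective.pt <- function(font.pt, fig.width.in, page.width.in = 6.5) {
  font.pt * page.width.in / fig.width.in
}
# 20pt axis text in the current 15 inch wide figure.
wide <- effective.pt(20, fig.width.in = 15)
# Same 20pt text if the figure is drawn at page width.
narrow <- effective.pt(20, fig.width.in = 6.5)
print(c(wide = wide, narrow = narrow))
```

By this estimate the 20pt axis text in a 15 inch wide figure prints at under 9pt, whereas the same theme settings in a figure drawn at page width keep the full 20pt.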
@tdhock I have decided to show the four graphs below for comparative benchmarking of R functions and packages that perform similar tasks.