Closed by jayhesselberth 4 years ago
I'm seeing a slowdown as well, but not as severe as in your benchmarks. This result was using dev dplyr, which has a large performance regression in arrange. See below for benchmarks with CRAN dplyr.
I am not sure what caused this result. On my end the benchmarks look reasonable with the current master branch and dplyr 0.8.5. Perhaps the servers used by Travis had a slowdown? It might be worth rebuilding the docs to see whether the regression is reproducible.
library(valr)
library(dplyr)
library(ggplot2)
library(tibble)
library(scales)
library(GenomicRanges)
library(microbenchmark)
genome <- read_genome(valr_example('hg19.chrom.sizes.gz'))
# number of intervals
n <- 1e6
# number of timing reps
nrep <- 2
seed_x <- 1010486
x <- bed_random(genome, n = n, seed = seed_x)
seed_y <- 9283019
y <- bed_random(genome, n = n, seed = seed_y)
res <- microbenchmark(
  # randomizing functions
  bed_random(genome, n = n, seed = seed_x),
  bed_shuffle(x, genome, seed = seed_x),
  # single tbl functions
  bed_slop(x, genome, both = 1000),
  bed_flank(x, genome, both = 1000),
  bed_shift(x, genome),
  bed_merge(x),
  bed_partition(x),
  bed_cluster(x),
  bed_complement(x, genome),
  # multi tbl functions
  bed_closest(x, y),
  bed_intersect(x, y),
  bed_map(x, y, .n = n()),
  bed_subtract(x, y),
  bed_window(x, y, genome),
  # stats
  bed_absdist(x, y, genome),
  bed_reldist(x, y),
  bed_jaccard(x, y),
  bed_fisher(x, y, genome),
  bed_projection(x, y, genome),
  # utilities
  bed_makewindows(x, win_size = 100),
  times = nrep,
  unit = 's')
# convert nanoseconds to seconds
res <- res %>%
  as_tibble() %>%
  mutate(time = time / 1e9) %>%
  arrange(time)
# futz with the x-axis
maxs <- res %>%
  group_by(expr) %>%
  summarize(max.time = max(boxplot.stats(time)$stats))
# filter out outliers
res <- res %>%
  left_join(maxs) %>%
  filter(time <= max.time * 1.05)
#> Joining, by = "expr"
ggplot(res, aes(x = reorder(expr, time), y = time)) +
  geom_boxplot(fill = 'red', outlier.shape = NA, alpha = 0.5) +
  coord_flip() +
  theme_bw() +
  labs(
    y = 'execution time (seconds)',
    x = '',
    title = "valr benchmarks",
    subtitle = paste(comma(n), "random x/y intervals,", comma(nrep), "repetitions"))
Created on 2020-03-21 by the reprex package (v0.3.0)
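Since the thread points at a performance regression in arrange in dev dplyr, it may help to time the sort on its own, separately from any valr function. The following is only a minimal sketch under assumptions: the data frame `df`, its size, and the column values are hypothetical stand-ins for interval data, not part of the valr benchmarks above.

```r
library(dplyr)
library(microbenchmark)

# Record which dplyr build is being timed (CRAN release vs. dev version)
dplyr_version <- packageVersion("dplyr")

# Hypothetical stand-in for interval data: chrom/start/end columns
set.seed(42)
n_rows <- 1e5
chroms <- paste0("chr", 1:22)
df <- tibble(
  chrom = sample(chroms, n_rows, replace = TRUE),
  start = sample.int(1e8, n_rows, replace = TRUE),
  end   = start + 1000L
)

# Time the sort alone, with no valr code involved
arrange_timing <- microbenchmark(
  arrange(df, chrom, start),
  times = 5,
  unit = "s"
)

print(dplyr_version)
print(arrange_timing)
```

Running this once with CRAN dplyr and once with dev dplyr installed should show whether the sort alone accounts for the difference, or whether something else in the build environment is involved.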
The benchmark vignette now shows normal timings. Perhaps there was some isolated issue during the previous Travis pkgdown build.
I'm not sure what happened, but these benchmarks are significantly slower than before (most were <2 seconds). Can you confirm @kriemo?
https://valr.hesselberthlab.org/articles/benchmarks.html