Robinlovelace opened this issue 4 years ago
I added this to milestone 3 (our last milestone), so that we can do the benchmarking towards the end of the project, when the core of the code is finished and it is time to fine-tune.
Is it ok if I assign you for this @Robinlovelace ?
Sure I'm up for that. Will be good to generate some consistent benchmarks, I'll start by looking for other open network datasets used by other projects for benchmarking.
Benchmarking of routing performance will depend entirely on what you're optimising for:
OSRM for instance is fast for massive networks but is less optimal once you want rapidly-changing live traffic data, as the ability to do up-front optimisation is lowered.
One approach to continuous benchmarking is this: https://github.com/r-lib/bench#continuous-benchmarking
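For context, the continuous-benchmarking helpers used later in this thread (`cb_run()`, `cb_fetch()`, `cb_read()`, `cb_plot_time()`) fit together roughly like this; a hedged sketch, with the exact workflow described in the linked README:

```r
# Sketch of the bench continuous-benchmarking loop: benchmarks are run
# (normally in CI), their results are stored in git, and the stored
# history is then fetched and read back for plotting.
bench::cb_run()               # run the benchmark files
bench::cb_fetch()             # fetch previously stored results
history <- bench::cb_read()   # read them into a data frame
bench::cb_plot_time(history)  # plot timings over commits
```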
Thoughts @agila5, @loreabad6 and @luukvdmeer ? Worth a try I guess but could be overly complex compared with reporting benchmarks in README with each build manually.
Good news so far: `sfnetworks` seems to be faster at creating spatial objects, even though the object sizes are larger:
```r
library(sfnetworks)
system.time({
  net = as_sfnetwork(roxel)
})
#>    user  system elapsed
#>   0.062   0.001   0.062
system.time({
  net2 = stplanr::SpatialLinesNetwork(roxel)
})
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 7.0.0
#> Warning in SpatialLinesNetwork.sf(roxel): Graph composed of multiple subgraphs,
#> consider cleaning it with sln_clean_graph().
#>    user  system elapsed
#>   0.859   0.020   0.879
pryr::object_size(net)
#> Registered S3 method overwritten by 'pryr':
#>   method      from
#>   print.bytes Rcpp
#> 807 kB
pryr::object_size(net2)
#> 447 kB

res = bench::press(
  n = seq(from = 10, to = nrow(roxel), length.out = 5),
  {
    bench::mark(
      check = FALSE,
      time_unit = "ms",
      stplanr::SpatialLinesNetwork(roxel[1:n, ]),
      sfnetworks::as_sfnetwork(roxel[1:n, ])
    )
  }
)
#> Running with:
#>       n
#> 1    10
ggplot2::autoplot(res)
```
Created on 2020-06-22 by the reprex package (v0.3.0)
Heads-up, I've added continuous benchmarking in #64 but the build is failing due to credentials issues. That should be an easy fix. See here for details: https://github.com/r-lib/bench/issues/87
Any ideas of what else we should benchmark?
Hi and thanks for your work! IMO it's good enough for the moment: I think we should focus on testing the current functionality and fixing bugs first, and then optimise the code and benchmark different implementations, also considering what @mvl22 said. Let's keep this issue open for the time being.
Did you understand why the build is failing? Sorry but I have literally 0 experience with Github Actions and benchmarks.
> Did you understand why the build is failing?
No, I'm not sure why the benchmarks are failing. One consideration: I wonder if it's worth adding an optional `edge_lengths` parameter in `as_sfnetworks()`, which could be `FALSE` by default.
> One consideration: wonder if it's worth adding an optional `edge_lengths` parameter in `as_sfnetworks()` which could be `FALSE` by default.
IMO yes, if the network is created with explicit edges, since I've always used the edge lengths during the analysis after creating the network.
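A hedged sketch of what computing such lengths might look like (the `length` column name is an illustration, not the package's actual API); edge geometries can be measured with `sf::st_length()`:

```r
library(sfnetworks)
library(sf)

net <- as_sfnetwork(roxel)

# Hypothetical behaviour of an edge-lengths option: compute the
# geographic length of each edge and store it as an attribute, so
# routing functions could use it as a default weight.
edges <- st_as_sf(net, "edges")
edges$length <- st_length(edges)
```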
Heads-up @Robinlovelace. The continuous benchmarking has been failing lately. Whenever you find the time, could you take a look? For me it is a mystery ;-)
Hi @luukvdmeer yes will do. Do we want to benchmark any other things?
Seems it has benchmarked things historically:
```r
setwd("~/wip/sfnetworks/")
bench::cb_fetch()
d = bench::cb_read()
bench::cb_plot_time(d)
#> Loading required namespace: ggplot2
#> Loading required namespace: tidyr
```
Created on 2020-11-05 by the reprex package (v0.3.0)
Just tried this locally and it worked with no errors:
```r
bench::cb_run()
```
Not 100% sure how it works either. I have checked here https://github.com/r-lib/bench/actions?query=workflow%3A%22Continuous+Benchmarks%22 and cannot see build logs there either. The examples above show how to read benchmarks saved in the past; it would be useful to have a date.
TBH I do not fully understand continuous benchmarking. We could add simple benchmarks to a vignette instead. Thoughts @luukvdmeer ?
I have advocated for better documentation on the 'CB' approach in https://github.com/r-lib/bench/issues/87 but while we're waiting for that we could change tack.
I agree. Let's disable it for now until it matures. I saved the bench setup as we had it in a branch named `bench`. We can re-add that content later.
Great thinking. I can add something on benchmarking using `system.time()` - which vignette though?
How about starting with some basic benchmarking (the currently existing ones) in a "Benchmarks" section in the README? Once we have more coverage of other functionalities (or if you already have them) we could dedicate a new vignette to it, focused only on benchmarking.
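A minimal version of such a README section could be sketched with `bench::mark()`, comparing the same two construction functions as in the earlier reprex (a sketch only; no timings are claimed here):

```r
library(sfnetworks)

# Compare network-construction times on the bundled roxel dataset.
# check = FALSE because the two functions return objects of
# different classes, so their results cannot be compared directly.
res <- bench::mark(
  check = FALSE,
  sfnetworks = as_sfnetwork(roxel),
  stplanr    = stplanr::SpatialLinesNetwork(roxel)
)
res[, c("expression", "median", "mem_alloc")]
```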
For another project I've done some benchmarks and it seems that `sfnetworks` is already pretty fast. Wonder if we can make it even faster!

Created on 2019-11-29 by the reprex package (v0.3.0)