Open krlmlr opened 5 days ago
I wasn't able to get to this today :cry:
seq(1, vcount(graph))
does not work for empty vectors.
Alternatives are: seq_along, seq_len.
library(bench)
n <- 1
graph <- make_ring(n)
bench:: mark(
"seq" = seq(1, vcount(graph)),
"along" = seq_along(V(graph)),
"len" = seq_len(vcount(graph))
)
The above fails if n = 0.
vcount()
is faster then V(g)
.
Yeah, my seq_numeric()
would special-case for the n == 0
edge case.
What is the point of a special function seq_numeric()
?
Isn't it simpler to replace seq(1, x)
with seq_len(x)
?
I also wonder: how bad is it if we omit the test index_is_natural_sequence()
? How likely is it that the test does indeed pass?
seq(1, x)
is much faster, as the benchmark shows.
Because the natural sequence is the default argument for many input functions, I suspect that this case is pretty frequent and worth optimizing for, but I haven't done any measurements.
Your benchmark seems to show that seq_len
(246 ns) is much faster than seq
(791 µs), exactly the opposite of what you say above @krlmlr. What am I missing?
🤦 🤦
You're right, of course. But the test isn't representative either:
bench::mark(
sum(seq(1, 10000000, 1)),
sum(as.numeric(rlang::seq2(1, 10000000))),
sum(as.numeric(seq_len(10000000)))
)
#> Warning: Some expressions had a GC in every iteration; so filtering is
#> disabled.
#> # A tibble: 3 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:t> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 sum(seq(1, 1e+07, 1)) 77.6ms 93.34ms 10.6 191MB 16.0
#> 2 sum(as.numeric(rlang::seq2(1, 1… 902ns 1.02µs 800320. 0B 0
#> 3 sum(as.numeric(seq_len(1e+07))) 164ns 246.04ns 2861548. 0B 0
Created on 2024-07-08 with reprex v2.1.0
The second and third example probably uses Gauss's formula. Let's try var()
:
bench::mark(
var(seq(1, 100000, 1)),
var(as.numeric(rlang::seq2(1, 100000))),
var(as.numeric(seq_len(100000)))
)
#> # A tibble: 3 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch> <bch:> <dbl> <bch:byt> <dbl>
#> 1 var(seq(1, 1e+05, 1)) 818µs 960µs 883. 1.92MB 32.0
#> 2 var(as.numeric(rlang::seq2(1, 1e+05… 434µs 496µs 2002. 781.3KB 26.9
#> 3 var(as.numeric(seq_len(1e+05))) 432µs 495µs 2016. 781.3KB 29.4
Created on 2024-07-08 with reprex v2.1.0
This is more realistic but also indicates that as.numeric(seq_len())
is faster. I stand corrected.
A thought: avoid seq() in any case.
I tried make_ring(1E5)
. Optimization saves ca. 50%, 3.72ms v.s. 6.42ms.
What happens, and what did you expect instead?
https://github.com/igraph/rigraph/pull/1330#discussion_r1665753951
We also need the
by
argument to ensure that the output is a numeric.Perhaps add a function
seq_numeric()
that has a special case for zero-length output?To reproduce
seq(by = 1)
seems to be the fastest, look at the units. Maybe it's worth arguing with R core to supportseq(start, start - by, by)
without error and to return an empty vector in this case?Created on 2024-07-05 with reprex v2.1.0
System information
No response