Open MeBrei opened 5 years ago
Some update on this. I have installed flamegraph which is a handy wrapper for dtrace
in Rust, which integrates with cargo
. After doing that, I have disabled optimisations in release
builds as suggested by this post, by adding:
# Set debug = true if you want to do profiling with flamegraph.
[profile.release]
debug = true
To the cargo.toml
. This produced the following graph:
It's not very large and hard to navigate if this is not an interactive image, but my current interpretation goes this way. There are 3 main "tips" in the graph: the one on the left in irrelevant to us as it's basically "benchmarking noise" from criterion
, and the other two basically dominates the whole CPU time (as you would expect), almost equally.
My interpretation is that rank_network
is actually the slower one but it's called less times, whereas we do plenty of calls to random_walk
throughout the benchmarking, and this is also witnessed by the osrank_naive-0e910a9a4ee537f4criterion::Bencher::iter::h9c169f50c1c9202f
which "retains" all those calls to random_walk
.
If we analise random_walk
by zooming into it, we can see the cost is dominated by clone calls (unsurprisingly), choose_weighted
and neighbours
calls:
As far as rank_network goes, instead, I might be wrong but it seems like the cost is dominated by the fraction manipulation and calls to fmt::write
and float_to_decimal_common_shortest
. My hunch here is that we are paying the price of f64 arithmetic while also converting from the Weight
type:
More investigation is needed, but I think that if we start by eliminating all the unnecessary calls to .clone
and we sort out our Osrank
representation, we should already land into a better place.
Investigate profiling/flamegraph for osrank_naive. Document slow parts of the code and if/how they can be improved.