georust / rstar

R*-tree spatial index for the Rust ecosystem
https://docs.rs/rstar
Apache License 2.0

Fine-tune iterator inline capacity to avoid memcpy call overhead #39

Closed: adamreichold closed this 4 years ago

adamreichold commented 4 years ago

While the change looks innocent, an inline capacity of 32 pointers does seem to induce LLVM to move these iterators around using calls to libc's memcpy instead of plain instructions - at least when using the Rust 1.45.0 nightly on x86-64.

Reducing the inline capacity to 24 avoids this effect and the associated call overhead but does not seem to trigger significantly more allocations. For our simulation model, this improves the number of time stamps per second by almost 5% (and reduces the memcpy calls from 5% CPU to negligible).
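To make the size difference concrete, here is a minimal sketch using only the standard library. `InlineStack` and `inline_bytes` are hypothetical names for illustration, not rstar's actual iterator type or API; the sketch assumes the iterator keeps a stack of pointer-sized node references inline. It only shows how many bytes a move has to copy at each capacity. Where LLVM switches from inline moves to a `memcpy` call depends on the compiler version and target and is not determined here.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for a traversal stack with inline storage:
/// `N` pointer-sized slots plus a length field. Not rstar's real type.
pub struct InlineStack<'a, T, const N: usize> {
    slots: [Option<&'a T>; N],
    len: usize,
}

impl<'a, T, const N: usize> InlineStack<'a, T, N> {
    /// Creates an empty stack; no heap allocation is involved.
    pub fn new() -> Self {
        Self { slots: [None; N], len: 0 }
    }

    /// Pushes a reference; returns `false` if the inline buffer is full.
    /// (A real small vector would spill to the heap instead.)
    pub fn push(&mut self, item: &'a T) -> bool {
        if self.len == N {
            return false;
        }
        self.slots[self.len] = Some(item);
        self.len += 1;
        true
    }

    /// Pops the most recently pushed reference, if any.
    pub fn pop(&mut self) -> Option<&'a T> {
        if self.len == 0 {
            return None;
        }
        self.len -= 1;
        self.slots[self.len].take()
    }
}

/// Size in bytes of a stack with `N` inline slots, i.e. the amount of
/// data that has to be copied whenever such a value is moved.
pub fn inline_bytes<const N: usize>() -> usize {
    size_of::<InlineStack<'static, u64, N>>()
}
```

Comparing `inline_bytes::<32>()` with `inline_bytes::<24>()` shows that the smaller capacity saves eight pointer widths (64 bytes on x86-64) on every move of the value.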

adamreichold commented 4 years ago

@Stoeoef I am sorry to bother you with this, but could you maybe do a point release containing this change? The improvement is significant for us, and I would love to avoid a Git dependency and go via crates.io instead. Thank you for your help!

Stoeoef commented 4 years ago

That's a fair request. I'll push out a new point release during the week. Thanks for mentioning this.

Stoeoef commented 4 years ago

That took longer than desired, but it's done :). Version 0.8.1 is published and contains the changes.