adamreichold closed 4 years ago
@Stoeoef I am sorry to bother you with this, but could you maybe do a point release containing this change? The improvement is significant for us, and I would love to avoid a Git dependency and go via crates.io instead. Thank you for your help!
That's a fair request. I'll push out a new point release during the week. Thanks for mentioning this.
That took longer than desired, but it's done :) Version 0.8.1 is published and contains the changes.
While the change looks innocent, an inline capacity of 32 pointers does seem to induce LLVM to move these iterators around using calls to libc's `memcpy` instead of plain instructions, at least when using the Rust 1.45.0 nightly on x86-64. Reducing the inline capacity to 24 avoids this effect and the associated call overhead but does not seem to trigger significantly more allocations. For our simulation model, this improves the number of time stamps per second by almost 5% (and reduces the `memcpy` calls from 5% CPU to negligible).
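
To make the size difference concrete, here is a minimal sketch of how the inline capacity determines how many bytes have to be copied on every move. `InlineStack` and `bytes_moved` are hypothetical names used only for illustration; this is not the crate's actual iterator type, which may lay out its storage differently.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for an iterator that keeps up to `N` node
/// pointers inline (an assumption for illustration, not the real type).
#[repr(C)]
pub struct InlineStack<const N: usize> {
    /// Number of slots currently in use.
    pub len: usize,
    /// Inline storage for the pointers; no heap allocation up to `N`.
    pub slots: [*const u8; N],
}

/// Number of bytes that must be copied whenever the value is moved.
pub fn bytes_moved<const N: usize>() -> usize {
    size_of::<InlineStack<N>>()
}
```

On a 64-bit target, `bytes_moved::<32>()` is 264 and `bytes_moved::<24>()` is 200. Whether the compiler copies either size with a `memcpy` call or with inline instructions depends on the compiler version and target, so the effect is best confirmed by profiling or by inspecting the generated assembly.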