Closed rinde closed 2 months ago
In general I am open to redesigns that improve performance / usability as you've demonstrated here.
I'm going on an extended trip next week however, and will be gone until the end of May. I'd like to revisit this in detail then, since I can't devote the time to it this week. How does that sound?
I understand. That is unfortunate (for me) since I hoped to getting this fixed soon, but nice for you to go on a long trip 👍 . I will probably continue working in a separate PR and maybe create a temporary fork so that we can use it internally. Are you still planning a release before you leave that includes the arrays PR?
Have a nice trip!
I've just released 0.2.5
with the array improvements.
I will probably continue working in a separate PR and maybe create a temporary fork so that we can use it internally.
Please feel free, although let's definitely continue collaboration once I return. I am keen for a better design, and these next few weeks could be a good opportunity for you to test the new design internally.
Thank you for the release!
Please feel free, although let's definitely continue collaboration once I return. I am keen for a better design, and these next few weeks could be a good opportunity for you to test the new design internally.
Yes that is indeed my intent. I will create an improved version and test it internally. When you're back we can discuss my findings and see how we can best continue our collaboration to further improve the library.
Excellent, then I'll see you when I'm back! The reason I haven't released a 1.0
yet is that I wasn't 100% convinced of the design, and I'm glad I've let it stew for as long as I have. I've otherwise been using it in production for 6 months or so without issue.
Problem
As part of my work we do regular benchmarks and I found that replacing
Vec
withNEVec
somewhere caused a noticable slowdown. I started investigating and created some micro benchmarks. I found thatNEVec
is considerably slower thanVec
which seems to be related to how the non-empty iterator is implemented.Solution
I created an alternative implementation called
NEVec2
that only implements a couple methods for benchmarking.NEVec2
is a simple wrapper forVec
that can, by proper encapsulation guarantee non-emptiness. For the API ofNonEmptyIterator2
I went for the design that I also proposed in https://github.com/fosskers/nonempty-collections/pull/16. Based on my (limited) experience with this new design I think that it would allow to make a lot of code much simpler because it can forward a most of the calls directly to the underlyingVec
while still upholding the non-empty guarantee. I believe that this is also what keeps the performance on par with the underlyingVec
.Benchmark results
Results obtained with
cargo bench
:As can be seen above, the old
NEVec
performs up to 2x slower thanNEVec2
and plainVec
. The exception is when the data size is 1. This is becauseNEVec
keeps the first element in a local variable which means it is on the stack and not on the heap.I expect that similar performance drawbacks exist for the other datastructures.
Note that these numbers will change from run to run and from machine to machine. The conclusions, however, seem to be consistent.
Moving forward
I hope you are convinced that this warrants a change in the design of the crate.
For the future the
vec2.rs
file can be entirely discarded as I just created it for benchmarking purposes. I think it makes sense to adapt the internals of the datastructures and iterators such it works similar as invec2.rs
. Before I start on any of that I would like to hear your opinion of course 😄