rust-lang / hashbrown

Rust port of Google's SwissTable hash map
https://rust-lang.github.io/hashbrown
Apache License 2.0
2.42k stars 283 forks source link

Implement XxxAssign operations on HashSets #529

Closed ToMe25 closed 4 months ago

ToMe25 commented 4 months ago

This PR primarily implements the XxxAssign operation traits for HashSet.

My primary motivation to do so is for convenience, but depending on the situation they can provide a significant performance improvement as well.*

In my tests, which may not be ideal because I don't have much benchmarking experience, the assigning operations are, with the exception of Sub, a minimum of 25% faster.*
Note that when swapping the large and the small set around, some of these are significantly slower than the non-assigning variant.
Therefore using them is likely only worth it performance wise, if you already know which set is larger, and the right one of the sets just so happens to be the one you don't need to keep.

* Results may have changed due to #530 being merged

Here my exact benchmark results, done with the newly added benchmark suit:

<!DOCTYPE html>

VER LSIZE SSIZE OP NS/ITER DIFF (%) COMMENT
1 1000 100 and 5,682.88    
1 1000 100 or 41,427.82    
1 1000 100 xor 57,404.27    
1 1000 100 subls 56,262.53    
1 1000 100 subsl 751.42    
1 1000 2 and 100.16    
1 1000 2 or 40,435.09    
1 1000 2 xor 59,058.05    
1 1000 2 subls 58,668.34    
1 1000 2 subsl 18.89    
1 1000 100 or_ass 32,888.49 -20.61% unconditional insert
2 1000 100 or_ass 29,397.04 -29.04% !contains insert
3 1000 100 or_ass 32,399.65 -21.79% extend iter().cloned()
4 1000 100 or_ass 30,693.33 -25.91% get_or_insert_owned
5 1000 100 or_ass 33,722.59 -18.60% calc intersection; extend rhs.iter() !intersection contains; Requires S: Clone
1 1000 100 add_ass 30,114.17 -26.66% !contains insert
1 1000 100 xor_ass 32,309.85 -43.72% contains remove else insert
2 1000 100 xor_ass 40,058.48 -30.22% extract_if rhs contains; extend !removed contains
3 1000 100 xor_ass 31,801.04 -44.60% raw_entry().from_key() replace_entry_with / insert
4 1000 100 xor_ass 31,935.07 -44.37% raw_entry().from_key_hashed_nocheck() replace_entry_with / insert_hashed_nocheck
5 1000 100 xor_ass 31,843.33 -44.53% self.map.table.get.is_none self.map.table.insert else self.map.table.remove_entry
1 1000 100 subls_ass 33,366.13 -40.70% contains remove
1 1000 100 subsl_ass 10,686.02 1322.11% contains remove
2 1000 100 subls_ass 36,351.69 -35.39% retain !contains
2 1000 100 subsl_ass 3,939.67 424.30% retain !contains
3 1000 100 subls_ass 32,012.82 -43.10% unconditional remove
3 1000 100 subsl_ass 9,908.76 1218.67% unconditional remove
4 1000 100 subls_ass 36,232.13 -35.60% self.map.retain !contains
4 1000 100 subsl_ass 3,939.35 424.25% self.map.retain !contains
5 1000 100 subls_ass 31,879.32 -43.34% if rhs smaller self unconditional remove else retain !contains
5 1000 100 subsl_ass 3,946.98 425.27% if rhs smaller self unconditional remove else retain !contains
1 1000 2 add_ass 28,324.95 -29.27%  
2 1000 2 or_ass 28,322.62 -29.96%  
1 1000 2 xor_ass 29,107.31 -50.71%  
3 1000 2 xor_ass 29,026.82 -50.85%  
1 1000 2 subls_ass 29,310.04 -50.04%  
1 1000 2 subsl_ass 4,212.56 22200.48%  
2 1000 2 subls_ass 34,074.85 -41.92%  
2 1000 2 subsl_ass 66.43 251.67%  
3 1000 2 subls_ass 29,340.86 -49.99%  
3 1000 2 subsl_ass 5,972.25 31515.93%  
5 1000 2 subls_ass 29,460.49 -49.78%  
5 1000 2 subsl_ass 65.32 245.79%  

In addition to the Assigning operators this PR changes a few more things:

ToMe25 commented 4 months ago

I have one more optimization idea, which I will try tomorrow.

ToMe25 commented 4 months ago

I honestly don't think any optimizations I will think of will cause a significant improvement at this point, so I'll stop trying around now :)

Amanieu commented 4 months ago
* It removes the `A: Allocator` generics bound from `BitOr` and `BitAnd`.
  It doesn't seem to be used for anything, and `Xor` and `Sub` already don't have it.
  I'd be fine with reverting that and adding it to the other operators as well, I just thought having it inconsistent was weird.

Actually it should be the other way around: these traits should be implemented for HashSet with any A, not just Global (which is the default if you omit it).

I checked the standard library and it has the same issue. This should be fixed once hashbrown is updated in the standard library.

* It adds and implementation for `Add` for convenience. This does the same as `BitOr`, so let me know if I should remove it.

I don't think it makes sense to add this just for convenience. It would introduce confusion since now you have 2 operators that do the same thing.

ToMe25 commented 4 months ago

Whoops, I forgot default generics are a thing and thought "No specified bound == No bound".
I'll add the Allocator bound everywhere and remove Add and AddAssign then.
I might not get around to that today, but if not I'll do it tomorrow.
Would adding "Add" as an alias to "BitOr" be fine, if libraries that aren't STD are allowed to do that?

Amanieu commented 4 months ago

I would prefer if Add was not implemented at all.

ToMe25 commented 4 months ago

I meant a doc alias, but thats fine too, then I just wont.

Amanieu commented 4 months ago

A doc alias wouldn't work well since it would only trigger if you specifically search for "add". It is unlikely you are specifically looking for set union in that case: you would most likely be searching for arithmetic addition.

ToMe25 commented 4 months ago

Fair enough, if one purposefully searches the docs one will probably notice that for a set BitOr and Add would be the same soon anyway.

ToMe25 commented 4 months ago

Oh btw the non-assign performance would have changed due to #530 being merged, do I have to re-test them all or is it fine to leave the PR description as is?

Amanieu commented 4 months ago

It's fine to leave the benchmark results as they are, but the rest of the description will need updating.

ToMe25 commented 4 months ago

Yes, of course, once I implemented the changes I will go over everything in the description and adjust it to the changes in the PR. Its just that due to this PR not being currently on top of master benchmarking the changed non-assign ops vs the assign ops would be somewhat time consuming, so I wanted to avoid that if possible.

ToMe25 commented 4 months ago

I implemented the changes I said I would, and rebased because of the recent CI fixes.

Though now I wonder: Would it be better to change the A bound to Allocator + Default and return a new set with the same type of allocator as the input sets?
Right now the resulting set always uses the default allocator. Alternatively it would also be possible to change the bound to Allocator + Clone and use a clone of the self allocator.

ToMe25 commented 4 months ago

I believe that now would be the best time to add a restriction to Allocator + Default, because right now that would only be a new restriction for two of the traits, while the other two would still be more relaxed, just not as much.

Another question would be whether set operations between sets with different allocators should be allowed. If yes it might make sense to only restrict the self allocator to Allocator + Default and leave the other allocator at Allocator. Though I would split this change into a separate PR, if it is wanted at all.

One other thing is that while I generally intend to port my recent hashbrown PR's to STD(atleast those which result in behavior changes), I wont have the time to do so in the next 3-4 weeks.

Amanieu commented 4 months ago

Another question would be whether set operations between sets with different allocators should be allowed. If yes it might make sense to only restrict the self allocator to Allocator + Default and leave the other allocator at Allocator.

I don't think we should do this because then the operation becomes too generic and the compiler can't infer the correct allocator type to use.

Though now I wonder: Would it be better to change the A bound to Allocator + Default and return a new set with the same type of allocator as the input sets? Right now the resulting set always uses the default allocator. Alternatively it would also be possible to change the bound to Allocator + Clone and use a clone of the self allocator.

Allocator + Default seems like the best way forward.

ToMe25 commented 4 months ago

Ok, done, I implemented the Allocator + Default change for the non-assigning set ops.

Amanieu commented 4 months ago

@bors r+

bors commented 4 months ago

:pushpin: Commit 481ef396bf7c45c66adb42f00590a242085ddb1c has been approved by Amanieu

It is now in the queue for this repository.

bors commented 4 months ago

:hourglass: Testing commit 481ef396bf7c45c66adb42f00590a242085ddb1c with merge f0eece41a0e46145692adb0bc9627b01f95ec4a5...

bors commented 4 months ago

:sunny: Test successful - checks-actions Approved by: Amanieu Pushing f0eece41a0e46145692adb0bc9627b01f95ec4a5 to master...