Closed latacora-paul closed 1 year ago
Thanks for digging in and profiling!
I didn't intuitively understand the identity (perhaps due to unfamiliarity with the notation) until I thought about it more.
It's basically saying that the intersection (i.e., what's in both A and B) combined with the difference (i.e., what's only in B) is just B.
The code looks good and it passes our tests (including property-based tests), so this seems like a solid change. I did have a nagging worry that some implementation detail would make the two versions not equivalent.
I also had some uneasiness (why was it like that in the first place if it could be so simple?).
However, as far as I can tell, the order of act-ks doesn't matter, since it's diffing maps, which make no guarantees about entry order anyway. The passing tests increased my confidence that the change is safe. @alysbrooks if there's anything additional you think needs to be done, just let me know :)
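The ordering point can be checked at a REPL with a small sketch. The maps `expected` and `actual` and both key-listing expressions below are hypothetical illustrations, not code taken from this PR; they only show that a two-pass key listing (shared keys first, then keys unique to the second map) and a plain `keys` call contain the same elements.

```clojure
;; Hypothetical stand-ins for the two maps being diffed.
(def expected {:x 1 :y 2 :z 3})
(def actual   {:y 20 :z 30 :w 40})

(def expected-ks (keys expected))

;; Two-pass listing: keys shared with `expected` first,
;; then keys that appear only in `actual`.
(def actual-ks-two-pass
  (concat (filter (set (keys actual)) expected-ks)
          (remove (set expected-ks) (keys actual))))

;; Direct listing.
(def actual-ks-direct (keys actual))

;; Same elements, whatever order each listing happens to use.
(def same-keys?
  (= (set actual-ks-two-pass) (set actual-ks-direct)))

;; A map rebuilt from either listing is equal, since map
;; equality ignores entry order.
(def same-map?
  (= (select-keys actual actual-ks-two-pass)
     (select-keys actual actual-ks-direct)))
```

Both `same-keys?` and `same-map?` evaluate to `true` for these inputs. Note that using a set as a predicate, as in `filter` above, would misbehave for `nil` or `false` keys, so this sketch assumes keys are neither.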
I think enough due diligence has been done here. @latacora-paul could you still add a CHANGELOG entry? Then this is ready to go out.
> could you still add a CHANGELOG entry?
done!
I will note that I didn't notice the tests running any faster after this change, but there's a lot of noise in our test timings and I don't think they involve many large maps. I did a quick benchmark using criterium on my computer: the new version was several times faster on a 100-element hash-map and somewhat faster on a small hash-map.
I was seeing some pretty bad performance when performing a large diff and tracked down the primary cause. I'm certain there are other improvements that can be made but I'll keep the change small for now :). This change runs ~15x faster for my use case.
Math identity illustrating why this change is okay:
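In set notation, taking A and B to be the key sets of the two maps being compared (an assumed labeling, consistent with how A and B are used elsewhere in this thread), the identity can be written and justified in one line:

```latex
(A \cap B) \cup (B \setminus A)
  = (B \cap A) \cup (B \cap A^{c})
  = B \cap (A \cup A^{c})
  = B
```

Put differently, every element of B is either also in A or not in A, so the two pieces together cover B exactly.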
Before:
After: