HypothesisWorks / hypothesis

Hypothesis is a powerful, flexible, and easy to use library for property-based testing.
https://hypothesis.works
Other
7.39k stars 578 forks source link

Cache fixes #4018

Open jobh opened 1 week ago

jobh commented 1 week ago

Fix some issues with the cache implementation. No actual bugs, only latent issues and cleanup.

[edit]

jobh commented 1 week ago

If you dislike the new pin semantics, a fairly straightforward alternative is to allow "sticky" pinning of unknown keys by storing them as not_set. I.e., it's the key that's pinned, even if it's not set; and the entry will start out pinned if it's added later. I see some beauty in this approach, but it's perhaps a bit unconventional.

Zac-HD commented 1 week ago

Unfortunately it might be a week or two before I have space to reload context and review this properly, but if e.g. @tybug wants to review this I'm happy to confirm and merge.

(also, fantastic that we have coverage again - it's already catching an untested branch 🥳)

tybug commented 6 days ago

will check this in the next few days!

tybug commented 3 days ago

Am I correct in understanding that with the new pin semantics, if we pin a key that has been evicted (and don't pass value), we will get a KeyError? This feels like a bit of a footgun.

I would say your proposed sticky semantics solves these concerns and would actually advocate for that. It makes me a bit nervous in its fanciness but I can't think of a failure case?

Other changes look good 👍. I added one comment where I had to stop and think about a condition, but feel free to remove if you think it's clear without.

jobh commented 2 days ago

Thanks for the review @tybug!

  • old footgun: if a key is evicted before you pin it, and you later re-cache that key, you think it's pinned but the cache doesn't, so if you unpin it you get a surprise error
  • new footgun: if you forget to use the atomic option, or can't because you only have the key, you may get a fatal KeyError under cache pressure

Yes, this is correctly understood, and a nice explanation of the problem.

I pushed a change now to just require the value to be supplied to pin(). It is a bit unsatisfying, being less orthogonal and more repetitive API-wise. But on balance I think it's wiser than the sticky semantics, they do add a little complexity which we don't really need in our internal usage (we always have the value).

(I have implemented the sticky semantics too, so it's no extra work to switch tough)

jobh commented 1 day ago

Note, I added a minor change to always compare sort_key instead of score, and to remove a bad comment and associated no-cover pragma.

jobh commented 1 day ago

Note, I added a minor change to always compare sort_key instead of score, and to remove a bad comment and associated no-cover pragma.

It turned out this was a real bug (heap-property violation), so I spent some more time understanding and fixing the heap balancing. The stateful test now checks the heap property at each step, so I'm fairly confident about the change. Sorry for doing this post-review.