karlseguin / ccache

A golang LRU Cache for high concurrency
MIT License
1.29k stars 120 forks source link

add SyncUpdates method to synchronize with worker thread, and use it in tests #60

Closed eli-darkly closed 3 years ago

eli-darkly commented 3 years ago

This is an attempt to address the test synchronization issue that was mentioned in this comment in a way that 1. doesn't rely on arbitrary sleeps and 2. can also be used by application test code. What I mean by 2 is: suppose an application wants to verify its own behavior in cases where items have been dropped from the cache— it won't be able to do that until it knows that the cache has caught up with updates, so currently it would have to use the same unreliable time.Sleep technique that the ccache tests are using now.

My approach was to add a SyncUpdates method to both Cache and LayeredCache. The semantics of this are that if a goroutine does a bunch of Get/Set calls and then calls SyncUpdates, that method doesn't return until the worker has consumed any items that existed in the promotables/deletables channels up to that point. It does this simply by doing non-blocking reads on those channels until they are at least momentarily empty (so, if other goroutines are in the middle of putting in items as well, those might or might not get picked up by SyncUpdates). That happens on the worker goroutine, which then reports completion to the calling goroutine via the same kind of mechanism that's already being used for Clear().

Another kind of update that can happen asynchronously is that if you reduce the cache size with SetMaxSize, excess items will be dropped by the worker goroutine. SyncUpdates also works as a way to wait on that result, because the message that tells the worker to change the cache size goes through the same control channel that the SyncUpdates message goes through.

I replaced all of the time.Sleep calls, in tests that were waiting for the worker to catch up, with cache.SyncUpdates() and I've been ~almost~ unable to reproduce test failures since then. ~There is one place where I do get intermittent failures, which probably means I've misunderstood something about the intended behavior— see PR comments.~

karlseguin commented 3 years ago

I'll review this later, but I like the idea. I actually think it might work well with my new Stop approach (see our original discussion and the linked commit there). The "new" Stop should call SyncUpdates to reduce the chance of killing the worker while work is pending (and thus blocking anything).

eli-darkly commented 3 years ago

Something I forgot to say is that if there were only a single channel that handled promotions and deletions and control messages, the implementation of SyncUpdates would be quite a bit simpler because handling the message wouldn't require doing anything at all, other than sending the "done" message back— the serialization of all the things over the channel would in itself guarantee that all the previous work had been done, and it would also impose an overall ordering between updates and deletes. You may have already considered that and decided against it (for instance, because distinguishing between promotions and deletions would require two types passed as interface{}, rather than just *Item, and that's less efficient) but I just wanted to mention it.