dimo414 / bkt

a subprocess caching utility, available as a command line binary and a Rust library.
https://www.bkt.rs
MIT License

Mechanism for invalidating (parts of) the cache #5

Open dimo414 opened 2 years ago

dimo414 commented 2 years ago

An easy option is to add an --invalidate flag that, rather than running the command or returning its cached output, simply deletes the cache entry associated with the given command, like so:

$ bkt -- date
Sun 21 Nov 2021 11:17:03 AM PST

$ bkt -- date
Sun 21 Nov 2021 11:17:03 AM PST

$ bkt --invalidate -- date

$ bkt -- date
Sun 21 Nov 2021 11:17:13 AM PST

But it might be preferable to support some sort of more powerful invalidation, such as the ability to invalidate all calls to a given command, e.g.:

$ bkt -- date +%T
11:18:19

$ bkt -- date +%T
11:18:19

$ bkt --invalidate=date

$ bkt -- date +%T
11:18:32

However, bkt doesn't currently have any way to introspect the cache like this, short of a linear search over every entry. That might be fine, but otherwise either the key space would need to be redesigned or some secondary index would need to be maintained.
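To make the trade-off concrete, here is a minimal sketch of the two options against a toy in-memory cache. It is not bkt's actual storage format: the `ToyCache` class, its hashed key, and the `by_program` index are all hypothetical names used only for illustration.

```python
import hashlib


class ToyCache:
    """Toy stand-in for a cache keyed by an opaque hash of the full command line."""

    def __init__(self):
        self.entries = {}      # key -> (argv, output)
        self.by_program = {}   # program name -> set of keys (hypothetical secondary index)

    @staticmethod
    def key(argv):
        # The key is opaque: the program name cannot be recovered from it.
        return hashlib.sha256("\0".join(argv).encode()).hexdigest()

    def put(self, argv, output):
        k = self.key(argv)
        self.entries[k] = (tuple(argv), output)
        self.by_program.setdefault(argv[0], set()).add(k)

    def get(self, argv):
        entry = self.entries.get(self.key(argv))
        return entry[1] if entry else None

    def invalidate_scan(self, program):
        # Linear search: inspect the stored argv of every entry.
        stale = [k for k, (argv, _) in self.entries.items() if argv[0] == program]
        for k in stale:
            del self.entries[k]
        self.by_program.pop(program, None)
        return len(stale)

    def invalidate_indexed(self, program):
        # Secondary index: jump straight to the affected keys.
        keys = self.by_program.pop(program, set())
        for k in keys:
            self.entries.pop(k, None)
        return len(keys)
```

The scan needs no extra bookkeeping but touches every entry. The index makes invalidation proportional to the number of matching entries, at the cost of keeping a second structure consistent on every write.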

Also, --scope can be used to invalidate the cache (by switching to a new scope value), as long as all callers participate in using the right scope.

rrx commented 2 years ago

I like the idea of using the scope key for invalidation. It's simple and predictable. I was also thinking of tagging a command run with multiple tags, which would provide some further granularity. Using --scope would be the same as tagging with a single tag.

--tag user=woof,group=cows
--invalidate user=woof
--invalidate group=cows

# if we have multiple tags, it's not clear if it's an AND/OR
--invalidate group=cows,user=woof
--invalidate group=sheep,user=woof

Perhaps --invalidate-or and --invalidate-and, to be explicit; otherwise it's ambiguous. Maybe --invalidate could fail when given multiple tags, with a message suggesting --invalidate-and/--invalidate-or. Just an idea.
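The AND/OR ambiguity can be illustrated with a small sketch. None of this exists in bkt: the `parse_tags` and `invalidate` helpers and the `mode` parameter are hypothetical, standing in for the proposed --invalidate-and and --invalidate-or flags.

```python
def parse_tags(spec):
    """Parse 'user=woof,group=cows' into a set of (name, value) pairs."""
    return {tuple(item.split("=", 1)) for item in spec.split(",") if item}


def invalidate(entries, spec, mode):
    """Return the entries that survive invalidation.

    entries: dict mapping a command string to its set of tags.
    mode: 'or' drops entries matching any requested tag,
          'and' drops only entries matching all requested tags.
    """
    wanted = parse_tags(spec)
    if mode == "or":
        matches = lambda tags: bool(tags & wanted)
    elif mode == "and":
        matches = lambda tags: wanted <= tags
    else:
        raise ValueError("mode must be 'and' or 'or'")
    return {cmd: tags for cmd, tags in entries.items() if not matches(tags)}
```

With entries tagged `user=woof,group=cows` and `user=woof,group=sheep`, invalidating `group=sheep,user=woof` removes both under OR but only the second under AND, which is why an explicit choice seems safer than a silent default.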

I'm glad I found this project, I was planning to write pretty much exactly this, and in Rust.

dimo414 commented 2 years ago

Tags are an interesting idea; would you mind filing a separate FR with some more thoughts on that, e.g. how you might use them (beyond invalidation)?

Invalidation via the existing scoping mechanism is straightforward and easy, but it requires some form of collaboration amongst callers. For example, if multiple callers are caching reads to a database and one caller then makes a cache-invalidating write, that caller would need some way to notify the others to update their scope value. This is achievable (e.g. by storing the scope name in a file all callers share), but not straightforward.
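A rough sketch of the shared-file approach, assuming every caller reads the scope name from the same path before invoking bkt. The file location, the `db-N` naming scheme, and the helper names are all assumptions; only the --scope flag itself comes from bkt, and the `--scope=NAME` spelling is assumed here.

```python
import os
from pathlib import Path


def current_scope(scope_file):
    """Read the scope name that all callers agree on."""
    return Path(scope_file).read_text().strip()


def bump_scope(scope_file):
    """Called after a cache-invalidating write: switch everyone to a new scope."""
    path = Path(scope_file)
    generation = int(current_scope(path).rsplit("-", 1)[1]) + 1
    new_scope = f"db-{generation}"
    tmp = path.with_suffix(".tmp")
    tmp.write_text(new_scope + "\n")
    os.replace(tmp, path)  # atomic swap so readers never see a partial write
    return new_scope


def bkt_args(scope_file, command):
    """Build the argument list a caller would run; nothing is executed here."""
    return ["bkt", f"--scope={current_scope(scope_file)}", "--", *command]
```

Entries cached under the old scope are not deleted, just never looked up again. Two writers bumping the scope at the same moment could also race on the read-then-write, which is part of why this is workable but not straightforward.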

Glad you like the project, it's something I've wanted to implement for a long time!

dimo414 commented 1 year ago

Related: the newly added --modtime flag provides another mechanism for invalidation.

dimo414 commented 1 year ago

Idea: reimplement scopes as a modtime file in the cache directory. Then scope-wide invalidation would simply involve touch-ing that file, avoiding the need for some sort of more bespoke cleanup mechanism. Obviously users can implement this themselves with --modtime, but it'd be nice to abstract it away from them via scopes.
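One way this could look, as a toy model rather than a proposed implementation: each scope gets a marker file in the cache directory, and the marker's modification time is folded into the cache key, so touching the marker makes every key in that scope change. The `.scope` file naming, the key layout, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def scope_marker(cache_dir, scope):
    """Hypothetical layout: one marker file per scope inside the cache directory."""
    return Path(cache_dir) / f"{scope}.scope"


def cache_key(cache_dir, scope, argv):
    """Derive a key from the scope, the marker's mtime, and the command line."""
    marker = scope_marker(cache_dir, scope)
    if not marker.exists():
        marker.touch()  # first use of this scope creates its marker
    mtime = marker.stat().st_mtime_ns
    material = "\0".join([scope, str(mtime), *argv])
    return hashlib.sha256(material.encode()).hexdigest()


def invalidate_scope(cache_dir, scope):
    """Equivalent of `touch`: bump the marker's mtime so old keys stop matching."""
    scope_marker(cache_dir, scope).touch()
```

Old entries become unreachable rather than deleted, so they would still rely on the usual expiry to be cleaned up. Two touches within the filesystem's timestamp granularity would also produce the same key, which is one of the details that needs experimentation.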

This needs a little more thought and experimentation, but if someone's interested in playing around with it this would probably be a reasonable PR to pull together.