Add load_if_changed to Cache to return a value only if it has changed

vorner / arc-swap

Support atomic operations on Arc itself

Apache License 2.0

746 stars 31 forks source link

Add load_if_changed to Cache to return a value only if it has changed #91

Open elichai opened 1 year ago

elichai commented 1 year ago

This is required when you need to know if you received a new value or an old one you've already seen

vorner commented 1 year ago

Hello

I'm not saying no, but can you explain the use case of this? Conceptually, a Cache is for having an up to date value. Why would you need to know it has changed?

elichai commented 1 year ago

@vorner The use case is that I have a program with a bunch of workers that all work on the same task (highly parallelizable tasks) and I use this to swap the task if a new task comes in, the workers need the actual T so they can mutate it while working on it, not Arc<T> so if the Arc<T> has changed I'll replace the current working task with a clone of the new T in the new Arc<T>

In other words, it looks something like that:

let mut task = watcher.get_task();
loop {
    task.id = ...;
    // work
    if let Some(new_task) = watcher.load_if_changed() {
        task = (*new_task).clone();
    }
}

vorner commented 1 year ago

The fact you need to clone the thing inside the Arc kind of suggests arc-swap is the wrong tool for the job, to be honest. The Arc is actually getting in your way there a little bit (or at least not serving any actual purpose).

What are your performance characteristics? Is the single iteration of the work so short that the performance of the synchronization primitive actually matters? If not, RwLock<T> (or ShardedLock from crossbeam) might do the thing, or having a channel into each worker and have the sender clone the things for them (try_recv on it in the worker), or maybe do something with crossbeam::epoch. Using Arc-Swap's Cache to not use the Arc seems a bit… weird 😇

elichai commented 1 year ago

Basically what I'm trying to emulate here is a very fast watch channel, I have workers that work on a very hot short tight loop, and after every few iterations they want to make sure they're working on the latest job, and the workers should never "wait" for a job. So with Arc-swap I'm using a sort of atomic CoW where each worker clones the data and the data is updated by a pointer atomic swap (in a GC language this is trivially implementable, with rust it requires something like arc-swap to make sure there are no free after free or use after free)

vorner commented 1 year ago

I think I understand what you do and somewhat of why you want to do it this way and why such API would help your case.

On the other hand, I'm still reluctant because there seem to be a lot of „this feels wrong on the design level“ smell… both for the API and the requirements… I wouldn't really feel very happy about leading other people towards similar solutions.

From your description, the right tool for your job seems to be the crossbeam::epoch, or an alternative, use something like „double-cache“ here. What I mean is:

Use the cache as it is, but add some kind of generation field to the job (incremented with each new job stored / wrapping_add(1)).
Check the generation of the job and clone as necessary. Something like:

let mut job_copy = ...;
loop {
    let central_job = cache.load();
    if central_job.generation != job_copy.generation {
        job_copy = central_job.clone();
    }
}