vorner / arc-swap

Support atomic operations on Arc itself
Apache License 2.0

Sharing an `ArcSwap` between multiple threads #38

Closed · NeoLegends closed this issue 3 years ago

NeoLegends commented 3 years ago

Hey,

we've been wondering what would be the best way to share access to an ArcSwap between multiple threads. The examples all seem to use scoped threads and references, but I'm not sure this would be applicable in our case (a concurrent, async web server).

We noticed that, when cloning an ArcSwap, a swap does not propagate to the other cloned instances (see this playground link). Is the correct solution, then, to just wrap the ArcSwap itself in an Arc? We've been wondering whether that would negate the performance benefit of ArcSwap, since the CPUs would be contending on a refcount again, this time when cloning the ArcSwap rather than the inner value.

I, for one, expected this library to work pretty much exactly like what we'd normally use Arc<RwLock<Arc<T>>> for, though I'm sure there's some detail we're missing. I think this is also a case where the docs could be improved: they say at the start that it's supposed to work like that, but to newcomers it seems there are some tiny but very important differences. Examples covering a few more use cases beyond scoped threads could also help new users adopt the library.

Thanks in advance for any guidance!

vorner commented 3 years ago

ArcSwap acts like RwLock<Arc<T>>, not Arc<RwLock<Arc<T>>> (it has only the „inner“ arc, not the outer one). So yes, wrapping it in an Arc, i.e. having Arc<ArcSwap<T>>, is one possible way.
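
For illustration, a minimal sketch of that approach; the `Config` type and its field are just placeholders:

```rust
use std::sync::Arc;
use std::thread;

use arc_swap::ArcSwap;

struct Config {
    timeout_secs: u64,
}

fn main() {
    // The outer Arc is what lets every thread reach the *same* ArcSwap;
    // the ArcSwap then lets them atomically read or replace the inner Arc<Config>.
    let shared = Arc::new(ArcSwap::from_pointee(Config { timeout_secs: 30 }));

    let reader = {
        let shared = Arc::clone(&shared);
        thread::spawn(move || {
            // `load` is a cheap, lock-free read of the current snapshot.
            let cfg = shared.load();
            println!("reader sees timeout = {}", cfg.timeout_secs);
        })
    };

    let writer = {
        let shared = Arc::clone(&shared);
        thread::spawn(move || {
            // `store` atomically publishes a new snapshot; later loads see it,
            // while guards already taken keep the old value alive.
            shared.store(Arc::new(Config { timeout_secs: 60 }));
        })
    };

    reader.join().unwrap();
    writer.join().unwrap();
}
```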

You could also use other means, like having a global instance (using once_cell::sync::Lazy), or something like that.
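
Something like this, for instance (again with a placeholder `Config`):

```rust
use std::sync::Arc;

use arc_swap::ArcSwap;
use once_cell::sync::Lazy;

struct Config {
    timeout_secs: u64,
}

// A process-wide instance; no outer Arc is needed because the static itself
// is reachable from every thread.
static CONFIG: Lazy<ArcSwap<Config>> =
    Lazy::new(|| ArcSwap::from_pointee(Config { timeout_secs: 30 }));

fn current_timeout() -> u64 {
    CONFIG.load().timeout_secs
}

fn reload() {
    CONFIG.store(Arc::new(Config { timeout_secs: 60 }));
}
```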

How often do you expect to clone the outer Arc? I'd say you would want to do it once per connection (not once per request; you can keep it in the per-connection context). And setting up a TCP connection and all that will (a) not happen often enough to cause serious contention on the cache line, and (b) carry much higher overhead than even that contention. So I guess it shouldn't be a problem.
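
Roughly like this in a Tokio-based server, for example (the listener setup and `Config` are only for illustration):

```rust
use std::sync::Arc;

use arc_swap::ArcSwap;
use tokio::net::TcpListener;

struct Config {
    greeting: String,
}

async fn serve(shared: Arc<ArcSwap<Config>>) -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:8080").await?;
    loop {
        let (socket, _) = listener.accept().await?;
        // One clone of the outer Arc per accepted connection; every request
        // handled on this connection reuses it and only pays a cheap `load()`.
        let shared = Arc::clone(&shared);
        tokio::spawn(async move {
            let cfg = shared.load();
            println!("{:?} connected, greeting: {}", socket.peer_addr(), cfg.greeting);
        });
    }
}
```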

One tip, though: don't keep the guard „loaded“ across yield points; otherwise you could run out of the „fast“ slots under high concurrency. Load it only once you have everything ready and want to perform whatever computation you need with it.
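
In async code that means something like the sketch below (names are placeholders). Alternatively, `load_full()` gives you a plain `Arc<T>`, which is fine to hold across `.await`:

```rust
use std::sync::Arc;

use arc_swap::ArcSwap;

struct Config {
    greeting: String,
}

// Stand-in for real async work (a DB call, an upstream request, ...).
async fn some_async_work() {}

async fn handle_request(shared: Arc<ArcSwap<Config>>) -> String {
    // Take what you need out of the guard and let it drop right away,
    // so the per-thread "fast" slot is freed before the `.await`.
    let greeting = shared.load().greeting.clone();

    some_async_work().await;

    greeting
}
```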

Regarding docs and such:

vorner commented 3 years ago

I believe this was more of a question than an issue and it has been answered; closing.