vorner / arc-swap

Support atomic operations on Arc itself
Apache License 2.0
778 stars 31 forks source link

Document clone behavior of `ArcSwapAny` #24

Closed dbrgn closed 4 years ago

dbrgn commented 4 years ago

I'm not sure if this should be self-evident, but I struggled for a few hours to make my code correct, until I realized that ArcSwap itself does not behave like an Arc and that it will copy the state when cloning. (I'm not sure where this idea came from, probably because it has Arc in its name and because it wraps an Arc, so I assumed the clone semantics to be like Arc.)

If you don't think this change is needed, feel free to close the PR!

dbrgn commented 4 years ago

Code that made me realize my mistake:

println!("Plain cloned");
let w1 = ArcSwap::from_pointee(0usize);
println!("w1 points to: {:?}", Arc::into_raw(w1.load().clone()));
println!("w1 points to: {:?}", Arc::into_raw(w1.load().clone()));
let w2 = w1.clone();
println!("w2 points to: {:?}", Arc::into_raw(w2.load().clone()));
println!("w2 points to: {:?}", Arc::into_raw(w2.load().clone()));
println!("Swap");
w1.swap(Arc::new(1usize));
println!("w1 points to: {:?}", Arc::into_raw(w1.load().clone()));
println!("w2 points to: {:?}", Arc::into_raw(w2.load().clone()));
println!();

println!("Arc cloned");
let w1 = Arc::new(ArcSwap::from_pointee(0usize));
println!("w1 points to: {:?}", Arc::into_raw(w1.load().clone()));
println!("w1 points to: {:?}", Arc::into_raw(w1.load().clone()));
let w2 = w1.clone();
println!("w2 points to: {:?}", Arc::into_raw(w2.load().clone()));
println!("w2 points to: {:?}", Arc::into_raw(w2.load().clone()));
println!("Swap");
w1.swap(Arc::new(1usize));
println!("w1 points to: {:?}", Arc::into_raw(w1.load().clone()));
println!("w2 points to: {:?}", Arc::into_raw(w2.load().clone()));

This prints:

Plain cloned
w1 points to: 0x55841c260cb0
w1 points to: 0x55841c260cb0
w2 points to: 0x55841c260cb0
w2 points to: 0x55841c260cb0
Swap
w1 points to: 0x55841c260e00
w2 points to: 0x55841c260cb0

Arc cloned
w1 points to: 0x55841c260e20
w1 points to: 0x55841c260e20
w2 points to: 0x55841c260e20
w2 points to: 0x55841c260e20
Swap
w1 points to: 0x55841c260e60
w2 points to: 0x55841c260e60
vorner commented 4 years ago

Hello

Improving documentation is indeed nice thing to do. But I think this doesn't really fit into this content. The intention of that paragraph is that you can't use it „directly“. Eg. if you have Arc<i32>, you can do *a = 42, but you can't do that with ArcSwap. So if this is needed, it probably belongs somewhere else.

Nevertheless, I think this does act in a similar way as Arc. Eg. similar code would be something like:

let w1 = Arc::new(42);
let mut w2 = w1.clone();
println!(...)
let w2 = Arc::new(53);
println!(...)

The second part is somewhat similar to doing Arc<Arc<...>>. Can you explain somehow better where the difference or the surprise comes from?

dbrgn commented 4 years ago

Hm, I'm not sure whether I understand your comment correctly. I'll try to explain what confused me. I hope I don't mix up any terminology.

Can you explain somehow better where the difference or the surprise comes from?

An Arc<T> has pointer semantics. So if I have an Arc that points to an object at memory address 0x01 and I clone that, I'll have a second Arc that points to the same memory address 0x01. If the second Arc changes the value behind that pointer, this will be reflected in the first Arc as well, when accessing the contained value. Example: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=0b2ed5679df7ffc96ce6fabba2e3f4f9


use std::sync::{Arc, Mutex};

fn main() {
    let a1 = Arc::new(Mutex::new(1));
    let a2 = a1.clone();
    assert_eq!(*a1.lock().unwrap(), 1);
    assert_eq!(*a2.lock().unwrap(), 1);
    *a1.lock().unwrap() = 2; // Modify contents
    assert_eq!(*a1.lock().unwrap(), 2);
    assert_eq!(*a2.lock().unwrap(), 2);
}

However, with ArcSwap, cloning an instance will not result in "just another pointer to the same memory address". It will result in second ArcSwap that happens to hold a copy of the same Arc. But if you swap out the contained Arc with a new Arc, then the original ArcSwap will not have changed.

use std::sync::{Arc, Mutex};

use arc_swap::ArcSwap;

fn main() {
    // Create an ArcSwap containing a Mutex
    let a1 = ArcSwap::new(Arc::new(Mutex::new(1)));

    // Create a clone
    let a2 = a1.clone();

    // Both contain the same data...
    assert_eq!(*a1.load().lock().unwrap(), 1);
    assert_eq!(*a2.load().lock().unwrap(), 1);

    // Modify data in a1
    *a1.load().lock().unwrap() = 2;

    // Both a1 and a2 point to the same `Arc`,
    // so the value will have changed in both cases
    assert_eq!(*a1.load().lock().unwrap(), 2);
    assert_eq!(*a2.load().lock().unwrap(), 2);

    // Now, swap data in a2
    a2.swap(Arc::new(Mutex::new(3)));
    assert_eq!(*a2.load().lock().unwrap(), 3);

    // However, a1 still points to the old Arc
    assert_eq!(*a1.load().lock().unwrap(), 2);
}

If you (erroneously) think of ArcSwap as a pointer, then this is very confusing, because cloning creates a new, independent copy of the contained value.

But of course ArcSwap is just a holder for the pointer (a value type), and calling swap will replace the Arc with a new, independent Arc without affecting other ArcSwap instances.

In my case, I probably achieved at the wrong metal model because the ArcSwap has Arc in its name, because from_pointee hides the Arc on creation, and because cloning the ArcSwap will result in another instance that initially holds another pointer to the same contained value (so modifying one of them will affect the other instance).

It all makes sense of course when thinking about it a bit more, but maybe the documentation can be improved in a way that makes this misunderstanding a bit less likely :slightly_smiling_face: If you don't think this is necessary, feel free to close this PR.

In any case, arc-swap is a wonderful and very useful library, thanks for maintaining it!

vorner commented 4 years ago

I think I see where you are coming from, I still think that ArcSwap does act consistent with the very first claim in The library provides a type that is somewhat similar to what RwLock<Arc<T>> is or Atomic<Arc<T>> would be if it existed.

I'm not against improving the documentation. I just don't think your changes would really help (I'm afraid this could confuse people more than help them and honestly, when in the state of confusion you describe, this one sentence would be easy to overlook).

Maybe can you come with some example for the documentation instead? Something in the lines of what you do above, but maybe with something easier to follow (it's easy to either get lost in the numbers and pointers or just skim it as not important, so it would be great if it had some actual „meaning“ or resembled a real use… something for the brain to catch on while not really going into the deep thinking mode 😇).

dbrgn commented 4 years ago

I'm not against improving the documentation. I just don't think your changes would really help

Sure, no problem! I'll close this for now, and if I can come up with a better explanation I can open a new PR.

jeromegn commented 4 years ago

For what it's worth, I had the same mental model at first. Then I ended up wrapping my ArcSwap in an Arc. It works, but I feel like I'm not using this crate right. I think I'm probably looking for a different crate that provides a type similar to Arc<RwLock<T>> (or just do that manually, really.)

vorner commented 4 years ago

Hmm, that's two people already confused by that. I've tried to put something together in #33. Can you have a look if it would have helped?

Alternatively, it seems RwLock can't be cloned at all, so the ability to clone could be removed in the next breaking version ‒ and one would be forced to either wrap it in Arc or create a new ArcSwap explicitly, which would probably prevent the confusion. But I wonder if it might limit the usage somehow ‒ hopefully not.

What do you think?

jeromegn commented 4 years ago

I think the comment you added in that PR would have helped me. I looked through the docs for any mention of that (as well as issues, that's how I found this issue) but couldn't find any. Cloning is often very opaque, you don't always know if it's going to be cheap or expensive and if you're going to be referencing the same data or if it's going to be a "copy" of the data. Because of that, I think documenting clone's behavior for a struct is always helpful!

It still feels weird to wrap the ArcSwap in an Arc. In practice, let foo = Arc::new(ArcSwap::new(Arc::new(bar))) looks redundant. That's the "smell" that made me wonder if I was using the right pattern. I feel like I could achieve the same with:

let foo = Arc::new(RwLock::new(0));
*foo.write() = 1;

Is ArcSwap more performant than a parking_lot RwLock? If so, then should I replace most of my Arc<RwLock<T>> with Arc<ArcSwap<T>> if they don't write too much? I have a "config" I'm swapping (writing) at most every 1 second, but I'm loading (reading from) it thousands of times per second. The docs specify this use case exactly (but doesn't mention read/write frequency), so I figure this is "right". I'm wrapping it in a State struct that is essentially a "view" of the config. Since it uses an ArcSwap, my code never needs mut access to it, I suppose that would make it a good candidate for a global which I wouldn't have to clone (I'm currently passing it to various functions).

I do like that ArcSwap appears to work well crossing async/await boundaries (compared to parking_lot RwLock).

dbrgn commented 4 years ago

Can you have a look if it would have helped?

Yes, those new docs would certainly have helped!

vorner commented 4 years ago

Thanks for the fast answer, I'll merge and piggy back the change with the next release.

Is ArcSwap more performant than a parking_lot RwLock? If so, then should I replace most of my Arc<RwLock> with Arc<ArcSwap> if they don't write too much?

Yes, it is faster (somewhat). The repository contains benchmarks (see the background.rs benchmarks), so you can run them yourself. But it's more like RwLock<Arc<T>>, not RwLock<T>. Besides, the example you show would be better with AtomicUsize.

Anyway, the point of ArcSwap is not just the performance, it's not blocking the readers, ever. When you hold write lock on RwLock, readers have to wait their turn. I think your use case is quite in line what ArcSwap is good with, so you might want to give it a try.

jeromegn commented 4 years ago

But it's more like RwLock<Arc>, not RwLock

Right, but the way I'm using it, it has a behavior closer to Arc<RwLock<T>>. Except, I know, the value is wrapped with an Arc and allocated on the heap.

Besides, the example you show would be better with AtomicUsize.

Yes of course, just quick code to explain what I meant.

the point of ArcSwap is not just the performance, it's not blocking the readers, ever.

Ah, so I'd want to use a RwLock if I need to block readers when I'm updating my inner value. That makes sense.

Reading the docs further helped me understand the performance characteristics of ArcSwap (and I learned about Cache, which might be useful to us).

Thanks!

vorner commented 4 years ago

Ah, so I'd want to use a RwLock if I need to block readers when I'm updating my inner value. That makes sense.

I think that the blocking is more like a side effect/downside of RwLock than a feature, but yes. The idea here is that you create a completely new value behind a new allocation, then switch the pointer ‒ and new reads get the new values, older reads just keep the old value and once they are done, the old value dies.

jeromegn commented 4 years ago

IMO, it's sometimes a feature and both have their use. Sometimes I want to prevent any reader from getting a new value until my writer is done (in highly consistent scenarios).

This is not the case for my config, I'll happily use ArcSwap! I'll try to find other places where I could be using one too.