RwLock::read() should be reentrant so its behavior will be the same as borrow checker

rust-lang / rust

Empowering everyone to build reliable and efficient software.

https://www.rust-lang.org

Other

96.92k stars 12.53k forks source link

RwLock::read() should be reentrant so its behavior will be the same as borrow checker #114770

Open ultimaweapon opened 1 year ago

ultimaweapon commented 1 year ago

Let's say I have the following code:

struct Foo {
    items: RwLock<Vec<Item>>,
}

impl Foo {
    pub fn bar(&self) {
        for i in self.items.read().unwrap().deref() {
            self.some_method_that_will_endup_calling_baz(i);
        }
    }

    fn baz(&self, id: u32) {
        self.items.read().unwrap().iter().find(|i| i.id() == id);
    }
}

If someone call bar it may cause a panic in baz because the items is already locked by bar. The above code will be working if items is not putting behind RwLock but I will not be able to push the item into items. If I lock the Foo instead of its field it will solve this problem but it will cumbersome to work with Foo if most of it fields is immutable.

sunmy2019 commented 1 year ago

RwLock::read() is indeed re-entrantable multiple reads.

See https://doc.rust-lang.org/std/sync/struct.RwLock.html#examples

ultimaweapon commented 1 year ago

I did not noticed the example on RwLock show it is re-entrantable. The problem is why the documentation on read method tell there might be a panic if the current thread already locked.

sunmy2019 commented 1 year ago

The problem is why the documentation on read method tell there might be a panic if the current thread already locked.

I guess it is copied from https://doc.rust-lang.org/std/sync/struct.Mutex.html#panics

and forgot to change.

joboet commented 1 year ago

RwLock is not always reentrant, because it is writer-preferring on many platforms. Consider the following order of operations:

Thread 1	Thread 2
read(lock)
	write(lock)
read(lock)

This is a deadlock, as the write attempt is ordered before the second read attempt, so the read will block on its completion; but the write attempt is blocked on the release of the first read-lock, which will not occur since thread 1 is blocked.

The choice of writer-preference was made as it simplifies implementations and leads to much better write latency when having many readers and few writers (this should be very common). Also, at least on Windows, std uses SRWLOCK internally, which mandates this.

ultimaweapon commented 1 year ago

The second lock on the Thread 1 don't need to acquire the lock again because the current thread already hold on the lock. It should be simply to return a reference to the value without trying to acquire the lock.

bjorn3 commented 1 year ago

It needs to increase the read lock count to ensure it doesn't get unlocked before all read lock guards are gone. And on Windows this read lock count increasing will block waiting on any writers as SRWLOCK is writer preferring.

m-ou-se commented 6 months ago

The second lock on the Thread 1 don't need to acquire the lock again because the current thread already hold on the lock. It should be simply to return a reference to the value without trying to acquire the lock.

That means we'd need to keep track of which threads have read-locked the rwlock, instead of just the total number of readers. That's not something we track today. We'd need to keep a lot of additional data to do that, resulting in a lot of overhead.

ultimaweapon commented 6 months ago

Could you elaborate what additional data that need to keep? AFAIK it should be only one TLS field that need to be added to RwLock to tell how many read-locked for the calling thread. If this counter is non-zero we can just increase it and return the lock guard without trying to acquire the underlying lock.

m-ou-se commented 6 months ago

Accessing a thread local variable is a significant amount of overhead compared to just a single atomic operation to lock/unlock. Also, creating a new thread local variable for every RwLock can also be a significant amount of overhead, depending on the TLS implementation.