rust-osdev / spinning_top

A simple spinlock crate based on the abstractions provided by the `lock_api` crate.
Apache License 2.0
37 stars 4 forks source link

Reviews appreciated! #2

Closed phil-opp closed 4 years ago

phil-opp commented 4 years ago

This crate is quite new and hasn't been reviewed by anyone yet. I would appreciate any reviews on the code especially on the unsafe implementation of the RawMutex trait:

https://github.com/rust-osdev/spinning_top/blob/5e78baa35e2a7041b68ffa95796664ff722beb64/src/spinlock.rs#L29-L57

Amanieu commented 4 years ago

It looks fine to me. The code is fairly straightforward and the unsafe is satisfied since your implementation does indeed provide mutual exclusion.

phil-opp commented 4 years ago

@Amanieu Thanks a lot for the review!

gendx commented 4 years ago

The use of Ordering::Relaxed in two places doesn't seem straightforward to me. It would be nice to document better why this is correct e.g. with supporting references to well-established code or papers.

phil-opp commented 4 years ago

@gendx Thanks for the review!

The compare_exchange code was taken from https://github.com/Amanieu/parking_lot/blob/fa294cd677936bf365afa0497039953b10c722f5/lock_api/src/lib.rs#L49-L53. See https://github.com/Amanieu/parking_lot/pull/207#issuecomment-575869107 for the reason why compare_exchange is preferable to compare_and_swap.

See the C++ reference for how the relaxed ordering behaves: https://en.cppreference.com/w/cpp/atomic/memory_order#Relaxed_ordering. As far as I understand it, the compiler is not allowed to move the load operation outside of the loop because an atomic can be changed by other threads so that this would lead to different semantics.

The second ordering parameter passed to compare_exchange describes the ordering when the operation fails (see https://doc.rust-lang.org/stable/core/sync/atomic/struct.AtomicBool.html#method.compare_exchange). We don't need any synchrnonization in this case because we do nothing critical with the data when we fail to acquire the lock.

(This is how I understand it. I'm happy to discuss things further if you like!)

gendx commented 4 years ago

Thanks for the explanation!

I think it would make sense to document these explanations in the code, so that it's easier for anyone to check them.

Also, given that the code is simple enough, would it make sense to stress-test it with https://github.com/tokio-rs/loom to detect potential flaws?

phil-opp commented 4 years ago

I think it would make sense to document these explanations in the code, so that it's easier for anyone to check them.

Will do!

Also, given that the code is simple enough, would it make sense to stress-test it with https://github.com/tokio-rs/loom to detect potential flaws?

Thanks for the suggestion! I'm trying to set it up right now, but it is a bit difficult because the AtomicBool::new wrapper of loom is not const.

phil-opp commented 4 years ago

I added a simple loom test in the loom_test branch: https://github.com/rust-osdev/spinning_top/compare/loom_test

It works fine for THREADS = 2: https://github.com/rust-osdev/spinning_top/runs/455050962?check_suite_focus=true. However, for THREADS = 3 it either runs for a very long time or there is a bug somewhere. I started the test multiple minutes ago on my laptop and there is still no result…

phil-opp commented 4 years ago

I added some comments on the compare_exchange and the relaxed orderings in 8d31d656011e0f40acda80d484489c014f95fd48.

gendx commented 4 years ago

I added a simple loom test in the loom_test branch: https://github.com/rust-osdev/spinning_top/compare/loom_test

Sounds amazing! I don't have prior experience with loom, just heard about it but it's really nice that spinning_top can benefit from it :)

It works fine for THREADS = 2: https://github.com/rust-osdev/spinning_top/runs/455050962?check_suite_focus=true. However, for THREADS = 3 it either runs for a very long time or there is a bug somewhere. I started the test multiple minutes ago on my laptop and there is still no result…

I wonder whether it's due to the inherent complexity of concurrent operations (loom mentions that there can be combinatorial explosion), or if it's a genuine flaw where Relaxed could in principle lead to infinite spinning as I mentioned above. What happens if you put SeqCst for all orderings?

phil-opp commented 4 years ago

Sounds amazing! I don't have prior experience with loom, just heard about it but it's really nice that spinning_top can benefit from it :)

It's the first time I heard of it too, but it looks like a promising project. I don't think that we should merge the loom_test branch into master anyway because it makes the code much more complicated.

I wonder whether it's due to the inherent complexity of concurrent operations (loom mentions that there can be combinatorial explosion), or if it's a genuine flaw where Relaxed could in principle lead to infinite spinning as I mentioned above. What happens if you put SeqCst for all orderings?

No change. Still works fine with 1 or 2 threads, but hangs with 3 threads. Interestingly, more than 3 threads (4, 5, 10, …) lead to an immediate "SIGILL: illegal instruction". So I think either the combinatorial explosion causes this or there is some bug in my code or loom itself.

gendx commented 4 years ago

While digging deeper I found https://github.com/tokio-rs/loom/issues/53 which might be the cause of the issue with 3 threads. It seems that loom is still quite experimental indeed, which may cause the issue for more threads.

All in all I agree that keeping the loom_test branch separate makes sense for now, but at least keeping an eye on it can help bring confidence in spinning_top's implementation.

phil-opp commented 4 years ago

This issue indeed seems to be the cause of the problem.

All in all I agree that keeping the loom_test branch separate makes sense for now, but at least keeping an eye on it can help bring confidence in spinning_top's implementation.

Agreed! I plan to keep the branch around and update it from time to time.

ghost commented 4 years ago

The memory orderings all look sufficient and optimal. If RawMutex is used correctly to create critical sections, the load-acquire in a successful try_lock couples with the store-release in unlock and should ensure all operations within one critical section are visible to the next critical section. The load-relaxed optimization in the lock loop looks correct since it doesn't need to provide any ordering guarantees, and the load-relaxed in the failure case of try_lock should be correct since it doesn't provide synchronization in the failure case.

phil-opp commented 4 years ago

@akiekintveld Thanks a lot for the review!

gendx commented 4 years ago

Regarding crowdsourcing of code reviews, I recently learned about the cargo-crev project, which looks like a good way for anyone to add their review on a Rust crate (see the getting started), and for anyone to look at other people's reviews (see the dashboard). A simple but fundamental crate like spinning_top looks like a good candidate to review there :)

phil-opp commented 4 years ago

Thanks for the feedback everyone!