burdges closed this issue 7 years ago.
I'm not sure "time" has any say in this. Either you found a bug (in which case, we should investigate it as such) or your testing doesn't reveal whether the destructor was called but some other related property. To be clear, there is no mechanism in Rust which would intentionally "defer" running the destructor.
This isn't exactly an issue with drops being delayed:

- `ptr::write_volatile` / `write_memory` is broken regardless of the fill value
- `ptr::write_volatile` isn't necessary; `self.0 = [0u8; 32];` exhibits the same issue

I would hazard a guess that this does not affect mutexes at all. Rather, it's something wrong with the write in particular.
IIRC it's undefined behavior to rely on the stack (inline) contents of a value after it is dropped, so the compiler may be hoisting the `unsafe { assert_eq!((*p).0, [0u8; 32]); }` in some cases (not sure why it distinguishes between zero and nonzero here).
For the context of https://github.com/isislovecruft/curve25519-dalek/issues/11, as long as the value is boxed you should be okay. It's not UB to rely on modifications to random non-inline (heap or stack reference) values that may be pointed to by the thing you're dropping. That's how Rc/Arc/Mutex work.
@ubsan @eddyb thoughts?
It works with a boxed value:
```rust
#![feature(core_intrinsics)]

#[derive(Debug)]
pub struct AnotherSecret(*mut [u8; 32]);

impl Drop for AnotherSecret {
    #[inline(never)]
    fn drop(&mut self) {
        println!("dropping");
        // unsafe { ::std::ptr::write_volatile::<AnotherSecret>(self, AnotherSecret([5u8; 32])); }
        // Zero the heap allocation through the pointer, not the inline value:
        unsafe { *self.0 = [0u8; 32] };
        unsafe { assert_eq!(*self.0, [0u8; 32]); }
    }
}

fn main() {
    let p: *const _;
    {
        let mut s = AnotherSecret(Box::into_raw(Box::new([3u8; 32])));
        p = &s;
        ::std::mem::drop(s);
        println!("dropped");
        // unsafe { ::std::intrinsics::drop_in_place(&mut s); }
    }
    // Inspect the heap array after the drop: the zeroing write stuck.
    unsafe { assert_eq!(*(*p).0, [0u8; 32]); }
}
```
(you can remove the prints and it will still work)
Basically, don't expect writes to the inline stack part of a value during a destructor to stick around afterwards.
The volatile thing is definitely weird, but there's of course a much, much better reason to box secrets: moving the wrapper around only moves a pointer, so at any time there is exactly one copy of the secret in memory (at least originating from a value of that type). That's quite worthwhile to box for, IMO.
I was wrong, there's no value sensitivity wrt zero and nonzero. I forgot to remove the assert in the destructor 😄

`write_volatile` and regular `self.0 = foo` stack writes are both poofed regardless of the value.
Looking at the IR for debug mode (https://is.gd/NsBeic; I replaced the assert with a new function to remove noise from the code): https://gist.github.com/Manishearth/f911b210a93c4031b927e3c0485f8174

`slice_loop_next` is where the "bug" comes from. That code is called immediately after the array is initialized (the whole slice-loop stuff).
You'll notice that a pair each of `AnotherSecret` and `[32 x i8]` variables is declared on the stack here. Most are temporaries. This is what @eddyb was talking about: you can't guarantee that multiple copies of stack data won't exist. In release mode they should disappear, but the core issue here won't change.
I initially assumed that the read in the assert was being hoisted, because I thought that `drop` was being inlined. That's not the case, and what's happening is actually much more straightforward: a copy of the stack variable `s` is passed to `drop` (which isn't inlined). You can see that happening here: the IR creates a copy before passing it to `drop`, which it's allowed to do. Moving in Rust makes no guarantees about what happens to the original stack space, which is what I was referring to earlier. Drop operates on the copy, while the original data stays around on the stack. With inlining, drop would not operate on the copy, but that stack space may then be reused by other variables; you see this in release mode with `#[inline(never)]` removed.
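For concreteness, here is a small probe (my own sketch, not from the gist above) that makes the copy visible: in debug builds the address `drop` observes typically differs from the address of the original binding, since `drop` receives a fresh copy of the stack value.

```rust
struct S([u8; 32]);

impl Drop for S {
    #[inline(never)]
    fn drop(&mut self) {
        // This may print a different address than `main` does:
        // `drop` received a copy of `s`, not `s` itself.
        println!("drop sees   {:p}", &self.0);
    }
}

fn main() {
    let s = S([3u8; 32]);
    println!("original at {:p}", &s.0);
    ::std::mem::drop(s);
}
```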
In conclusion, I don't see a Rust bug here; the compiler is relying on the guarantees it assumes of unsafe code, and this code exhibits undefined behavior. Nor do I see any way to improve on this; the way moves work in Rust is pretty fundamental: stuff on the stack will be copied around even if you don't ask it to.
The correct solution for secret-wiping for crypto is to box it.
Interesting. If one wants this on the stack, then one wants a method to limit copies that goes beyond anything that Rust currently provides. Thanks!
There are obviously situations where one needs cryptography but dynamic allocation is unavailable, like, say, HSMs. All that sounds like an issue for another day, though. Worst case, you could spin up a temporary allocator or something.
Did you use `pub struct AnotherSecret(*mut [u8; 32]);` just for this example? Is there any security issue with, say:

```rust
#[derive(Debug)]
pub struct AnotherSecret(Box<[u8; 32]>);

impl Drop for AnotherSecret {
    #[inline(never)]
    fn drop(&mut self) {
        *self.0 = [0u8; 32];
    }
}
```
A reference would work too, but then it's harder to ensure that the original array you borrow from is cleared. It's possible, though: you basically need to ensure that such an array is only ever used to provide clearing slices. You can write a macro that stack-allocates a zeroed array and provides you with a wrapped mutable slice to it, where the wrapper has a clearing dtor (see the sketch below). Done.
For temporary allocators it would use a reference, so that works too. Ideally you'd use an arena-like model where the arena itself is zero-initialized and you borrow slices from it via a method, but this method returns wrapped slices that have a clearing dtor.
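A minimal sketch of that wrapper idea, with names of my own choosing (`ClearingSlice` and the `secret_slice!` macro are illustrative, not from any crate mentioned here):

```rust
use std::ops::{Deref, DerefMut};

/// Mutable slice wrapper whose destructor zeroes the borrowed bytes.
pub struct ClearingSlice<'a>(&'a mut [u8]);

impl<'a> Deref for ClearingSlice<'a> {
    type Target = [u8];
    fn deref(&self) -> &[u8] { self.0 }
}

impl<'a> DerefMut for ClearingSlice<'a> {
    fn deref_mut(&mut self) -> &mut [u8] { self.0 }
}

impl<'a> Drop for ClearingSlice<'a> {
    #[inline(never)]
    fn drop(&mut self) {
        // Writes go through a reference to non-inline data, which per the
        // discussion above is the case where the stores do stick.
        for b in self.0.iter_mut() {
            unsafe { ::std::ptr::write_volatile(b, 0) };
        }
    }
}

/// Stack-allocate a zeroed array and expose it only as a clearing slice.
macro_rules! secret_slice {
    ($name:ident, $len:expr) => {
        let mut buf = [0u8; $len];
        let mut $name = ClearingSlice(&mut buf);
    };
}

fn main() {
    secret_slice!(key, 32);
    key[0] = 0x42; // use the secret...
    // ...and `key`'s dtor zeroes the backing array at end of scope.
}
```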
Yeah, I edited my original comment because I realized the issues with references. Ideally, one allocates cryptographic material in a heap protected by `mlock`, including ensuring no other `mlock`ed pages intersect it.
Afaik, there is nothing besides `mlock` wrong with `pub struct AnotherSecret(Box<[u8; 32]>);` though, but I'm not 100% sure I understand these:
```rust
impl<T: ?Sized> Deref for Box<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &**self
    }
}

impl<T: ?Sized> DerefMut for Box<T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut **self
    }
}
```
I suppose this inner `*` comes because `Deref` for `Unique<T>` returns a reference to a pointer, while the outer `&*` must be a type conversion from the `*mut T` to a `T` and then returning its address. It's just a fancy way to return the address, but somehow it converts a `*mut T` to a `&T` or `&mut T` without needing `unsafe`.
This is because `Box<T>` is special. `*b` on a box does not call `Deref::deref` (else these impls would be recursing infinitely: `**self` is `**(&Box<T>)`, which is `*Box<T>`). It instead does an operation defined in the compiler. You want `Deref` to be implemented so that folks can still use box with `T: Deref`.
The reason box is special is mostly a holdover from the days when Box was a builtin type, `~T`. The compiler had a separate type for it then and still does. I suspect that changing the struct definition of `Box<T>` won't actually affect anything. As far as the user is concerned, Box is still special in two ways: one is that it can be used with the unstable `box` (placement box / box pattern) syntax, and the second is that you can move out of the deref. `let foo = *b` will work, moving the contents out of the box and deallocating it (but not calling destructors on its contents). No `Deref` behavior on any other type works like this, because `Deref` returns a reference. We probably eventually want to create a `DerefMove` trait and remove this special casing, but for now it just works this way. Because of this, the deref operator on a box goes through the compiler.
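A quick illustration of that deref-move special case (my own example):

```rust
fn main() {
    let b = Box::new([3u8; 32]);
    // Only Box supports this move out of a deref: `foo` now owns the
    // array, and the heap allocation is freed without its contents
    // being dropped.
    let foo: [u8; 32] = *b;
    assert_eq!(foo[0], 3);
}
```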
There's no fancy type conversion involved, it's just funky internals. `pub struct AnotherSecret(Box<[u8; 32]>);` will work fine as long as you impl a clearing drop on `AnotherSecret`.
Ok thank you! I also worried that `Box` is a wrapper on a `Unique<T>`, but maybe secrets should be a `Shared<T>` or `*mut T` for some reason. After some thought, I suppose a `Shared<T>` is like a `*const T` with `Copy`, which sounds undesirable, while `Unique<T>` is a `*mut T` with `Send`/`Sync`, which sounds fine.
Ideally, one should probably update arenas to the new allocator traits or something. I've posted a quick and dirty crate to do this the cheap way discussed here, however: https://github.com/burdges/zerodrop-rs
`Shared<T>` is for usage in `Rc<T>`-like types; `Unique` is for box-like types. The differences lie in the implementation of `Send`/`Sync`. These types also provide ownership relations for dropck soundness, and for calculating variance.
You can just use Box here.
The allocators stuff is still unstable, you probably don't want to rely on it.

Btw, for zerodrop-rs, just use `Box::new` instead of `box` and you won't need `#![feature(box_syntax)]`, so it will work on stable.
Also, please don't call `drop_in_place(self)` to run destructors on the inner type. Your drop implementation should be something like:

```rust
/// Zero a `ZeroDrop<T>` when dropped.
impl<T> Drop for ZeroDropDrop<T> where T: Drop + Default {
    #[inline(never)]
    fn drop(&mut self) {
        drop(mem::replace(&mut *self.0, Default::default()));
    }
}
```
Yes, I've too much to do to worry about allocators anytime soon. ;)
I used `box` because I did not want it copying the value to the stack, which `Box::new` seemingly requires. It seems tricky to create a `Box` without touching the stack, actually. If `T: Copy + Sized` then I can create it from uninitialized data and call `clone_from`, but I'm surprised there is no `Box::new_clone(&T)` method.
In fact, I'll need to replace `Default::default` for the `T: Copy` variants with unsafe zeroing, because Rust still lacks `Default` impls for `[u8; n]` with n > 32 and users cannot add those impls themselves.
I'll look into `mem::replace` and try to figure out if it hits the stack, thanks!
You can do `Box::new(mem::uninitialized())` and then manually copy it over if you want to avoid the stack copy. The thing that `box` gets you that can't be replicated is the ability to set the place in which the return value of the expression given to it is written, avoiding an extra copy in the case of a large struct.
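A sketch of that trick under period-appropriate APIs (`new_boxed_clone` is a hypothetical helper name, and `mem::uninitialized` has since been deprecated in favor of `MaybeUninit`):

```rust
use std::{mem, ptr};

/// Hypothetical helper: copy a secret into a fresh Box without the
/// initialized bytes ever crossing the stack.
fn new_boxed_clone(src: &[u8; 32]) -> Box<[u8; 32]> {
    // Only uninitialized bytes pass through the stack here, not the secret.
    let mut b: Box<[u8; 32]> = Box::new(unsafe { mem::uninitialized() });
    // Copy the secret directly from its source into the heap.
    unsafe { ptr::copy_nonoverlapping(src, &mut *b, 1) };
    b
}

fn main() {
    let secret = [7u8; 32];
    let boxed = new_boxed_clone(&secret);
    assert_eq!(*boxed, secret);
}
```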
`drop_in_place(self)` is completely wrong, btw, since you end up dropping the type twice: destructors will be called on the box and its containing type right after your `Drop::drop` impl. `drop_in_place(mem::replace(&mut self.0, Default::default()))` is better, but probably unnecessary. It might be better to define a trait `Clear` with a `clear(&mut self)` function that clears the contents, bound your types on that, and use it.
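A minimal sketch of that suggestion, assuming a hypothetical `ZeroBox` wrapper (only the `Clear`/`clear` naming comes from the comment above; the rest is mine):

```rust
/// Types that can overwrite their contents with an innocuous value.
pub trait Clear {
    fn clear(&mut self);
}

impl Clear for [u8; 32] {
    fn clear(&mut self) {
        *self = [0u8; 32];
    }
}

/// Boxed secret that clears, rather than merely drops, its contents.
pub struct ZeroBox<T: Clear>(Box<T>);

impl<T: Clear> Drop for ZeroBox<T> {
    #[inline(never)]
    fn drop(&mut self) {
        // Clear through the box; the subsequent drop of the Box then
        // frees already-zeroed memory.
        self.0.clear();
    }
}

fn main() {
    let secret = ZeroBox(Box::new([9u8; 32]));
    drop(secret); // contents are zeroed before deallocation
}
```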
Appears `mem::replace` might place data onto the stack, so that's probably out as written. In fact, I should probably write:
```rust
fn drop(&mut self) {
    let s: &mut T = self.0.deref_mut();
    unsafe {
        ::std::ptr::drop_in_place::<T>(s);
        ::std::ptr::write::<T>(s, Default::default());
    }
}
```
so there is still a second inner drop, but only dropping the value returned by `Default::default()`, and no third drop created by, say, `*s = Default::default()`. At present, all the `Drop` types I've considered need special implementations anyways, so maybe `ZeroDropDrop` makes no sense.
Yeah, `drop_in_place` used like that is fine.
I've found that `mem::drop` does not necessarily run anyplace near where it gets called, which likely results in `Mutex` or `RwLock` guards being held during expensive computations.

As a simple example, I've made the following test for a zeroing drop of cryptographic material work by using `unsafe { ::std::intrinsics::drop_in_place(&mut s); }` instead of `::std::mem::drop(s)`.

This test fails if I use `::std::mem::drop(s)` or even a hand-written `drop_now` function.

At first, I imagined this was due to CPU pipelining, except this test still fails if I uncomment the sleep line or the 10k invocations of SHA3, or use an atomic fence, so the compiler looks like the guilty party.

There are two obvious scenarios where this seems problematic:

First, these zeroing drop calls should run eventually, but the longer the secret remains on the stack, the greater the risk it gets compromised via side channel attacks, like being swapped to disk.

Second, we might run an expensive computation while holding a `Mutex` or `RwLock` guard that unlocks when dropped. We might even run code needing additional locks, thereby risking deadlocks.

In both cases, one could fix the problem by calling `drop_in_place`, but I'd think that creates use-after-free risks, although maybe it works if we do something fancier (see the sketch below).

Is there any safe way to ensure a drop happens before some expensive computation, system call, etc.? I suppose returning from a `fn` usually does so, but my `drop_now` proves this sometimes fails.
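One shape the "fancier" variant could take (a hedged guess; the name `drop_then_forget` and the approach are mine, not recovered from this issue): run the destructor in place, then forget the value so it cannot be dropped a second time.

```rust
use std::{mem, ptr};

/// Hypothetical sketch: force the destructor to run here, then forget
/// the value so its destructor does not run again at end of scope.
pub fn drop_then_forget<T>(mut t: T) {
    unsafe { ptr::drop_in_place(&mut t) };
    mem::forget(t);
}
```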