Scrub keystream state from memory on garbage collection

kernelmethod commented 2 years ago

After a ChaChaStream or CUDAChaChaStream gets garbage collected, its state can still be floating around somewhere in memory. That state contains the key as well as any unconsumed keystream bytes that are still in the buffer (which may need to be used later on if the keystream is restored from a checkpoint). We should define a finalizer for ChaChaStream and CUDAChaChaStream that zeros out the key and the buffered keystream before the memory is reclaimed.
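A minimal sketch of what such a finalizer could look like. `ToyStream`, its fields, and `scrub!` are hypothetical stand-ins, since the real ChaChaStream layout is not shown in this thread. `Base.securezero!` is an unexported Base function.

```julia
# Hypothetical stand-in for ChaChaStream; field names are assumptions.
mutable struct ToyStream
    key::Vector{UInt32}      # 256-bit key as eight 32-bit words
    buffer::Vector{UInt8}    # unconsumed keystream bytes

    function ToyStream(key::Vector{UInt32})
        stream = new(key, rand(UInt8, 64))
        # Register scrub! to run when the object is collected.
        # finalizer returns the object it was attached to.
        return finalizer(scrub!, stream)
    end
end

# Overwrite the key and any buffered keystream with zeros.
function scrub!(stream::ToyStream)
    Base.securezero!(stream.key)
    Base.securezero!(stream.buffer)
    return stream
end

key = UInt32[1, 2, 3, 4, 5, 6, 7, 8]
stream = ToyStream(key)
finalize(stream)   # run the registered finalizer now instead of waiting for GC
```

Calling `finalize` here only makes the effect visible right away; without it, the wipe happens whenever the GC decides to collect the object.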

Seelengrab commented 2 years ago

Note that finalizers in Julia only run when the object in question is actually collected, as they're attached to the object and not the type (unlike C++ destructors, which run when the object goes out of scope, AFAIK). An explicit finalize! step for preemptive zeroing would be better imo (to not rely on GC to clear sensitive data), probably with an API like

ChaChaStream() do rng
    # use rng to generate data
end

to take care of that for the user by treating the RNG like a resource for do notation.
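One way the do-block form could be supported is a method that takes the function as its first argument and wipes the state in a `finally` clause. This is only a sketch: `ScopedStream` and its single `key` field are hypothetical, and the real constructor signature may differ.

```julia
# Hypothetical stand-in; not the actual ChaChaStream definition.
mutable struct ScopedStream
    key::Vector{UInt32}
end

# `ScopedStream(key) do rng ... end` lowers to `ScopedStream(f, key)`,
# so a method with the function in first position enables do notation.
function ScopedStream(f::Function, key::Vector{UInt32})
    rng = ScopedStream(key)
    try
        return f(rng)
    finally
        Base.securezero!(rng.key)   # wipe the key even if f throws
    end
end

key = UInt32[1, 2, 3, 4, 5, 6, 7, 8]
first_word = ScopedStream(key) do rng
    rng.key[1]   # use rng to generate data
end
```

The result of the block is returned to the caller, while the key is already zeroed by the time the call returns.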

kernelmethod commented 2 years ago

yeah, to be clear, I was thinking of creating a wrapper like Julia's Base.SecretBuffer, except that instead of acting like an IOBuffer it would implement the AbstractArray interface. I've already started working on it in the kernelmethod/secure_memory branch.

The way that SecretBuffer works is similar to what you were thinking -- it uses finalizers to wipe secret data from memory, but the preferred interface is to explicitly call Base.shred!. It'll emit a warning if you rely on the GC to wipe the data. That being said, it's better to implement a finalizer and have it available than to completely trust that the user will follow the interface correctly. That's how SecretArray (which is implemented in kernelmethod/secure_memory) is going to end up working.
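For reference, the explicit interface on Base.SecretBuffer looks roughly like this. The byte values are arbitrary, and `Base.isshredded` is used only to check the result.

```julia
buf = Base.SecretBuffer()
write(buf, 0x73, 0x65, 0x63)   # store three secret bytes
seek(buf, 0)                   # rewind to the start
first_byte = read(buf, UInt8)

Base.shred!(buf)               # explicit wipe instead of waiting for the GC
wiped = Base.isshredded(buf)   # true once the underlying data is all zeros
```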

I do like the idea of being able to treat the keystream as a resource with a finite lifetime by using the do keyword; we could probably add another constructor for AbstractChaChaStream that takes a function as its first argument to do this. That being said, we still need to have a way of constructing a keystream and destroying its state using something like

rng = ChaChaStream()

# do stuff

Base.shred!(rng)

since for certain programs the keystream is going to need to be able to live for a long time, which can make it inconvenient to rely on the do-based API.
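A sketch of how the long-lived case could work, assuming the stream type adds its own method to `Base.shred!` and keeps a finalizer only as a fallback. `LongLivedStream` and its `shredded` flag are hypothetical.

```julia
# Hypothetical long-lived keystream; not the actual ChaChaStream definition.
mutable struct LongLivedStream
    key::Vector{UInt32}
    shredded::Bool

    function LongLivedStream(key::Vector{UInt32})
        stream = new(key, false)
        # Fallback only: runs if the caller never calls shred! explicitly.
        finalizer(stream) do s
            s.shredded || Base.shred!(s)
        end
        return stream
    end
end

# Explicit, deterministic wipe through the same entry point SecretBuffer uses.
function Base.shred!(stream::LongLivedStream)
    Base.securezero!(stream.key)
    stream.shredded = true
    return stream
end

rng = LongLivedStream(UInt32[1, 2, 3, 4, 5, 6, 7, 8])
try
    first_word = rng.key[1]   # placeholder for long-running work
finally
    Base.shred!(rng)          # wipe as soon as the stream is no longer needed
end
```

The `try`/`finally` wrapper is optional; it just makes sure the explicit wipe still happens if the work in between throws.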

Seelengrab commented 2 years ago

Yes, SecretBuffer was the reason I mentioned this :) I'm a big fan of always passing state around explicitly, and RNG is just hidden implicit state (luckily on more recent versions task local, sadly newly spawned tasks get seeded from the parent RNG - so something "innocuous" like spawning a new task actually changes the stream of random numbers from rand()). As far as I'm aware, Julia does not have any mechanism to guarantee a finite lifetime of mutable objects, since GC manages them anyway (sadly).

Your SecretArray looks like a neat idea, but I think you'll want to restrict the types it can take to <: Number, like Base.securezero! does - finalizing/shredding arrays of e.g. mutable objects requires shredding the mutable objects as well, which may lead to a lot of undefined behavior if something else still has a reference to that element. So I think shredding mutables in general requires more accurate lifetime tracking than we can get from userspace right now. You could also try to check Base.allocatedinline and then use Base.unsafe_securezero! with the pointer to the array, if you want to allow some more complex but still inline allocated objects for that array.
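Both suggestions can be sketched as follows. `ToySecretVector`, `ToyPoint`, and `wipe_inline!` are hypothetical names and not the SecretArray from kernelmethod/secure_memory. The extra `isbitstype` check is a conservative assumption of this sketch, so that wiped elements are plain data containing no references.

```julia
# Restricting the element type to <: Number, as Base.securezero! does.
struct ToySecretVector{T<:Number} <: AbstractVector{T}
    data::Vector{T}
end

Base.size(v::ToySecretVector) = size(v.data)
Base.IndexStyle(::Type{<:ToySecretVector}) = IndexLinear()
Base.getindex(v::ToySecretVector, i::Int) = v.data[i]
Base.setindex!(v::ToySecretVector, x, i::Int) = (v.data[i] = x)
Base.shred!(v::ToySecretVector) = (Base.securezero!(v.data); v)

# Looser variant: accept any inline-allocated plain-data element type and
# zero the raw memory through a pointer.
function wipe_inline!(a::Vector{T}) where {T}
    if !(isbitstype(T) && Base.allocatedinline(T))
        throw(ArgumentError("elements of type $T are not inline plain data"))
    end
    # Keep `a` rooted while its raw pointer is in use.
    GC.@preserve a Base.unsafe_securezero!(pointer(a), length(a))
    return a
end

# Example of a more complex but still inline-allocated element type.
struct ToyPoint
    x::Float64
    y::Float64
end

secret = ToySecretVector([1.5, 2.5])
Base.shred!(secret)

points = wipe_inline!([ToyPoint(1.0, 2.0), ToyPoint(3.0, 4.0)])
```

A vector of mutable objects, such as `[Ref(1)]`, is rejected by `wipe_inline!` with an `ArgumentError` instead of being zeroed.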

kernelmethod / ChaChaCiphers.jl

Scrub keystream state from memory on garbage collection #5