nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org

Secure memory: Yay or nay #30956

Open tniessen opened 4 years ago

tniessen commented 4 years ago

OpenSSL supports a concept referred to as secure memory or the secure heap. Essentially, every time OpenSSL allocates memory, it indicates whether that memory should be allocated from the "normal" heap using malloc or from the "secure" heap. Allocations on the secure heap usually have the following security properties: the pages are locked into RAM so they cannot be swapped to disk, and they are excluded from core dumps.

Node.js does not use this feature, meaning that OpenSSL performs normal allocations. (OpenSSL still overwrites the memory on deallocation.) We can enable OpenSSL's implementation, but it is quite restrictive and will be difficult to configure for users of Node.js. The alternative is to provide a more suitable implementation within Node.js, which provides similar security properties while being easier to use and less restrictive. However, that requires upstreaming changes to OpenSSL so that Node.js can supply its own secure memory allocator.

I started working on this approach about a year ago, and never finished it. I got varying feedback at the Montreal summit, so let's discuss this in public before I either stop working on it or put in a lot more work. As someone pointed out, many cloud applications likely don't care about this level of security.

Even if we upstream the necessary changes in OpenSSL, it is unlikely to ever work with a shared OpenSSL library, so dynamic linking would disable the feature. Also, I don't want to break Electron (as I have done numerous times, sorry @codebytere!), so we would need to make sure to either propose the change to both OpenSSL and BoringSSL or to make it easy to opt out of the feature at compile time.

I also have a patch that solves https://github.com/nodejs/node/issues/18896, but it relies on this particular feature and is super dangerous to use.

mscdex commented 4 years ago

What are the performance implications?

tniessen commented 4 years ago

This would only affect data that Node.js/OpenSSL deems sensitive (e.g., TLS private keys). For that data, allocation and deallocation are potentially slower, but accessing the allocated memory is at least as fast as accessing any other memory region.

Admittedly, that is only half the truth. If it did not negatively affect performance, why would we not use this feature for all allocations? The problem is that secure memory cannot be swapped to disk, and swapping is a crucial performance factor. While that makes accesses to these pages even faster than normal memory accesses, it can negatively affect average system performance if a large number of pages are locked into memory, which is why the OS kernel usually restricts the amount of locked memory per process.

devsnek commented 4 years ago

it seems reasonable to me to explore using it in places like crypto and tls, and maybe also exposing something like Buffer.allocSecure. The downside there would be people using allocSecure too liberally and the buffers not being GC'd fast enough, so we could probably hold off on that part.

bnoordhuis commented 4 years ago

  1. Memory locked with mlock() is subject to the RLIMIT_MEMLOCK resource limit (ulimit -l), which at least on my system is restricted to 32 MB per process. Too much locked memory leads to out-of-memory errors, making it unsuitable for general purpose memory management.

  2. mlock() works on page boundaries. Allocating 1 byte requires you to mmap 4k (or 64k on ppc) of memory. OpenSSL implements its own memory manager on top of mmap() to alleviate that but for storing sensitive data it's arguably better to use mmap/munmap directly (no chance of use-after-free or metadata corruption bugs.)

  3. We likely want to set the MADV_DONTFORK flag when available, something OpenSSL doesn't do - but maybe it can be taught to, or maybe we can call madvise() on the return value from CRYPTO_secure_malloc().

tniessen commented 4 years ago

@bnoordhuis Thank you for sharing your thoughts!

  1. That is a factor that we need to consider, and I have been thinking about ways to get around it. This is such a specific feature that users who want to enable secure memory management can reasonably be expected to configure that limit themselves. Another option might be to fall back to "normal" allocations once the limit is exceeded.

  2. I am currently using a simple buddy allocator, similar to what OpenSSL has, except that my implementation supports dynamic heap sizes. mmap'ing individual allocations might provide some security properties that a buddy allocator doesn't, but it is likely slower, likely uses the limited amount of memory less efficiently, and makes the concept of guard pages almost impossible. (On the other hand, we don't have to use guard pages.)

  3. My current plan is to allow replacing all the secure memory management functions in OpenSSL, similar to CRYPTO_set_mem_functions. That would make that possible :)

bnoordhuis commented 4 years ago

Another option might be to fall back to "normal" allocations once the limit is exceeded.

That seems like a bad idea. Users can always do something like this when node throws an exception:

let buf;
try {
  buf = Buffer.secureAlloc(32);
} catch (e) {
  buf = Buffer.alloc(32);
}

Or this when node returns null:

let buf = Buffer.secureAlloc(32) ?? Buffer.alloc(32);

That way it's explicit that you're okay with insecure storage.

("secure" and "insecure" are arguably misnomers. We need to come up with something better. :-))

tniessen commented 4 years ago

@bnoordhuis I agree, it seems to be a bad idea when done from within JavaScript. I was referring more to data that OpenSSL stores in secure memory, allocations that we cannot influence, and that might not be obvious to the user.

tniessen commented 4 years ago

("secure" and "insecure" are arguably misnomers. We need to come up with something better. :-))

My current patch calls it crypto.alloc :P

sam-github commented 4 years ago

So, it's not clear to me how important a security issue this is. In the realm of defence in depth, how deep does an attacker have to be before secure memory is an obstacle? I've no objection to it, though; it doesn't seem like it can hurt (well, if a JS binding is exposed, users could shoot themselves in the foot with it).

@nodejs/security-wg or anyone - is secure memory a common feature of comparable runtimes? chrome, electron, perl, python, ruby, go, rust, etc? That would give a sense of how many others thought it important in the past, but there isn't any reason js can't be more secure, even if we're the only one doing this.

deian commented 4 years ago

I think this is an important feature, but won't the GC and JIT mess with the intentions of trying to do this for JS?

bnoordhuis commented 4 years ago

@deian Anything in particular you're thinking of?

addaleax commented 4 years ago

@sam-github I certainly like the prospect of automatically excluding things like crypto keys from core dumps… having those inadvertently uploaded somewhere public seems like a plausible scenario to me.

tniessen commented 4 years ago

I think this is an important feature, but won't the GC and JIT mess with the intentions of trying to do this for JS?

No, and that is a problem... JavaScript is just a terrible language for secure memory management; we do not have a reasonable way to implement secure memory for JavaScript objects. This also brings us back to the point @sam-github mentioned: how much security will this actually provide?

For example, let's consider private keys. I designed the new KeyObject API in a way that allows secure key storage, so if you use a KeyObject, the memory will be secure:

const key = crypto.createPrivateKey(fs.readFileSync('./private.pem', 'utf8'))

However, this creates temporary buffers and at least one JavaScript string value before even calling createPrivateKey, and even if they get gc'd, V8 won't necessarily overwrite that memory. It is possible to prevent those problems, but it requires users to explicitly embed memory security into their code:

// Read a buffer, not a string!
const keyBuffer = fs.readFileSync('./private.pem');
// Transfer it to secure memory.
const key = crypto.createPrivateKey(keyBuffer);
// Overwrite the "insecure" buffer.
keyBuffer.fill(0);

And if users store key and cert in separate variables, they might never be GC'd, depending on the rest of their application!

const key = fs.readFileSync('./private.pem');
const cert = fs.readFileSync('./cert.pem');
https.createServer({ key, cert }, app);

sam-github commented 4 years ago

I'm not sure about the work-vs-value tradeoff for this... that said, if there were a crypto.SecureBuffer() or buffer.SecureBuffer(), it would be possible to use fs.read() to read from a file directly into a secure-memory-backed buffer.

Also, if an encrypted private key is read into a string, it doesn't matter if the string is exposed; it's ciphertext, and it would get decrypted by OpenSSL into the secure memory. Mind you, a password is then needed, and it would likely live in a normal JS string. But that'd be temporary and would likely get GC'd and cleaned. "Likely" isn't the firmest of grounds for security, but it's better than "definitely available in the core dump", maybe?

I can't see secure memory making things worse, but I can see it taking some time. I think it's really about how keen you are, @tniessen! Also, I'm not sure it really needs an implementation on all systems. For example, Windows doesn't have core dumps, does it? So physical access to the swap space could still get sensitive memory data after the fact, but physical access gets most things.

deian commented 4 years ago

I think @sam-github nailed my concern: you really want to make sure that secret data never enters JS land. We also want to make sure that some library can't pollute globals to void the guarantees of SecureBuffers.

deian commented 4 years ago

To be clear: I think this is great and happy to look at preliminary designs/implementation.

jonathanmarvens commented 4 years ago

Any updates on this? I have a use case where secure memory, or at least memory with predictable behavior, is actually pretty important and the behavior of regular JavaScript strings isn’t quite predictable. I’d love to know if there’s been any progress on this!

Thank you!

tniessen commented 4 years ago

behavior of regular JavaScript strings isn’t quite predictable

This would in no way affect JavaScript strings, since those are managed by V8. It can only affect ArrayBuffer instances, including Buffers etc. JavaScript is not a good choice if you care about low-level memory safety, because that level of control simply doesn't exist in the language.

tjconcept commented 4 years ago

I designed the new KeyObject API in a way that allows secure key storage, so if you use a KeyObject, the memory will be secure

Amazing!

A use case I have for secure memory is a "pepper" (a secret "salt") added to the input of a hash function. It's a random value generated at boot, and I don't want it to ever leak. I guess it would require the .update function to take the same care not to copy it into some unsafe memory in the process.

bmeck commented 3 years ago

Due to Spectre etc., whatever we do should probably be IPC across an address space.

deian commented 3 years ago

Due to Spectre etc., whatever we do should probably be IPC across an address space.

Why? Spectre works cross-process. But regardless, is the attacker assumption here arbitrary code execution? If so, then I'm not sure the additional process will do much, especially if both are running with the same uid.

bmeck commented 3 years ago

The attacker having arbitrary code execution is certainly a way to make any sort of Spectre mitigation moot. However, Spectre allows arbitrary reads within an address space, not arbitrary execution, nor arbitrary cross-process reads. If the secure storage is outside the direct address space and only reachable via some level of IPC, you can avoid this issue; this idea is the basis of newer technologies attempting to mitigate side channels, such as CORB. There are existing cautionary examples for in-process attempts like this request: SecureString, which attempts in-process confidentiality, is now discouraged. Reading confidential information into the process's address space always carries some level of threat, and Spectre broadened that threat: data can now leak prior to deallocation even when no direct language references to it are exposed.

deian commented 3 years ago

CORB makes sense in the browser context where the attacker is co-located. I guess I'm not understanding the attacker model you have in mind -- where is the attacker and what capabilities do they have? (And wouldn't the JIT leaking secrets be a bigger side channel to worry about?)

bmeck commented 3 years ago

@deian I do believe that 3rd party packages can be co-located attackers. Events such as updating 3rd party modules can pull in malicious code. Leaking the information would be possible if the 3rd party mounts some side-channel attack such as Spectre. Limiting/preventing arbitrary unknown code being run is possible, but preventing a side channel seems prohibitive and likely impossible without a full-scale code audit. I agree there are potentially other side channels, but an exploit of the JIT wouldn't discount this concern, as it would exist even when running without the JIT.

deian commented 3 years ago

I guess I'm skeptical of introducing complexity when we're not sure what it really buys us (or whether Spectre is a realistic attack vector). (Some Spectre attacks are also easier cross-process.) I think introducing a process boundary makes sense if we ensure that the code running in the trusted, secret-handling process is actually constant-time. (Of course, doing that is also hard.)

Limiting/preventing arbitrary unknown code being run is possible

Sandboxing 3rd party JS modules in a meaningful way (i.e., dealing with confused deputy attacks) is hard. (That alone would be a huge win, IMO.) But I guess if you have a robust sandbox, or the code you're restricting is somewhat simple (and thus easier to both sandbox and protect against cross-boundary attacks), then sure, worrying about Spectre seems more reasonable.

I'm not going to push against another layer of defense :)

tniessen commented 3 years ago

Basic implementation in https://github.com/nodejs/node/pull/36779.

RaisinTen commented 2 years ago

Is there anything else we need to do before closing this issue, now that https://github.com/nodejs/node/pull/36779 has landed?

tjconcept commented 2 years ago

Is there anything else we need to do before closing this issue, now that #36779 has landed?

I don't think it's the same feature. From OP:

We can enable OpenSSL's implementation, but it is quite restrictive and will be difficult to configure for users of Node.js. The alternative is to provide a more suitable implementation within Node.js, which provides similar security properties while being easier to use and less restrictive.

It seems like #36779 is more like "enable OpenSSL's implementation" than "provide a more suitable implementation within Node.js".