nim-lang / RFCs

A repository for your Nim proposals.

Stay with "Efficient", Not "Secure" Defaults #505

Open · c-blake opened 1 year ago

c-blake commented 1 year ago

Abstract

This RFC is about one idea raised in a verbose, unfocused forum thread. For Nim it is probably only about std/[random, hashes]. I believe this is already the attitude of Nim core. So, while folks can vote/comment here, this RFC requires no actual action. (Of course, the vote is biased since current users probably like the status quo.)

Motivation

First, this entire question only applies when there is a trade-off. Having both is always great. :-)

The (not proposed) alternative, "secure defaults" is usually motivated by saying programmers A) do not read documentation and so, B) "often" move "working" code from non-adversarial domains to adversarial ones (e.g., add access to a public network, running against less trusted data, etc.). With secure defaults they incur less extra risk (from that one feature anyway...).

The proposed alternative, to my understanding the current status quo, is usually motivated by programmers A) not reading docs, and so B) not knowing some/all of the many optimization options available to them.

Description

Portably Secure = Slow

Since programmers rarely pass --passC:-march=XYZ (or the equivalent) to use instructions beyond the original AMD 2003-2005 x86_64 set (or original ARM ISAs, and so on), Nim certainly does not default to more modern iterations of ISAs. So, even with the best HW support in the world for AES or whatever crypto, "secure default = slow default" for deployments that do not assume HW support (whether by intention or accident). This may, theoretically, change in gcc/clang, etc., of course. The performance gap may also widen over time as CPUs keep improving.
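
For illustration, opting into a newer baseline is just a pass-through flag. The -march values below are ordinary gcc/clang options, not anything Nim-specific (x86-64-v3 needs gcc 11+/clang 12+), and app.nim is a placeholder:

    nim c -d:release --passC:-march=x86-64-v3 app.nim   # AVX2-era baseline
    nim c -d:release --passC:-march=native app.nim      # tune to the build machine only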

There is also the JavaScript backend. Being JIT-compiled gives JS an advantage in adapting to the CPU actually in use, but on the other hand a native Nim implementation may be far too complex to be recognized by any JIT engine. Some optimized JS lib may exist or be possible, but that is not a dependency we have (much like we do not have CPU arch branches either...). So, I could be wrong, but I suspect JS would be "even slower" by default to be "secure".

Some Rationale For "Fast"

The stated motivating problem is for one class of code/language user out of many. People writing code that is scientific, local/private, randomized algos, etc., may never care. Yes, people can write code not knowing who will call it, and at any time someone can put up a network interface. So very little can be done if "client code just assumes" rather than reading docs. Which/where/whether the lack of attention actually causes trouble is a very hair-splitting question. { Of course, defaults are usually hair-splitting domains.. :-) } To me it seems like more EMPHASIS in docs may be all that is required (and possibly unified APIs making switching between fast and slow-but-secure very easy, but I consider that an orthogonal topic for other RFCs).

The future is, of course, always uncertain, but the past is a partial guide. I think Nim has so far had more "marketing/adoption" woe from early, mistaken performance conclusions than "someone incautiously opened a security hole". In OSes, I first heard "secure defaults" 20 years ago as a marketing blurb Re: OpenBSD. While still alive, FreeBSD, Linux, and even Windows have thrived with a focus instead on efficiency or features like portability/HW support. While an OS and stdlib are different, this may be a lesson.

Checking what kind of hash|PRNG is used may also be one of the easiest things to audit in code. Code in actual adversarial environments probably needs much more. Code auditors can likely be more relied upon to actually read docs. So, simply promoting auditing people/tooling seems a more complete approach anyway.

There may be many more points. Feel free to chime in.

Backwards Compatibility

Probably not the biggest concern here, since random number sequences and hash sets may also not require much release-to-release stability (except for unit tests and such). FWIW, random and hashes have used fast, weak algorithms since the earliest Nims of 2008. But both implementations have also changed at least once before.

arnetheduck commented 1 year ago

Much of this confusion (in discussion, and in code) stems from poor naming and loose definitions, i.e. calling a secure and an insecure hash the same thing even though they're not: the properties required of a "hash-table hash function" are not the same as when collision resistance is required (such as for long-term storage). It's the same as calling both a hash table and a sorted tree a "Dictionary": both perform key/value lookups, but they have significantly different properties.

"Secure" thus becomes something of a fuzzy feel-good name to slap onto things - one would need to give it a concrete and precise meaning to start discussing it - for example, when rust talks about "safe" code, it first defines what "safe" means - without that definition, one cannot talk about "secure" either.

To give an example, it is customary for hash table implementations in security-minded general-purpose languages to include a salt to prevent trivial remote exploitation of the side effects of a non-cryptographic hash function. Is this an "efficient default"? No, it adds a tiny bit of overhead, but it nonetheless tends to be a good idea because it protects against a class of remote exploits (where knowledge of the hash function is used to stuff the table, driving lookups into O(N) territory), and caches are a major use case for hash tables.
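
A minimal sketch of that salting idea in Nim, assuming std/sysrand as the salt source; saltedHash is an illustrative name, not any stdlib API, and salting a weak mixer only raises the bar (a keyed hash like SipHash is the rigorous version):

    import std/[hashes, sysrand]

    # One secret per-process salt from the OS CSPRNG, so an attacker cannot
    # precompute keys that collide under the table's hash function.
    let salt = block:
      var buf: array[8, byte]
      doAssert urandom(buf)
      cast[uint64](buf)

    proc saltedHash(key: string): Hash =
      var h: Hash = 0
      h = h !& hash(salt)   # mix in the secret salt first
      h = h !& hash(key)    # then the key's ordinary hash
      result = !$h          # finalize (std/hashes)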

Did this appear in all early hash table implementations? No; the technique was unknown and the problem didn't exist (because there were no "remotes" in a local-only world). Here we see the other problem with calling something "secure": it changes over time, as new exploits become common and new defenses against them are researched. Sometimes a cheap defense is found (like the salt), and it would be nuts not to include it in the std lib just because it didn't apply 30 years ago. The landscape has changed, and so has the utility function for a "general-purpose" implementation in a "standard library". If one wants to go to extremes in either direction (extreme efficiency or extreme security), the general-purpose std lib is likely not the right place, because it has to accommodate a wide range of use cases.

The last point about discoveries is also significant: the tradeoffs that held true 30 years ago don't necessarily apply today, so what the defaults should be can also be reevaluated regularly. It wouldn't be too far off to say that new languages pop up in response to poor "secure defaults" in the older generation, because at some point the balance of what is a "reasonable tradeoff" in efficiency vs. security changed, and what made sense when designing something then doesn't necessarily make sense today.

Araq commented 1 year ago

In my opinion, hash tables are all about average speed, though. There are the problems of hash collisions, hash chain lengths, and the somewhat disruptive O(N) insertions that must happen on table resizes (which can be problematic for hard realtime systems). Adding a salt improves some aspect of "security" for hash tables, but hash tables are not about security. Security with hashing is a never-ending whack-a-mole game. For crypto that is just the nature of the problem, but for a general-purpose table/dictionary data type this problem is entirely self-imposed.

In other words, the better solution is "just use a BTree". The standard library could encourage that. For example, there could be a switch -d:safeTables and then the implementation of std.Table changes to BTree consistently. I'm against more switches in general, but in this case it seems like a reasonable solution. You decide on the app level if you want to trade performance (and you can measure how much) for security or vice versa, and we already have this split in the form of -d:danger vs -d:release.
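
A sketch of what such a switch could look like at the type level. The stdlib has no BTree module today, so std/critbits stands in as the tree here, and Registry is a hypothetical name:

    when defined(safeTables):
      import std/critbits
      type Registry = CritBitTree[string]   # tree-based: no hash function to attack
    else:
      import std/tables
      type Registry = Table[string, string] # hash-based: fast average case

    var r: Registry        # default initialization suffices for both
    r["alice"] = "admin"   # both types support []= / [] with string keys
    echo r["alice"]

(critbits only accepts string keys, whereas a real BTree would need only a "less than", which is part of why the BTree itself remains a wish.)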

mratsim commented 1 year ago

For me, this is first and foremost an education issue.

I don't see how optimizing for people not reading documentation for their security-sensitive code makes sense. We do that, and then they reuse a nonce when calling AES, thinking "yeah, secure RNG + secure encryption", and all hell breaks loose.

In fact, as mentioned in the thread, security-focused libraries shouldn't have a toolbox approach, because there is no transitivity in security: combining 2 secure tools does not imply the aggregate is secure; that is a rare property called universal composability. They should expose only sound protocols.

The "bleeding" edge

That said, there is one data structure that should be easy to harden: the hash table. In a network environment, topics and peers are often stored in a hash table, and a poor hash function can be exploited for denial-of-service attacks by manipulating IDs to stuff everything into the same bucket.

Solutions

  1. Rename std/random to std/pseudorandom
  2. Mention std/sysrand in the documentation of std/pseudorandom (https://nim-lang.org/docs/sysrand.html); see the sketch after this list.
  3. Unsure re: the hash functions of Tables; maybe the internal implementation can be re-engineered to import either the current MurmurHash or something based on SipHash, though SipHash by default might be best here.
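
For concreteness, a sketch of the split that items 1-2 are about; both modules exist in the current stdlib:

    import std/[random, sysrand]

    # std/random: fast, seedable PRNG; fine for simulations, shuffles, tests.
    var rng = initRand(42)    # deterministic given the seed
    echo rng.rand(1.0)

    # std/sysrand: OS CSPRNG; the right choice for keys, tokens, salts.
    let token = urandom(32)   # 32 bytes of OS entropy (raises OSError on failure)
    echo token.len
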
Araq commented 1 year ago

Well, you can argue that adding a salt is little effort for the gain, and that my solution is much more work and has small flaws, like the fact that a BTree requires a "less than" property and hash tables don't. ;-) I want my BTrees though.

But I'm also not happy with the changes to tables.nim that have been done in the past and might at some point use my own fork which undoes the hashing for integer values.

c-blake commented 1 year ago

Great commentary here, all around. Personally, I think the random stuff is mostly a documentation issue and renaming mostly needlessly disrupts. (EDIT: I just edited the title to scare quote both terms since interpretational risk is indisputable).

For remote DoS on tables, I think you can bound DoS success to the ratio (secure slowness + transform time)/(default fast time), which is really not so damaging. The key insight is just that the "transform" falls into the same occasional O(N) (or O(N lg N), depending) whole-set re-org category as the resizing that @Araq mentioned as poison for hard real-time. (Hard RT may want elsewise.)

What I did in adix/lptabz in this vein was instrument things to detect DoS/bad hashes based on excessive search depth on under-full tables. The lptabz resize/rehash protocol is also based upon search depth, not "load factor", which is natural in Robin Hood re-org. { adix/lptabz could use some "interface work".. }
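
This is not adix's actual API, just the rough shape of the trigger condition; maxProbeDepth and suspicious are illustrative names:

    const maxProbeDepth = 16   # illustrative bound, not adix's constant

    proc suspicious(probes, used, capacity: int): bool =
      # Deep probing while the table is under half full is a red flag:
      # an honest hash makes this astronomically unlikely.
      probes > maxProbeDepth and used * 2 < capacity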

In combination, the way you get a bound is by adapting to the attack. Re-salting with a true physical noise-based RNG is one possible mitigation (though, to be debug-friendly/portable, I only do that if -d:unstableHash is given). It would not be much work to generalize that to a wholly new per-table hash function like SipHash. The final mitigation could even be B-Trees if < is detected at compile-time. Adding this to stdlib might exceed this RFC's scope, though, by being a "smart default" -- fast at the beginning, but dynamically switching on demand. Oh, in fact that is how I framed it 3 years ago.. LOL.

"Pay for what you need when you need it (rather than always)" is usually popular. Nothing in principle blocks a client code programmed mitigation sequence to control costs. But I have seen no other impl like this in The Wild (though I haven't searched that hard). It could probably use a Red Team analysis...

arnetheduck commented 1 year ago

I don't see how optimizing for people not reading documentation

Optimizing for people not reading the documentation makes a lot of sense, and I'm somewhat surprised that it's even a discussion point. Of course you should make your code intuitively usable, such that you don't need to check the docs for obvious gotchas and ways that might screw you over. We're talking about the language and the standard library here, not the inner loop of a highly optimized and specialized component: it shouldn't contain nasty surprises, because people using it are not security experts and do not have the brain budget to deal with poor library code; they're busy solving their own problems. They might not even be able to comprehend the security notes pointed out to them in docs, no matter how well intentioned.

If nothing else convinces, it makes mathematical sense to optimize for non-readers: if the utility function of a library is how many users can use it successfully, it's better to include both readers and non-readers of docs in your "target user".

use my own fork

Indeed! Every time you have a special case, the right answer is to use a specialist library, not to change the standard library to work with your special case at the expense of general usage. Eventually, if your "special case" turns out to be of general interest, it will find its way into common use as well.

mratsim commented 1 year ago

Optimizing for people not reading the documentation makes a lot of sense and I'm somewhat surprised that it's even a discussion point. Of course you should make your code intuitively usable such that you don't need to check the docs for obvious gotchas and ways that might screw you over

The full quote is

I don't see how optimizing for people not reading documentation for their security-sensitive code makes sense.

When writing or using security-sensitive code, you should read documentation, because there is something at stake beyond just a program crashing or receiving bad values.

we're talking about the language and the standard library here, not the inner loop of a highly optimized and specialized component

We are actually talking about hashes, RNGs and hash tables. The cryptographic ones are specialized components.

For a non-security-sensitive domain, I agree, we should optimize for the principle of least surprise, though depending on our audience, say familiar with C or familiar with Python, that might go two different ways.

Varriount commented 1 year ago

With regards to general standard library code, I think the default should be oriented towards security (e.g., external input shouldn't be able to crash a server implementation). With regards to things like randomness and hashes, I feel that the best way forward is to make it extremely clear when a function is/isn't secure, and even possibly point out insecure uses (e.g., direct string interpolation in an SQL query function).
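
For example, the contrast such a warning would draw, sketched with the pre-2.0 std/db_sqlite module (in Nim 2.x the same API lives in the external db_connector package):

    import std/db_sqlite

    let db = open(":memory:", "", "", "")
    db.exec(sql"CREATE TABLE users (name TEXT)")

    let name = "Robert'); DROP TABLE users;--"
    # Insecure: direct string interpolation, open to injection:
    #   db.exec(sql("INSERT INTO users VALUES ('" & name & "')"))
    # Secure: let the driver bind the parameter:
    db.exec(sql"INSERT INTO users VALUES (?)", name)
    db.close()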

mratsim commented 1 year ago

even possibly point out insecure uses (e.g., direct string interpolation in an SQL query function).

but without code examples, or, if you give some, in bold blinking red so they don't get copy-pasted

arnetheduck commented 1 year ago

When writing or using security-sensitive code, you should read documentation

This assumes you're aware that you're writing security-sensitive code! Many programmers are not, and when the outcomes of their efforts are poor and systemic, it's a problem with the library/language, not their efforts. The fact that they're unlikely to read, or understand the significance of, the documentation is something to factor into the design of every component.

In fact, security-sensitive code in particular should assume that its user has no clue and does not read docs. A lot of crypto research, for example, goes not into the maths but rather into how to create an API that cannot be used poorly (AEAD..).

Varriount commented 1 year ago

When writing or using security-sensitive code, you should read documentation

This also assumes that the documentation is in a language you can read (or read well enough to grasp security implications).