nodejs / node

Buffer(number) is unsafe #4660

Closed feross closed 8 years ago

feross commented 8 years ago

tl;dr

This issue proposes:

  1. Change new Buffer(number) to return safe, zeroed-out memory
  2. Create a new API for creating uninitialized Buffers, Buffer.alloc(number)

    Update: Jan 15, 2016

Upon further consideration, I think that returning zeroed out memory is a separate issue. The core issue is: unsafe buffer allocation should be in a different API.

I now support adding two APIs:

This solves the core problem that affected ws and bittorrent-dht which is Buffer(variable) getting tricked into taking a number argument.

Why is Buffer unsafe?

Today, the node.js Buffer constructor is overloaded to handle many different argument types like String, Array, Object, TypedArrayView (Uint8Array, etc.), ArrayBuffer, and also Number.

The API is optimized for convenience: you can throw any type at it, and it will try to do what you want.

Because the Buffer constructor is so powerful, you often see code like this:

// Convert UTF-8 strings to hex
function toHex (str) {
  return new Buffer(str).toString('hex')
}

_But what happens if toHex is called with a Number argument?_

Remote Memory Disclosure

If an attacker can make your program call the Buffer constructor with a Number argument, then they can make it allocate uninitialized memory from the node.js process. This could potentially disclose TLS private keys, user data, or database passwords.

When the Buffer constructor is passed a Number argument, it returns an UNINITIALIZED block of memory of the specified size. When you create a Buffer like this, you MUST overwrite the contents before returning it to the user.
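A defensive version of toHex can be sketched as follows. It assumes Buffer.from is available, which is true of current Node.js releases but is not something this issue as originally filed relies on; the explicit type check is the important part.

```javascript
'use strict'

// Convert a UTF-8 string to hex, refusing anything that is not a string.
function toHex (str) {
  if (typeof str !== 'string') {
    throw new TypeError('toHex expects a string, got ' + typeof str)
  }
  // Buffer.from(string, encoding) copies the string's bytes and never
  // hands back uninitialized memory.
  return Buffer.from(str, 'utf8').toString('hex')
}

console.log(toHex('abc')) // prints "616263"
```

With this guard, a call such as toHex(1000) fails loudly with a TypeError instead of returning 1,000 bytes of process memory.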

Would this ever be a problem in real code?

Yes. It's surprisingly common to forget to check the type of your variables in a dynamically-typed language like JavaScript.

Usually the consequence of assuming the wrong type is that your program crashes with an uncaught exception. But the failure mode for forgetting to check the type of arguments to the Buffer constructor is more catastrophic.

Here's an example of a vulnerable service that takes a JSON payload and converts it to hex:

// Take a JSON payload {str: "some string"} and convert it to hex
var http = require('http')

var server = http.createServer(function (req, res) {
  var data = ''
  req.setEncoding('utf8')
  req.on('data', function (chunk) {
    data += chunk
  })
  req.on('end', function () {
    var body = JSON.parse(data)
    res.end(new Buffer(body.str).toString('hex'))
  })
})

server.listen(8080)

In this example, an http client just has to send:

{
  "str": 1000
}

and it will get back 1,000 bytes of uninitialized memory from the server.
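One possible fix for a service like this is to validate the payload before it reaches any Buffer API. The sketch below is only an illustration: hexFromPayload is a hypothetical helper name, and it assumes Buffer.from is available (as in current Node.js releases).

```javascript
'use strict'

// Parse a JSON payload of the form {str: "some string"} and return its hex
// encoding, or null when the payload is malformed or `str` is not a string.
// hexFromPayload is a hypothetical helper, not part of the original service.
function hexFromPayload (json) {
  var body
  try {
    body = JSON.parse(json)
  } catch (err) {
    return null
  }
  if (body === null || typeof body !== 'object' || typeof body.str !== 'string') {
    return null
  }
  return Buffer.from(body.str, 'utf8').toString('hex')
}

console.log(hexFromPayload('{"str": "hi"}')) // prints "6869"
console.log(hexFromPayload('{"str": 1000}')) // prints "null"
```

In the 'end' handler, the server would call hexFromPayload(data) and answer with an error status (for example 400) whenever it returns null.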

This is a very serious bug. It's similar in severity to the Heartbleed bug that allowed disclosure of OpenSSL process memory by remote attackers.

Which real-world packages were vulnerable?

bittorrent-dht

@mafintosh and I found this issue in one of our own packages, bittorrent-dht. The bug would allow anyone on the internet to send a series of messages to a user of bittorrent-dht and get them to reveal 20 bytes at a time of uninitialized memory from the node.js process.

Here's the commit that fixed it. We released a new fixed version, created a Node Security Project disclosure, and deprecated all vulnerable versions on npm so users will get a warning to upgrade to a newer version.

ws

That got us wondering if there were other vulnerable packages. Sure enough, within a short period of time, we found the same issue in ws, the most popular WebSocket implementation in node.js.

If certain APIs were called with Number parameters instead of String or Buffer as expected, then uninitialized server memory would be disclosed to the remote peer.

These were the vulnerable methods:

socket.send(number)
socket.ping(number)
socket.pong(number)

Here's a vulnerable socket server with some echo functionality:

server.on('connection', function (socket) {
  socket.on('message', function (message) {
    message = JSON.parse(message)
    if (message.type === 'echo') {
      socket.send(message.data) // send back the user's message
    }
  })
})

Calling socket.send(number) on the server will disclose server memory.

Here's the release where the issue was fixed, with a more detailed explanation. Props to @3rd-Eden for the quick fix. Here's the Node Security Project disclosure.
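Application code running against a vulnerable version could also defend itself by normalizing whatever it echoes before calling send. The following is a minimal sketch; toSendable is a hypothetical helper and not part of the ws API.

```javascript
'use strict'

// Return a value that is safe to hand to socket.send(): strings and Buffers
// pass through unchanged, everything else is serialized to a JSON string.
// toSendable is a hypothetical helper, not part of the ws API.
function toSendable (data) {
  if (typeof data === 'string' || Buffer.isBuffer(data)) {
    return data
  }
  var json = JSON.stringify(data)
  // JSON.stringify returns undefined for values such as undefined or functions.
  return json === undefined ? '' : json
}

console.log(toSendable(1000)) // prints "1000"
```

In the echo server above, the send call would become socket.send(toSendable(message.data)), so a numeric payload is echoed back as text rather than interpreted as a size.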

What's the solution?

It's important that node.js offers a fast way to get memory; otherwise, performance-critical applications would needlessly get a lot slower.

But we need a better way to signal our intent as programmers. When we want uninitialized memory, we should request it explicitly.

Sensitive functionality should not be packed into a developer-friendly API that loosely accepts many different types. This type of API encourages the lazy practice of passing variables in without checking the type very carefully.

Buffer.alloc(number)

The functionality of creating buffers with uninitialized memory should be part of another API. We propose Buffer.alloc(number). This way, it's not part of an API that frequently gets user input of all sorts of different types passed into it.

var buf = Buffer.alloc(16) // careful, uninitialized memory!

// Immediately overwrite the uninitialized buffer with data from another buffer
for (var i = 0; i < buf.length; i++) {
  buf[i] = otherBuf[i]
}
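For readers on current Node.js releases, note that the names that eventually shipped differ from this proposal: there, Buffer.alloc(size) returns zero-filled memory and Buffer.allocUnsafe(size) returns uninitialized memory. The sketch below uses those shipped names, not the proposal's, to show the same "allocate, then immediately overwrite" pattern.

```javascript
'use strict'

// Zero-filled allocation: every byte is guaranteed to be 0.
var safe = Buffer.alloc(16)

// Uninitialized allocation: contents are arbitrary until overwritten.
var fast = Buffer.allocUnsafe(16)
var otherBuf = Buffer.from('0123456789abcdef', 'utf8') // exactly 16 bytes
otherBuf.copy(fast) // overwrite all 16 bytes immediately

console.log(safe.toString('hex')) // prints 32 "0" characters
console.log(fast.toString('utf8')) // prints "0123456789abcdef"
```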

How do we fix node.js core?

We sent a PR (merged as semver-major) which defends against one case:

var str = 16
new Buffer(str, 'utf8')

In this situation, it's implied that the programmer intended the first argument to be a string, since they passed an encoding as a second argument. Today, node.js will allocate uninitialized memory in the case of new Buffer(number, encoding), which is probably not what the programmer intended.

But this is only a partial solution, since if the programmer does new Buffer(variable) (without an encoding parameter) there's no way to know what they intended. If variable is sometimes a number, then uninitialized memory will sometimes be returned.

What's the real long-term fix?

We could deprecate and remove new Buffer(number) and use Buffer.alloc(number) when we need uninitialized memory. But that would break 1000s of packages. So that's a no-go.

Instead, we believe the best solution is to:

  1. Change new Buffer(number) to return safe, zeroed-out memory
  2. Create a new API for creating uninitialized Buffers. We propose: Buffer.alloc(number)

This way, existing code continues working and the impact on the npm ecosystem will be minimal. Over time, npm maintainers can migrate performance-critical code to use Buffer.alloc(number) instead of new Buffer(number).

Conclusion

We think there's a serious design issue with the Buffer API as it exists today. It promotes insecure software by putting high-risk functionality into a convenient API with friendly "developer ergonomics".

This wasn't merely a theoretical exercise because we found the issue in some of the most popular npm packages.

Eventually, we hope that node.js core can switch to this new, safer behavior. We believe the impact on the ecosystem would be minimal since it's not a breaking change. Well-maintained, popular packages would be updated to use Buffer.alloc quickly, while older, insecure packages would magically become safe from this attack vector.

feross commented 8 years ago

For those interested in getting the new behavior we propose without a change to node core, we published a user-land package: safe-buffer.

JacksonTian commented 8 years ago

Now, new Buffer(number, encoding) will throw an exception; the details are here: https://github.com/nodejs/node/commit/3b27dd5ce15942a054904b26e3dca295806038d8 . I introduced a new API, Buffer.encode(), for the encoding case.

rvagg commented 8 years ago

You could make similar arguments about child_process being unsafe, about loading native addons being unsafe, even about fs being unsafe. The more bubble-wrap we put in core, the more users are likely to make the false assumption that programming in Node.js is like programming in the browser, which it absolutely is not. That's not a -1 to this proposal (in fact, this exact discussion is an ongoing one that keeps on coming up, and we have an active thread in the security repo to thrash this out too), just a word of caution. We keep on having discussions like this where we want to move closer to a sandbox, but that will never be the case for Node, and it would be even more unsafe to lull users into the perception that they are as protected as in the browser (there's a very good chance that's exactly what happened to TrendMicro).

It should be noted that the primary reason for not zero-filling by default is performance. So new Buffer() should be considered a form of malloc(). In the past, it has been demonstrated that allocating filled memory has significant performance penalties compared to a plain malloc(). I keep on suggesting that those who are advocating switching the default to a clean allocation should do some benchmarking to demonstrate what kind of impact it might have. Otherwise, this argument is dead-in-the-water because the "it's faster" argument currently holds this one in place.

trevnorris commented 8 years ago

This discussion has been had several times in the last few weeks.

If a "high-risk functionality" application doesn't have proper type checking in place then it already has a security issue. This problem can't be dished into Buffer as if it's at fault.

Not zero filling the Buffer is completely a performance issue, but the impact on performance is more than trivial. Even at allocations as low as 1KB. And while this type of impact probably wouldn't harm your standard web app, it can noticeably affect performance of a node process.
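A rough way to measure the cost being discussed is a micro-benchmark like the sketch below. It is only an illustration: it uses the Buffer.alloc and Buffer.allocUnsafe functions from current Node.js releases (which did not exist when this thread was written), the timeAllocations helper is hypothetical, and the numbers it prints will vary with machine, Node.js version, and buffer size.

```javascript
'use strict'

// Time `iterations` allocations of `size` bytes and return elapsed milliseconds.
function timeAllocations (allocate, size, iterations) {
  var start = process.hrtime()
  var sink = 0
  for (var i = 0; i < iterations; i++) {
    sink += allocate(size).length // use the result so the work is not skipped
  }
  var diff = process.hrtime(start) // [seconds, nanoseconds] since `start`
  if (sink !== size * iterations) throw new Error('unexpected buffer length')
  return diff[0] * 1e3 + diff[1] / 1e6
}

var size = 1000
var iterations = 1e5

var rawMs = timeAllocations(function (n) { return Buffer.allocUnsafe(n) }, size, iterations)
var zeroedMs = timeAllocations(function (n) { return Buffer.alloc(n) }, size, iterations)

console.log('uninitialized: ' + rawMs.toFixed(1) + ' ms')
console.log('zero-filled:   ' + zeroedMs.toFixed(1) + ' ms')
```

Running it with several values of size gives a first impression of where zero-filling starts to matter; a serious comparison would also need warm-up runs and repeated measurements.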

The concern is when a number is passed when something else is expected. Again, this is a bug in the developer's code, and TBH it doesn't make much sense to impact the performance of the many modules out there today and force them to update their code to use new syntax because of that.

And let's be honest, in the years that Buffer has worked exactly this way, how many times has this been reported as an issue in the wild? The only reason you're here now is because it affected modules you were directly involved with.

feross commented 8 years ago

@rvagg @trevnorris You both raise good points. Node is not the browser, and performance is critical.

The difference with modules like child_process and fs is that they're very obviously doing powerful and sensitive operations. Even the most trivial use of them will confront the developer with the fact that they're interacting with their OS on a low level and need to be careful.

Buffer is different. 95% of the time, it's safe and idiot-proof. 5% of the time you need to be very, very careful.

You can pass in:

The API seems like it's trying to lay a trap for the user.

Look at the contributors on each of those repos. Some of the most talented node.js developers looked at this code extensively and didn't notice this issue for years.

feross commented 8 years ago

And while this type of impact probably wouldn't harm your standard web app, it can noticeably affect performance of a node process.

To be clear, I'm not suggesting that node core should use zeroed-out buffers. It would continue using uninitialized buffers, but that functionality would move to a different API, away from the one that looks and behaves like (and now actually is) a Uint8Array.

mscdex commented 8 years ago

-1 to changing new Buffer(number) to zero-out by default. Even if there was a separate API to preserve the behavior, that means a lot of code change for a lot of packages and application code. I'd personally be more open to a separate method that created zeroed-out Buffers, that way it's an explicit opt-in and you are (or should be) at that point aware of the tradeoff you're making.

ChALkeR commented 8 years ago

separate method that created zeroed-out Buffers

That would not fix anything, there is already a new Buffer(number).fill(0).

silverwind commented 8 years ago

I see around 20-25% perf regression using .fill() on small buffers (on ARM here):

node -e "for(var i = 1e6; i >= 0; i--) Buffer(1e3)"  6.16s user 0.19s system 100% cpu 6.287 total
node -e "for(var i = 1e6; i >= 0; i--) Buffer(1e3)"  6.10s user 0.19s system 101% cpu 6.222 total
node -e "for(var i = 1e6; i >= 0; i--) Buffer(1e3)"  5.97s user 0.17s system 101% cpu 6.074 total
node -e "for(var i = 1e6; i >= 0; i--) Buffer(1e3).fill()"  8.41s user 0.16s system 100% cpu 8.504 total
node -e "for(var i = 1e6; i >= 0; i--) Buffer(1e3).fill()"  7.89s user 0.14s system 100% cpu 7.968 total
node -e "for(var i = 1e6; i >= 0; i--) Buffer(1e3).fill()"  7.94s user 0.24s system 100% cpu 8.111 total

feross commented 8 years ago

I see around 20-25% perf regression using .fill() on small buffers

Nice.

Remember too, this is not a 20-25% perf regression across the board. It's only going to apply when all of these conditions are true:

Node.js core would not be slower. Uses of new Buffer(string), new Buffer(array), etc. would not be slower.

silverwind commented 8 years ago

I think I'm in support of this, but I'm not aware of all the use cases for Buffer(number), so take that with a grain of salt.

Instead of introducing new API, how about an opt-in?

Buffer(100); // safe
Buffer.unsafe = true;
Buffer(100); // unsafe

okdistribute commented 8 years ago

Thanks feross for explaining this well, I learned something tonight! I am relatively new to Node - I gained my open source chops in Python land. In Python, slow performance is sort of a given. If you want to make a Python program more performant, you can optimize it in certain ways, which are well known. Performance then is an advanced problem. The old adage "make it work, then make it fast" works well for these kinds of dynamic, interpreted languages. After all, we are not writing in C here, and the beauty of languages like Node and Python is the ease with which external, more performant packages can be accessed when necessary.

20-25% is actually quite a lot slower, true -- but how many calls to Buffer are using Buffer(Number) and require high performance? It seems as though this being fixed in core will bring great gratitude from newcomers and old hats alike. As a relatively new node programmer, I am certainly grateful - my memory won't be stolen!

finnp commented 8 years ago

I generally agree that the API is problematic.

I just want to point out that fixing Buffer to be zero-filled potentially introduces security risks to code that uses the constructor to get some entropy. Consider this (arguably badly coded) example, where someone might use it instead of crypto.randomBytes.

var someEntropy = (new Buffer(256)).toString('hex')

Updating to a new node major version with the proposed new Buffer API would not break the code, but would introduce (an even higher) security risk to that code, without the user noticing right away.

I just wanted to point that out for the sake of argument.
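For comparison, this is what the same line looks like with crypto.randomBytes, which draws from a cryptographically strong source instead of whatever happens to be in uninitialized memory:

```javascript
'use strict'

var crypto = require('crypto')

// 256 random bytes, hex-encoded (two hex characters per byte).
var someEntropy = crypto.randomBytes(256).toString('hex')

console.log(someEntropy.length) // prints 512
```

Code written this way is unaffected by whether Buffer(number) is zero-filled or not.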

mcollina commented 8 years ago

@finnp that usage is so insecure that I would flag it as a security vulnerability, by the same reasoning as this issue.

So, I'm :-1: on anything that makes node (or apps that run on node) slower. However, I see the security vulnerability.

My gut tells me: let's benchmark this change in node. Let's benchmark this change in userland (all the popular modules have benchmarks), and let's see where we stand. Does express get slower? Or Hapi? Or socket.io? To what degree?

I think the problem is that malloc and encodings should not be on the same API. I propose to deprecate anything that is not new Buffer(number), and prepare a factory method for all the other cases. new Buffer(number) will stay like malloc, and we migrate all users to a safe API, that does not support the 'number' argument.

steve-gray commented 8 years ago

I think anyone depending on that kind of buffer entropy is already walking in the land of the undocumented, and the security risk represented by the issue versus the pretty modest performance ding make this one a no brainer. To put this back into a C context, returning memory without a memset in response to a 'client' request is effectively insanity.

joepie91 commented 8 years ago

+1 on making uninitialized Buffer creation a separate function in the API, and removing it from the "type-guessing" constructor, or replacing it with a 'safe' variant. This kind of potentially dangerous functionality should be explicitly opt-in.

I'm not sure why there's any discussion about it at all, to be honest. It's well understood that having safe defaults considerably decreases the chance of vulnerabilities, and moving uninitialized creation to a separate function would not incur any performance issues for maintained code (because they can simply use the new function if they know what they are doing, and actually intend uninitialized creation).

The unmaintained code would not receive security patches anyway, and in those cases a slowdown is better than an outright vulnerability. Node.js, as a project that follows semantic versioning, is in a fairly unique position to actually be able to make such patches without widespread ecosystem breakage.

@silverwind: Instead of introducing new API, how about an opt-in?

I would vote against a global opt-in. It doesn't really fix the problem (as a single intended opt-in means your entire codebase is now unprotected), while still introducing the same kind of break. That, plus global state gets messy in general.

@trevnorris: The concern is when a number is passed when something else is expected. Again this is a bug in the developers code, and TBH it doesn't make much sense to impact the performance of the many modules out there today and force them to update their code to use new syntax because of that.

It absolutely does. Safe defaults are the single most important thing a language/runtime can possibly provide for fostering a secure ecosystem around it. No matter how competent people are, they make mistakes, and the number of possible footguns should be reduced to an absolute minimum. The code change would be trivial, and provide significant security gains.

@trevnorris: And let's be honest, in the years that Buffer has worked exactly this way how many times has this been reportedly seen as an issue in the wild. The only reason you're here now is because it affected modules you were directly involved with.

Observance of security issues does not correlate to severity or how badly they need to be solved. Not to mention that you have no idea whether this has been discovered (and exploited) in the wild or not, nor any way to check, nor would the perpetrators have any reason to disclose as such.

The security community has spent several decades now understanding how to assess the severity of a security issue and how to prevent common issues, and it would probably be advisable to work from that knowledge, rather than repeating that entire process again in the Node.js community and exposing many users to danger in the process.

@mscdex: I'd personally be more open to a separate method that created zeroed-out Buffers, that way it's an explicit opt-in and you are (or should be) at that point aware of the tradeoff you're making.

Secure opt-ins don't work. See also this and the general principle of people following the path of least resistance.

ChALkeR commented 8 years ago

While I first thought that making Buffer(number) zero-filled by default is a good idea, I do not think that now. Imo it should better be deprecated and replaced by two separate methods.

Making Buffer(number) zero-filled will solve the issue long-term, but will cause havoc short-term. For example, suppose Alice writes a lib, sees that with Node.js 6.x, 7.x etc. Buffer(number) is zero-filled, and does not zero-fill the buffer in her code (she doesn't want to zero-fill it twice!). If Bob then uses that lib under Node.js 5.x, where Buffer is not zero-filled by default, it will cause more damage than it solves.

Documenting that as «zero-fills since 6.x, zero-fill it twice if you use a lower version» would not help: Alice could think «I don't need 5.x» and publish a lib that does not manually zero-fill Buffers, but some other person could use that lib under 5.x and would be affected. There is no way around that: if Buffer(number) becomes zero-filled by default, the user will have to choose between zero-filling it two times in newer versions of Node.js or not zero-filling it at all in less recent versions of Node.js (the third option would be to write some ugly version-detection code). Guess what the user will choose?

Alternate proposal:

  1. Introduce Buffer.allocateRaw(number) to allocate a non-zero filled Buffer, document it, try hard to note the possible security consequences of careless usage.
  2. Introduce Buffer.allocate(number) to allocate a zero-filled Buffer.
  3. Soft-deprecate Buffer(number) (first in doc for at least a whole major release, perhaps), tell people to use Buffer.allocate(number) instead.
  4. Introduce a command-line flag for opt-in to zero-filling all Buffer(number) calls, so that the topmost app could enforce zero-filling all Buffers. I think this point was already discussed and supported.
  5. Hard-deprecate Buffer(number), point people to Buffer.allocate(number). Perhaps make Buffer(number) zero-filled by default at the same time (but don't let people rely on that) and remove the command-line flag, because it being hard-deprecated would mean that the usage is low and no new code is expected to use that.

Thoughts?

This will also solve unintentional calls to Buffer(number) when a person wanted to call Buffer('200') instead (and called Buffer(200)).
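The two factory functions in this proposal can be approximated in userland today. The sketch below is hypothetical: allocate and allocateRaw are the proposal's names written as plain functions, not existing Buffer methods, and the feature detection assumes that newer Node.js releases provide Buffer.alloc (zero-filled) and Buffer.allocUnsafe (uninitialized) while older ones only have the constructor.

```javascript
'use strict'

// Reject anything that is not a non-negative, finite integer, so a string
// like '200' can never be mistaken for a size.
function assertSize (size) {
  if (typeof size !== 'number' || !isFinite(size) ||
      size < 0 || size !== Math.floor(size)) {
    throw new TypeError('size must be a non-negative integer')
  }
}

// Zero-filled allocation (the proposal's Buffer.allocate).
function allocate (size) {
  assertSize(size)
  if (typeof Buffer.alloc === 'function') return Buffer.alloc(size)
  var buf = new Buffer(size) // older Node.js: zero-fill by hand
  buf.fill(0)
  return buf
}

// Uninitialized allocation (the proposal's Buffer.allocateRaw).
// Callers must overwrite the contents before exposing them.
function allocateRaw (size) {
  assertSize(size)
  if (typeof Buffer.allocUnsafe === 'function') return Buffer.allocUnsafe(size)
  return new Buffer(size)
}

console.log(allocate(4).toString('hex')) // prints "00000000"
```

Because both helpers only accept numbers, the Buffer('200') versus Buffer(200) confusion mentioned above turns into an immediate TypeError.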

ChALkeR commented 8 years ago

@finnp Anyone who uses the code you provided is already doomed.

vkurchatkin commented 8 years ago

At this point it's easier just to soft-deprecate Buffer altogether and encourage devs to use typed arrays instead.

mcollina commented 8 years ago

At this point it's easier just to soft-deprecate Buffer altogether and encourage devs to use typed arrays instead.

I think this is even harder than changing Buffer api, as Buffer is part of the Stream API, and also a lot of C++ code.

ChALkeR commented 8 years ago

@vkurchatkin Buffer(number) has much lower usage than general Buffer, plus there are the performance concerns of Buffer vs typed arrays. Let's be realistic here and let's try to solve this somehow. Deprecating Buffer, if that would ever happen, will take a long time.

vkurchatkin commented 8 years ago

I think this is even harder than changing Buffer api, as Buffer is part of the Stream API, and also a lot of C++ code.

it shouldn't be that hard to change, since Buffers ARE typed arrays now.

@ChALkeR

plus the performance concerns of Buffer vs typed arrays

The only concern is instantiation. We can still provide a function to create uninitialized typed array or array buffer.

Let's be realistic here and let's try to solve this somehow. Deprecating Buffer, if that would ever happen, will take a long time.

I mean not really deprecating them, but discouraging and maybe eventually removing from docs.

jasnell commented 8 years ago

I'm +1 on the general approach @ChALkeR suggests in https://github.com/nodejs/node/issues/4660#issuecomment-171262864

By introducing two new alternative factory functions (one safe, one unsafe) and deprecating the current unsafe constructor, we ensure that existing code isn't adversely affected while giving new code an appropriate path forward. Changing the expected behavior of the Buffer constructor or even deprecating Buffer altogether does not address all of the requirements here.

One tweak I would make to @ChALkeR's suggestion is: instead of allocate and allocateRaw, I would use allocateSafe and allocateUnsafe, to make sure it's clearly indicated that there is a safety/security choice being made.

Fishrock123 commented 8 years ago

Some comments, in no particular order:

At this point it's easier just to soft-deprecate Buffer altogether and encourage devs to use typed arrays instead.

@vkurchatkin Except that Buffer has extra APIs that make it more useful both to core and userland.

20-25% is actually quite a lot slower, true -- but how many calls to Buffer are using Buffer(Number) and require high performance? It seems as though this being fixed in core will bring great gratitude from newcomers and old hats alike. As a relatively new node programmer, I am certainly grateful - my memory won't be stolen!

@karissa Core's are probably most important, but if you are, say, running a heavy realtime websocket application, and ws were using zero-filled buffers for everything, you'd likely lose a lot of throughput.

In Python, slow performance is sort of a given. If you want to make a Python program more performant, you can optimize it in certain ways, which are well known. Performance then is an advanced problem. The old adage "make it work, then make it fast" works well for these kinds of dynamic, interpreted languages.

Python is a tricky comparison because the language just comes with everything. Sure, we expect JavaScript to not execute as fast as C, or even Java, but in Node, being able to achieve high throughput of I/O operations is pretty darn important. That's what we do in a nutshell after all.

Secure opt-ins don't work. See also this and the general principle of people following the path of least resistance.

I would generally agree with this.

Alternate proposal:

  1. Introduce Buffer.allocateRaw(number) to allocate a non-zero filled Buffer, document it, try hard to note the possible security consequences of careless usage.
  2. Introduce Buffer.allocate(number) to allocate a zero-filled Buffer.
  3. Soft-deprecate Buffer(number) (first in doc for at least a whole major release), tell people to use Buffer.allocate(number) instead.
  4. Introduce a command-line flag for opt-in to zero-filling all Buffer(number) calls, so that the topmost app could enforce zero-filling all Buffers. I think this point was already discussed and supported.
  5. Hard-deprecate Buffer(number), point people to Buffer.allocate(number). Perhaps make Buffer(number) zero-filled by default at the same time (but don't let people rely on that) and remove the command-line flag, because it being hard-deprecated would mean that the usage is low and no new code is expected to use that.

I think this is the most sound proposal here, although I wouldn't mind tweaking it like @jasnell pointed out.

mscdex commented 8 years ago

Alternate proposal:

Introduce Buffer.allocateRaw(number) to allocate a non-zero filled Buffer, document it, try hard to note the possible security consequences of careless usage.

-1

Introduce Buffer.allocate(number) to allocate a zero-filled Buffer.

+1

Soft-deprecate Buffer(number) (first in doc for at least a whole major release, perhaps), tell people to use Buffer.allocate(number) instead.

-1

Introduce a command-line flag for opt-in to zero-filling all Buffer(number) calls, so that the topmost app could enforce zero-filling all Buffers. I think this point was already discussed and supported.

+2

Hard-deprecate Buffer(number), point people to Buffer.allocate(number). Perhaps make Buffer(number) zero-filled by default at the same time (but don't let people rely on that) and remove the command-line flag, because it being hard-deprecated would mean that the usage is low and no new code is expected to use that.

-1

mcollina commented 8 years ago

I agree with @jasnell. Buffer.allocateSafe() and Buffer.allocateUnsafe() are the best ones. I would shorten them up to Buffer.safe() and Buffer.unsafe(), but that's my taste.

Fishrock123 commented 8 years ago

@mscdex I actually disagree more with that, allocate() does not sound safe-by-default to me.

silverwind commented 8 years ago

Imho, a new API would just complicate things unnecessarily. +1 for a --safe-buffers flag.

ChALkeR commented 8 years ago

@silverwind Just an opt-in command-line flag will not fix anything in the ecosystem. It will fix things only for those setups that care, and those will receive unneeded performance penalties (by double-zeroing the Buffers). It should be viewed only as a temporary measure.

mscdex commented 8 years ago

@Fishrock123 I was +1 to that particular idea, not the specific function name. I do agree a better name could be chosen if that route was taken (allocZeroed()?).

silverwind commented 8 years ago

@ChALkeR we could just make it --unsafe-buffers if we really want to go that far, but I think a few people here would rather have zeroed buffers as an opt-in because performance is a key feature of Node.js after all.

ChALkeR commented 8 years ago

@silverwind

When trying to solve a problem, we should ask these questions:

  1. How should this have been done from the start, putting the current situation aside?
  2. How do we get there in a way that solves things asap, but still breaks the ecosystem in the least possible way?

Preserving new Buffer(string), new Buffer(number), and adding just the weird command-line flag that affects the behaviour of the latter is not how this should have been done.

joepie91 commented 8 years ago

I'm a bit iffy about a long hard-deprecation period (as this is a critical security issue that will affect existing applications right now, likely without awareness of the operator), but in principle, I agree with @ChALkeR's suggestion to deprecate the current uninitialized constructor entirely, and to introduce two new methods.

That having been said, I think @jasnell's suggested tweak is a very good idea, and even an essential one, from an "encouraging safe implementations where possible" point of view. Rust takes the same approach with unsafe blocks.

It requires the developer to commit to the choice of using something 'unsafe', which will make them less likely to do so without understanding the consequences. As a bonus, it acts as a 'red flag' to others reviewing the code - a clear indication that they need to be careful when modifying it, as an 'unsafe' feature is being used.

I think the --safe-buffers flag as a temporary mitigation strategy is a good proposal as well, but it should absolutely not be the only mitigation against this issue in the long term. It comes with a severe performance penalty, and will thus introduce a "security vs. performance" trade-off, and I think we all know how that is going to turn out when managers get involved.

ChALkeR commented 8 years ago

@joepie91 A hard deprecation done sooner would leave us in a situation where module authors can't use the old API because it's hard deprecated and can't use the new API because old Node.js versions do not support it.

I guess that I will not object to hard-deprecating Buffer(number) sooner than one major release away, if all the other supported Node.js branches receive the two new methods (as backports). But I doubt that anyone else would support that.

joepie91 commented 8 years ago

I guess that I will not object to hard-deprecating Buffer(number) sooner than one major release away, if all the other supported Node.js branches receive the two new methods (as backports). But I doubt that anyone else would support that.

Releasing a minor update for existing branches with the new methods would probably be a good idea anyway, for module compatibility reasons. Requiring developers to "detect" which API to use is a great way to ensure that they will just use the old (dangerous) API for as long as they can get away with it instead...

trevnorris commented 8 years ago

To be clear, I'm not suggesting that node core should use zeroed-out buffers.

This is mostly moot. The majority of memory for core purposes is allocated on the native side, bypassing the ArrayBufferAllocator. The reason for how it operates today was always done for the best interest of users, and our assumption that developers understood the clearly documented API they were using.

The API is optimized for convenience: you can throw any type at it, and it will try to do what you want.

This makes it sound like Buffer does a best guess. Every acceptable value is documented, and what will happen with that value is also documented.

The difference with modules like child_process and fs is that they're very obviously doing powerful and sensitive operations.

node is full of operations that allow us to do powerful, OS-interactive things with a simplistic API. This is one of the most powerful aspects of node. I remember when I first started using node (and was completely new to server-side development in general) I did all sorts of stupid things to my machine. I learned, admitted my stupidity and didn't do it again. Truth be told I'm definitely not done doing stupid things.

okdistribute commented 8 years ago

Those who want performance will figure out how to get it.

Let's zero fill buffers by default. Please let's not create a mysql_real_escape_string situation. Just fix it.

Karissa McKelvey http://karissa.github.io/

ChALkeR commented 8 years ago

@karissa Zero-filling new Buffer(number) by default will actually make the situation worse from the security point of view, taking the current ecosystem into account. See https://github.com/nodejs/node/issues/4660#issuecomment-171262864 for explanation and an alternate approach.

silverwind commented 8 years ago

(added a security label; the memory label is for memory leaks)

ChALkeR commented 8 years ago

Btw, I made that old note (that was unfinished before) public: https://github.com/ChALkeR/notes/blob/master/Buffer-knows-everything.md

feross commented 8 years ago

This makes it sound like Buffer does a best guess. Every acceptable value is documented, and what will happen with that value is also documented.

You're right: it's already documented. But the disclosure is a total of 39 words amid a 6,000-word document. We can blame users for not reading the docs – but there's no denying it's bad API design to mix safe and unsafe functionality in the same constructor. It's laying a trap for the user.

node is full of operations that allow us to do powerful, OS-interactive things with a simplistic API. This is one of the most powerful aspects of node.

It's interesting that you bring up other core APIs.

There are many core APIs that helpfully coerce from String to Number, or the reverse. For example, setTimeout(fn, '1000') becomes setTimeout(fn, 1000).

Even buffer itself does coercion: buf[0] = '40' sets the index to the number 40, and buf.readInt8('0') reads out the value from index 0. Both automatically coerce from String to Number.

Can users really be blamed for expecting new Buffer(myVariable) to be safe?
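As a rough illustration of the guard that callers currently have to remember, here is a stricter variant of the toHex example from the top of this issue. The name `toHexStrict` is hypothetical, and `Buffer.from` is an API that landed in Node.js after this discussion, used here only so the sketch runs on current releases.

```javascript
'use strict'

// Hypothetical stricter variant of toHex: refuse anything that is not a
// string, so a Number can never reach a size-based allocation path.
function toHexStrict (str) {
  if (typeof str !== 'string') {
    throw new TypeError('toHexStrict expects a string, got ' + typeof str)
  }
  // Buffer.from(string, encoding) copies the string's bytes; it never
  // interprets its argument as a length.
  return Buffer.from(str, 'utf8').toString('hex')
}

console.log(toHexStrict('hi')) // prints "6869"

try {
  toHexStrict(100) // a Number is rejected instead of allocating 100 bytes
} catch (err) {
  console.log(err.message)
}
```

The point of the sketch is that the type check lives in user code; the argument in this thread is that the API itself should make this mistake hard to commit.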

feross commented 8 years ago

I also think the existence of Uint8Array and the other TypedArrayView types plays a role here.

Buffer's similarity to these safe typed arrays lulls users into a false sense of security. The fact that Buffer is now also a subclass of Uint8Array makes the potential for confusion even greater.

bricss commented 8 years ago

Every new Buffer should allocate fresh, sandboxed memory that is memset to zeros across its full size.

ChALkeR commented 8 years ago

@bricss Switching to zero-filled (by default) Buffer(number) will most likely make things even worse in the current situation, even from the security point of view. Please read my explanation above.

feross commented 8 years ago

@ChALkeR Btw, the point you raised is valid and I hadn't considered it. Thanks for bringing it to light. Any solution should take that into account.

ChALkeR commented 8 years ago

@feross What do you think about the proposal at https://github.com/nodejs/node/issues/4660#issuecomment-171262864?

jasnell commented 8 years ago

@nodejs/ctc: Distilling the conversation up to this point: the existing behavior of Buffer(number) is likely not going to change. Given that zero-filling by default would be a significant breaking behavioral change, it would take at least one full semver-major cycle to do and would make Buffer(number) behavior expectations different across different versions of Node, which will lead to more issues than it would solve. Given that, changing the current behavior of Buffer(number) is unlikely to happen. Scanning through the thread, I believe there is consensus on this point among the core contributor participants.

There also appears to be consensus among the core contributors that adding a new factory method for creating a "safe" zero-filled Buffer instance is a good thing. There is some disagreement over what to call it. I believe simply calling it Buffer.safe(number) is sufficient. The implementation of Buffer.safe(number) would essentially be return new Buffer(n).fill(0).

There appears to be a difference of opinion among the core contributors about whether the Buffer(number) constructor should be deprecated and replaced by a Buffer.unsafe(number) type of factory method. I can understand the reasoning for this. For my part, I'm in favor of introducing a new Buffer.unsafe(number) factory method to be symmetrical with Buffer.safe(number) -- which would actually make it easier to understand. But I get why folks don't want Buffer(number) deprecated and I can live with that.
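A minimal sketch of what the symmetrical pair could look like, under stated assumptions: the thread proposes attaching these to Buffer itself, whereas the sketch uses a hypothetical `BufferFactories` object to avoid patching a built-in. The `assertSize` helper is also hypothetical, and `Buffer.allocUnsafe` is a later Node.js API used when available so the sketch runs on current releases.

```javascript
'use strict'

// Hypothetical validation helper: both factories accept only a
// non-negative integer size.
function assertSize (n) {
  if (!Number.isInteger(n) || n < 0) {
    throw new TypeError('size must be a non-negative integer')
  }
}

// Hypothetical namespace standing in for Buffer.safe / Buffer.unsafe.
var BufferFactories = {
  // Uninitialized memory: the caller must overwrite every byte before the
  // buffer is exposed to anyone else.
  unsafe: function (n) {
    assertSize(n)
    return typeof Buffer.allocUnsafe === 'function'
      ? Buffer.allocUnsafe(n)
      : new Buffer(n)
  },

  // Zero-filled memory: essentially the allocate-then-fill(0) approach
  // described in this comment.
  safe: function (n) {
    var buf = BufferFactories.unsafe(n)
    buf.fill(0)
    return buf
  }
}

console.log(BufferFactories.safe(4))
console.log(BufferFactories.unsafe(4).length)
```

With both names spelled out, a reviewer can tell at the call site which guarantee the author relied on, which is the readability argument made for the symmetrical naming.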

There appears to be consensus among the core contributors about having a command line switch that would explicitly change the default behavior of Buffer(number) to zero-fill by default.

There also appears to be consensus that the documentation can do a better job of spelling out the risks.

The tl;dr version is:

If we went with this now, we would still have the ability to add Buffer.unsafe(number) and deprecate Buffer(number) later.

I suggest we go with this approach.

joepie91 commented 8 years ago

Change or Deprecate Buffer(number): No

This is unacceptable. This API is extremely dangerous and likely already an issue in many applications due to the ease of introducing it (any typed user-supplied input is a problem!). It absolutely needs to be deprecated as soon as possible, at the very least a soft deprecation.

ChALkeR commented 8 years ago

@jasnell Without deprecation of Buffer(number), this makes no sense. There is already an opt-in (Buffer(number).fill(0)) for those who care, but users should be guarded from inadvertently creating uninitialized Buffers in the long term, so Buffer(number) must be deprecated or changed. Changing it will break more stuff, so the deprecation must happen.

jasnell commented 8 years ago

@ChALkeR ... ok. As I said, I prefer that option also. @trevnorris and @mscdex , if we had symmetrical Buffer.unsafe(number) and Buffer.safe(number) with a soft-deprecated Buffer(number), is that something you could live with?