buffer: discuss future direction of Buffer constructor API

addaleax commented 7 years ago

In today’s CTC meeting we discussed reverting the DeprecationWarning for calling Buffer without new that was introduced in v7 (PR up here), and it became clear that we need to come up with a long-term plan on what exactly we want to achieve, how to do that and improve our messaging about it, both before and after our actions. I’ll try to sum up what exactly we are talking about; obviously, I am somewhat biased, having been involved in plenty of the previous discussion here. (This has still gotten pretty long btw, so I hope a lot of people will find the information in here useful enough to warrant a Wall of Text.)

The Buffer constructor has the usability flaw that it accepts input with different type signatures, so new Buffer('abcdef') and new Buffer(100) will both return valid buffers, and in the latter case, the Buffer will contain 100 bytes of uninitialized memory. This is a security problem for two reasons:

When passing unvalidated user input (e.g. from a JSON request) to the Buffer constructor where a string is expected but a number is actually passed, uninitialized memory will be returned:

// This is a dangerous example of converting a string to Base64!
new Buffer(someStringReceivedFromTheUser).toString('base64')

Passing the value 100 here will return a slice of memory that may contain garbage, but generally can contain any value previously stored in memory, including credentials, source code, and much more. @ChALkeR has a pretty good write-up of this: https://github.com/ChALkeR/notes/blob/master/Buffer-knows-everything.md

Accidentally accepting large numeric values can very quickly increase resource usage, and can be turned into a Denial-of-Service attack against vulnerable applications.

Again, @ChALkeR has a very good write-up on these security issues at https://github.com/ChALkeR/notes/blob/master/Lets-fix-Buffer-API.md. It predates the current Buffer.alloc()/Buffer.from() situation, but it contains a helpful FAQ with answers to questions like “Why not just make Buffer(number) zero-fill everything by default?”.
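For comparison with the dangerous Base64 example above, here is a minimal sketch of a safer conversion. The helper name toBase64 is made up for illustration; the point is only that the type is checked explicitly and that Buffer.from() never interprets its argument as a size.

```javascript
// Hypothetical helper: convert untrusted input to Base64 without ever
// allocating a buffer from a number.
function toBase64(input) {
  // Reject anything that is not a string before it reaches Buffer.
  if (typeof input !== 'string') {
    throw new TypeError('expected a string, got ' + typeof input);
  }
  // Buffer.from() copies the string's bytes; it throws a TypeError when
  // given a number instead of returning uninitialized memory.
  return Buffer.from(input, 'utf8').toString('base64');
}
```

With this shape, a JSON request that carries 100 where a string was expected is rejected instead of producing 100 bytes of whatever was in memory.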

So far, in Node v6.0.0 the safer Buffer.alloc()/Buffer.from() API was introduced and later backported to the v5.10.0 and v4.5.0 releases. Additionally, v6.0.0 came with a documentation-only deprecation of the old Buffer() API.

In June, https://github.com/nodejs/node/pull/7152 was opened, which seeks to deprecate the old Buffer() API using a runtime deprecation, i.e. printing a single warning per Node process when Buffer() or new Buffer() is executed for the first time. Currently, that PR is still open. A reduced version of it, https://github.com/nodejs/node/pull/8169, was landed as a semver-major change in v7.0.0, that emits and displays DeprecationWarnings for uses of Buffer() only, but excludes uses of new Buffer().
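To make "a single warning per Node process" concrete, here is a rough sketch of a warn-once wrapper. It is not the code from either PR; deprecateOnce and its emit parameter are hypothetical names used only for illustration.

```javascript
// Hypothetical sketch of a warn-once deprecation wrapper.
function deprecateOnce(fn, message, emit) {
  // By default, report through process.emitWarning(), which prints to
  // stderr unless the process was started with --no-deprecation.
  emit = emit || function (msg) {
    process.emitWarning(msg, 'DeprecationWarning');
  };
  let warned = false;
  return function deprecated(...args) {
    if (!warned) {
      warned = true; // only the first call emits anything
      emit(message);
    }
    return fn.apply(this, args);
  };
}
```

The wrapped function keeps working exactly as before; the only observable difference is the one-time warning.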

I had summarized the goals and possible actions before that decision was made in https://github.com/nodejs/node/pull/7152#issuecomment-241355246 ¹; and @jasnell has written a then-current long-term plan in https://github.com/nodejs/node/pull/7152#issuecomment-240753218 that would include runtime deprecations of new Buffer() in v8.0.0 and later actual breaking changes to the Buffer constructor.

The reason for this distinction was trying to keep the possibility of making Buffer a proper ES6 class at some point in the far, far future open, which would mean that calling new Buffer() may always work. (Effects of turning Buffer into a class would be proper subclassability and breaking Buffer() without new. It is, however, completely possible to add a separate class to the API that would behave like the current Buffer implementation does, only with these differences.)

As a result of that deprecation for Buffer() without new in v7.0.0, significant pushback from well-known members of the community ensued, both in the threads on https://github.com/nodejs/node/pull/7152 and https://github.com/nodejs/node/pull/8169. On the one hand, it became clear that we failed in our messaging to make clear that the primary motivation for that change was helping our users avoid serious security issues; on the other hand, the added deprecation warning seemed to be incongruent with the expectations of stability and backwards compatibility that module authors and consumers have, as far as Node core is concerned.

As a result of this, the CTC decided to consider reverting the deprecation warning, possibly temporarily, and the corresponding PR is in https://github.com/nodejs/node/pull/9529. The decision on that has yet to be made, but the desire has been expressed to reach a decision soon to limit the number of v7.x versions with possibly incongruent behaviour.

From following the discussions, it is obvious that the path forward is a contentious issue; right now, the opinions range from never introducing a runtime deprecation for any version of the Buffer constructor, to applying one for all uses of it at the next semver-major release in v8.0.0.

The strongest and most frequently expressed argument for fully runtime-deprecating the Buffer constructor soon remains that users may not be aware that parts of their application use an unsafe API and should be warned about that.

On the other side, the warning itself is perceived as a very disruptive change to the ecosystem, suggesting that it is definitely worth exploring alternative ways to reduce the usage of both Buffer() and new Buffer().

/cc @nodejs/collaborators

¹ It may or may not be obvious from the way I articulate my thoughts here – I try to stick to stating facts – but in hindsight, I regret writing it this way.

seishun commented 7 years ago

It's a well worn argument at this point that unexpected prints to stderr are a breaking change.

No one argues about that, but IMO the possible actual breakage caused by extra stderr output is a relatively minor pain point of the runtime deprecation, and not the one people are complaining about. People are mostly concerned about the amount of work that it would take to remove the warnings (which in most cases will cause annoyance, not breakage), which I do acknowledge.

If this is the only solution then the change is a breaking change. It should be treated as breaking all the modules that depend on the behavior as well as all the modules that depend on those modules and so on.

In most cases, simply ignoring the warning is also a solution. In most cases, the code will continue working as intended.

jasnell commented 7 years ago

@seishun ... do not forget that there are many users who run with --throw-deprecation. For those, a new deprecation warning is a new runtime error. It's currently not possible for us to determine just how many such users exist.
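As a small illustration of why the flag matters: the same DeprecationWarning is printed, thrown, or dropped depending on how the process was started. The sketch below reads the documented process.noDeprecation and process.throwDeprecation properties; the helper name deprecationMode is hypothetical, and the precedence (silencing wins over throwing) is an assumption about how the two flags combine.

```javascript
// Hypothetical helper: report how a DeprecationWarning would be handled.
// `proc` defaults to the real process object; it is a parameter only so
// the function can be exercised with stand-in objects.
function deprecationMode(proc) {
  proc = proc || process;
  if (proc.noDeprecation) return 'silent';   // started with --no-deprecation
  if (proc.throwDeprecation) return 'throw'; // started with --throw-deprecation
  return 'warn';                             // default: print to stderr
}
```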

ChALkeR commented 7 years ago

For instance, modules that do assume zero-fill by default may want to proactively check that they are running on versions of Node.js that do zero-fill by default,

I don't think that would work and we would have no means to check that in runtime. A question: if we enable zero-filling by default, how would you estimate the percentage of library authors who will assume zero-filling but won't check the Node.js version?

Also note that such combination is not better and does not involve less code than just using the new API where it's available.
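For reference, using the new API where it's available can be as small as the sketch below. The helper name allocZeroed is made up; the fallback branch only exists for releases that predate Buffer.alloc().

```javascript
// Hypothetical helper: zero-filled allocation that does not depend on the
// Node.js version zero-filling by default.
function allocZeroed(size) {
  if (typeof size !== 'number') {
    throw new TypeError('size must be a number');
  }
  if (typeof Buffer.alloc === 'function') {
    return Buffer.alloc(size); // new API: always zero-filled
  }
  // Releases without Buffer.alloc(): allocate, then fill explicitly.
  const buf = new Buffer(size);
  buf.fill(0);
  return buf;
}
```

This detects the feature rather than parsing process.version, so it does not need to know which releases received the backport.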

@mikeal, @jasnell re: linting and other means — note that the runtime deprecation in v7 significantly reduced the Buffer-without-new usage, but it was visibly increasing ever since the deprecation was reverted. The doc-deprecation and whatever other means are currently in action don't work well atm. Note: I'm not saying that adding that rule to linters is a bad idea, it's a great idea.

Everyone, also please note the fact that zero-filling doesn't solve the quite popular Buffer(arg).toString('base64') issue (it will still be a DoS), while deprecation does.

jasnell commented 7 years ago

@ChALkeR ... yep. I tend to see it as ripping the bandaid off faster. However, we cannot discount the number of module developers with significant investment in the ecosystem who are telling us that runtime deprecation would be bad, so I'm trying to figure out a compromise path. If we go the route the ecosystem developers are asking us to take, then those ecosystem developers have to take on a certain amount of responsibility in solving the issue. If we take that action and we do not see measurable improvement, then going with the runtime deprecation is definitely a step we can take to help force the issue. You're absolutely right that we cannot reasonably estimate the risks incurred by switching to zero-fill by default.

Again, I do not see a good solution here. I see degrees of bad in each option. Basing the decision on a binary Will Work or Won't Work is not going to be very effective.

mikeal commented 7 years ago

@mikeal, @jasnell re: linting and other means — note that the runtime deprecation in v7 significantly reduced the Buffer-without-new usage, but it was visibly increasing ever since the deprecation was reverted. The doc-deprecation and whatever other means are currently in action don't work well atm. Note: I'm not saying that adding that rule to linters is a bad idea, it's a great idea.

I think that working these kinds of deprecations into the tooling is probably going to be even more effective than runtime deprecation. Anecdotally, people who won't use any tooling for linting or security are likely to just run without dep warnings or ignore them. For people that use this tooling this will actually be a bigger error, one they have to alter their code to get past, than the runtime warning.

seishun commented 7 years ago

@mikeal would such tooling help to find usage of deprecated API in dependencies? Because as far as I can tell, fixing it in your own code is trivial. It's figuring out which dependencies need to be updated and accommodating any major changes in them that will require the most work.

ChALkeR commented 7 years ago

@jasnell It's not binary, I just simplified things a bit so that it would better fit into a comment.

Perform an estimation of the probability that a single random module developer would think about only new versions and will rely on zero-fill, multiply it by the chance that that developer won't care about version detection, then multiply that by the ecosystem size (well, the part that uses Buffer API, but that's still very huge). I expect to see lots of modules that start doing bad stuff on versions where zero-filling isn't enabled if we choose the zero-fill path.

Also, the unsolved DoS.

That's basically why I said that it «won't work».

Qard commented 7 years ago

Approaches like pushing for more linter warnings are nice, but the problem is not so much new code being released as modules depending on that one tiny module that hasn't been updated in two years and therefore has no one looking at the code.

The only real way I can think of to deal with it is to get a dump of every module in npm, grep (or parse the AST, if one is so inclined) for any matching uses of the Buffer constructor in all of them and blanket contact all the module authors.
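A very crude version of the grep approach might look like the sketch below. It is a line-based regular expression, not an AST walk, so it will also flag matches inside comments and strings; the function name is hypothetical.

```javascript
// Hypothetical scanner: report lines that appear to call Buffer(...) or
// new Buffer(...). Static methods such as Buffer.from( are not matched,
// and neither are identifiers that merely end in "Buffer".
function findBufferConstructorCalls(source) {
  const pattern = /(?:^|[^\w$.])(?:new\s+)?Buffer\s*\(/;
  const hits = [];
  source.split('\n').forEach(function (line, index) {
    if (pattern.test(line)) {
      hits.push({ line: index + 1, text: line.trim() });
    }
  });
  return hits;
}
```

Running something like this over a dump of published packages would give a list of candidate files to inspect before contacting anyone.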

feross commented 7 years ago

I just released standard 10.0.0-beta.0 which includes a check for deprecated Node.js APIs, including use of the Buffer() constructor. Details here: https://github.com/feross/standard/issues/693#issuecomment-283592259

The final 10.0.0 release will happen one month from today, on April 1, 2017 after our usual one month testing period. You can follow the release progress here: https://github.com/feross/standard/issues/808

Note: I pushed this release out earlier than normal. I am eager to try this approach of using community tooling to discourage deprecated APIs far in advance of Node actually hard-deprecating those APIs. Let's see how this goes.

@ChALkeR How are you monitoring usage of the deprecated APIs? Can you keep us updated on how usage in the wild changes over the next few months?

fabioberger commented 7 years ago

Hi everyone! @seishun invited me to chime in with a real world use-case for extending Buffer that I've recently encountered. The EthereumJS project (https://github.com/ethereumjs) uses Buffer to encode values sent and received over the network. The Ethereum RPC protocol however, expects a specific kind of HEX encoding, different from that used by Buffer (https://github.com/ethereum/wiki/wiki/JSON-RPC#hex-value-encoding). One way to fix this inconsistency would be to extend the Buffer class and override the toString method to return the expected HEX value. This way the project which has come to rely heavily on Buffer would not need to be re-written using a replacement Buffer implementation.

Sample code:

module.exports = class EthBuffer extends Buffer {
    constructor(value, encoding = 'utf8', type = 'unformatted') {
        if (arguments.length === 2 && typeof value === 'string') {
            super(value, encoding)
        } else {
            super(value)
        }

        if (arguments.length === 2 && typeof value !== 'string') {
            type = encoding
        }
        this._type = type
    }
    toString() {
        const args = Array.prototype.slice.call(arguments)
        const val = super.toString.apply(this, args)
        if (args[0] === 'hex' && this._type === 'quantity' && val[0] === '0') {
            return val.substring(1)
        } else {
            return val
        }
    }
}

I hope this helps as you guys decide on how to proceed with the Buffer refactor! Good luck!

addaleax commented 7 years ago

@ChALkeR @seishun @nodejs/ctc How would you feel about a variant on zero-filling that doesn’t have the danger of people coming to rely on it? For example, we could pick a pseudorandom uint8 at process startup that we use for filling all buffers. (Or maybe the lower bits of uv_now(loop), if that’s better. You get the idea.)

I know this sounds like a really weird thing to do, but it:

Does not break expectations of existing code
Does not really provide behaviour that people would come to rely on that would harm people running outdated versions of Node
Adds a slight performance penalty to the old Buffer API (and therefore incentive to move away from it)
Solves the issue of accidental information leaks
Has performance similar to just zero-filling

The only thing this would not address is the DoS issue.
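To illustrate the idea in userland terms (this is only a sketch, not how core would implement it): pick one byte at startup and use it as the fill value for every allocation. FILL_BYTE and legacyStyleAlloc are hypothetical names.

```javascript
const crypto = require('crypto');

// One byte chosen once per process. The value is kept in the range 1..255
// so that the result never looks zero-filled.
const FILL_BYTE = (crypto.randomBytes(1)[0] % 255) + 1;

// Stand-in for what Buffer(num) would return under this proposal: no
// leftover memory contents, but also no stable value to depend on.
function legacyStyleAlloc(size) {
  return Buffer.alloc(size, FILL_BYTE);
}
```

The fill value changes between runs, so code that assumes any particular contents behaves inconsistently and gets noticed.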

mscdex commented 7 years ago

"Slight" performance penalty? Have you measured this @addaleax ?

seishun commented 7 years ago

@addaleax That still has the same problem that I described in https://github.com/nodejs/node/issues/9531#issuecomment-273560353. Basically, the performance penalty might be significant for some, and they would have to spend time investigating where it's coming from.

addaleax commented 7 years ago

"Slight" performance penalty? Have you measured this @addaleax ?

Not explicitly. It would be weird if it benchmarked very differently from Buffer.alloc vs Buffer.allocUnsafe. So: Yeah, if you only look at the Buffer() calls itself, it would definitely be noticeable.
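A quick way to get a first number would be something like the sketch below. It is a naive micro-benchmark with made-up helper names, so treat the results as a rough indication only; small sizes may be served from the allocUnsafe pool, and real workloads will differ.

```javascript
// Time `iterations` allocations of `size` bytes using the given allocator.
function timeAllocations(allocate, size, iterations) {
  const start = process.hrtime();
  let checksum = 0; // touch each result so the loop has a visible effect
  for (let i = 0; i < iterations; i++) {
    checksum += allocate(size).length;
  }
  const diff = process.hrtime(start);
  return { ms: diff[0] * 1e3 + diff[1] / 1e6, checksum: checksum };
}

// Compare filled and unfilled allocation for one buffer size.
function compareAllocators(size, iterations) {
  return {
    zeroFilled: timeAllocations(function (n) { return Buffer.alloc(n); }, size, iterations),
    uninitialized: timeAllocations(function (n) { return Buffer.allocUnsafe(n); }, size, iterations)
  };
}
```

Calling compareAllocators() with a few sizes (for example 64 bytes, 64 KiB and 16 MiB) would show where the difference starts to matter.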

Basically, the performance penalty might be significant for some, and they would have to spend time investigating where it's coming from.

We are going to put whatever change we make into our release notes, so I wouldn’t feel too bad about that.

I’m bringing this up because it seems like zero-filling is actually something that a lot of CTC members would be on board with, and unless I’m missing something, this kind of filling would be a strictly better option than zero-filling.

seishun commented 7 years ago

We are going to put whatever change we make into our release notes, so I wouldn’t feel too bad about that.

Not everyone reads release notes though.

and unless I’m missing something, this kind of filling would be a strictly better option than zero-filling.

I suspect that on many platforms zero-filling is faster than garbage-filling.

feross commented 7 years ago

I like @addaleax's approach because it's the least disruptive one that's been proposed so far (other than "do nothing"), while still accomplishing the primary goals that we're all attempting to solve.

Like @jasnell said:

I do not see a good solution here. I see degrees of bad in each option. Basing the decision on a binary Will Work or Won't Work is not going to be very effective.

We should measure the true performance impact to get the full picture, but if the worst con of this approach is a performance hit on a deprecated API (that can be fixed by changing Buffer() to Buffer.from() in most cases) then this certainly sounds like the "least bad" option to me.

seishun commented 7 years ago

because it's the least disruptive one that's been proposed so far

I'm not exactly sure about the definition of "disruptive" used in this context, but assuming it means "things not working the intended way causing lowered productivity", then introducing a performance penalty is also disruptive, and I'm not convinced that the number of people who rely on high performance of new Buffer() is significantly lower than the number of people who rely on stderr output.

All else being equal, I'd say a disruption that's obvious is preferable to one that is hidden in release notes.

Note: I'm not dismissing non-stderr-related concerns regarding deprecation warning. I just disagree with using the term "disruption" to describe those.

mscdex commented 7 years ago

I agree with @seishun.

feross commented 7 years ago

The type of user who would be affected by a 10% (or whatever %) performance change in new Buffer() is probably not the same type of user who skips reading the release notes. Indeed, I would be surprised if there is any overlap between these two groups.

Anyone running a performance-critical app will not only read the release notes, but test their app and its performance before deploying a new version of Node.js in production.

But, yes, I concede that even this approach isn't perfect. Still, I believe it's the least disruptive one that has been proposed so far.

Trott commented 7 years ago

FYI: https://github.com/eslint/eslint/issues/5614#issuecomment-285517494

jasnell commented 7 years ago

Regardless of what we do here, there will be disruption. That cannot be avoided. The question is about which kind of disruption is most acceptable (i.e. "least bad"). @addaleax's suggestion to fill with a randomly selected value is a very good option here. Sure, there may be some people who do not read the release notes, there will be some who do not read the documentation, there will be some who do not read the tweets, etc... there's not going to be much we can do for those people other than not make the situation worse. We will definitely need to measure the performance hit but Buffer.allocUnsafe() is the solution for that.

It does not solve all of the issues, of course, so there will be more to do.

ChALkeR commented 7 years ago

@addaleax Sorry, that won't work, it doesn't differ from zero-fill too much. When I said «authors will start relying on zero-fill» I mean not that they would start relying on the Buffers to be literally filled with zeroes, but that they would start to rely on Buffer(num) being secure (when num isn't too big). See details below.

There are (roughly) three types of problematic packages there:

1. Ones that mix `Buffer(num)` and `Buffer(string|array)` usage and accept user input there.

Example: ws. See https://github.com/websockets/ws/releases/tag/1.0.1 for details (though ws was not the only package there).

Two types of problems:

Uninitialized Buffer memory leak. This would be fixed by zero/random fill.
DoS, where the attacker could simply kill the server with a few packets. This would not be fixed by zero/random fill.

Those types of packages are relatively rare, compared to other issues described below.

2. Ones that purposely call `Buffer(num)`, but fail to fill it in under some circumstances.

See https://github.com/ChALkeR/notes/blob/master/Buffer-knows-everything.md#how-to-avoid-leaks for an example of vulnerable code.

Example: ip. Before https://github.com/indutny/node-ip/commit/b2b4469255a624619bda71e52fd1f05dc0dd621f, ip.mask('::1', '0.0.0.0') was returning uninitialized Buffer memory.

I can't tell exactly how common that is, but that's definitely less common than the example 3 below.

No large numbers could get into Buffer(number) in that package, so DoS isn't possible there. zerofill/randomfill would have fixed it, and that's the problem here.

The problem is that under zerofill/randomfill, that code is secure, and on all current Node.js versions, including 6.10.0 and 7.7.2, it wouldn't be secure. Once the people start relying on leaking Buffer(small_number) results as not being a security issue (and some will do that immediately), things will get worse for users who did not update to latest patch version.

Also, the performance hit could be noticeable in both zerofill and randomfill, that is yet another problem here: you can't rely on all users to read the changelog carefully. What I expect to happen once we enable zero/randomfill without a deprecation, is that some users would update to the latest patch version, see a performance degradation, not bother digging into it, instead assume that it's a temporary bug, and roll back to an unpatched version. That would make things worse.

3. Ones that mix `Buffer(num)` and `Buffer(string|array)` usage in their API, but don't directly deal with user input.

Example: hoek. See https://github.com/hapijs/hoek/issues/177, fixed in hoek@4.0.0.

Module authors commonly say that it's not their problem, and that passing a number to their API is not supported (though they don't typecheck).

There is a large amount of those.

Those increase the scope of this problem and make this less traceable — if any user input ends up being passed to such an API in some other package or in some unpublished server code, it would become the same as in case 1 — at least a DoS and probably an uninitialized memory leak (if there is a remote user, they usually have some means of observing the result).

zero/randomfill won't solve the DoS here (pretty much the same as in case 1), but it will make module authors even less willing to fix their packages, once they hear that the issue is somehow «fixed» in Node.js (though it won't actually be). If you think that they won't — note that even the discussion here often totally ignores the DoS.

The only way that will make things secure, fixing both uninitialized Buffer memory leaks and DoS issues in users' code would be to migrate everything in ecosystem to Buffer.alloc/Buffer.from, and I don't see any way around that that would work long-term.
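To illustrate the second category above (code that allocates on purpose but does not always fill everything), here is a minimal sketch with a made-up helper. With Buffer.alloc() the bytes that never get written are zero on every release that has the new API, so correctness does not depend on which patch version the user runs.

```javascript
// Hypothetical encoder that writes only as many bytes as it was given.
function packBytes(values, size) {
  const out = Buffer.alloc(size); // explicitly zero-filled
  const count = Math.min(values.length, size);
  for (let i = 0; i < count; i++) {
    out[i] = values[i] & 0xff;
  }
  // Any bytes past `count` were never written; they are zero here, where
  // Buffer(size) on current releases would expose old memory contents.
  return out;
}
```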

misterdjules commented 7 years ago

@jasnell

Regardless of what we do here, there will be disruption.

Why is disruption inevitable? It seems to me that with a documentation-only deprecation, and by better documenting why various Buffer APIs are deprecated, there could be no disruption.

Currently, users have to jump to the top of the Buffer documentation to read about why these APIs are deprecated, and there's no link from each deprecated function's documentation to that section with further explanations.

Adding why these APIs are deprecated in the same section where they are documented could help convey the message that these APIs have, among other problems, security issues.

feross commented 7 years ago

@ChALkeR DoS, where the attacker could simply kill the server with a few packets. This would not be fixed by zero/random fill.

We need to stop bringing up DoS. It's out-of-scope and completely unrelated to the uninitialized memory disclosure issue.

We can't solve security for users determined to hurt themselves. By your logic, Buffer.alloc(num) is also vulnerable to a DoS attack because if a user doesn't check the size of num a remote attacker could trigger a DoS. This is out-of-scope and not Node's job to worry about.

If we want to make progress here, we need to agree to focus on how to prevent the memory disclosure issue. That issue, not a DoS issue, is what originally prompted the new Buffer APIs and it's the only valid reason to migrate users from Buffer() to these new APIs.
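For completeness, checking the size of num before allocating could look like the sketch below. The 1 MiB cap and the helper name are arbitrary choices for illustration; an application would pick its own limit.

```javascript
// Hypothetical upper bound for a single allocation.
const MAX_ALLOC_BYTES = 1024 * 1024;

// Allocate a zero-filled buffer only if `size` is a sane, bounded integer.
function allocBounded(size, limit) {
  limit = limit === undefined ? MAX_ALLOC_BYTES : limit;
  if (!Number.isInteger(size) || size < 0 || size > limit) {
    throw new RangeError('size must be an integer between 0 and ' + limit);
  }
  return Buffer.alloc(size);
}
```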

@ChALkeR Once the people start relying on leaking Buffer(small_number) results as not being a security issue

No reasonable user will treat random data being returned as "not an issue". Unlike returning a zeroed buffer, returning a random buffer looks very, very weird. The only solution in this case is for the user to manually zero it out or use fill(0).

I think the scenarios you're describing are getting increasingly ridiculous and so impractical that I'm actually more confident now that this is the right solution after all.

addaleax commented 7 years ago

When I said «authors will start relying on zero-fill» I mean not that they would start relying on the Buffers to be literally filled with zeroes, but that they would start to rely on Buffer(num) being secure (when num isn't too big).

@ChALkeR I see your point but I don’t think that’s actually the case. For code authors, to rely on Buffer(num) being secure would imply that they are aware of the issues with the Buffer constructors, so those are the people we don’t need to worry about as much.

The only way that will make things secure, fixing both uninitialized Buffer memory leaks and DoS issues in users' code would be to migrate everything in ecosystem to Buffer.alloc/Buffer.from, and I don't see any way around that that would work long-term.

Right, I am afraid that’s something we’ll have to accept is just not going to happen.

seishun commented 7 years ago

We can't solve security for users determined to hurt themselves. By your logic, Buffer.alloc(num) is also vulnerable to a DoS attack because if a user doesn't check the size of num a remote attacker could trigger a DoS. This is out-of-scope and not Node's job to worry about.

It seems you misunderstand the DoS issue. It happens when there is a chain of calls between the user and new Buffer(), where no one validates the input because they assume someone else will. As a result, new Buffer(num) can get called instead of new Buffer(string). This wouldn't happen with Buffer.alloc().

I think the scenarios you're describing are getting increasingly ridiculous and so impractical that I'm actually more confident now that this is the right solution after all.

We understand that you disagree with @ChALkeR, but such remarks are demeaning and antagonizing. Let's keep the conversation rational and stick to the facts.

jasnell commented 7 years ago

Let's definitely try to keep the conversation civil and constructive. At this point it seems like there's not a lot of convincing going on and just a lot of talking past one another. What we need to come together on is a path forward towards a solution. We're not going to solve the issue overnight, nor are we going to solve it in a single step.

There are three constituents to this problem that need to work in concert towards a solution: core, users and the tools ecosystem. There are thousands of modules out there that use Buffer() and new Buffer(). @addaleax is absolutely correct that we cannot simply make those modules change their code no matter what we choose to do here. Even if we simply stripped Buffer out entirely, that doesn't mean those module authors would change their code. @feross is also correct in that we cannot "solve" security for these developers. There is a certain amount of responsibility they have to take on themselves.

That said, @seishun and @ChALkeR are also correct in that these issues aren't simply going to go away and cannot simply be ignored. The tools ecosystem can and will help significantly here. The changes to standard, the linting rules, security vulnerability auditing, and education will go a long way but are certainly slow. We'll get there. @ChALkeR noted that there was a measurable decline in the uses of Buffer()-without-new when we had the deprecation warning in there. That's a positive. Noisy works but it can also be counterproductive.

What I want to do right now is come to a compromise solution that I know will not make everyone happy or solve all of the issues immediately but will hopefully move things in the right direction.

We modify Buffer(num) and new Buffer(num) to fill with a random byte value selected at startup for v8.0.0.
We introduce an optional deprecation warning that is off by default in v8.0.0 and can be enabled using a command line switch and environment variable. This warning would be emitted when the Buffer() or new Buffer() constructor is used.
We switch the deprecation warning to on-by-default in v9.0.0
We continue to work with tool creators and education providers to get the word out about using the new Buffer construction methods.
We continue to work with the ecosystem by proactively submitting pull requests to replace found uses of Buffer() and new Buffer() in the ecosystem.
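As a rough sketch of the second step, an off-by-default warning could be gated like this. The switch name --example-buffer-warnings and the variable EXAMPLE_BUFFER_WARNINGS are placeholders invented for this example, not names proposed anywhere in this thread.

```javascript
// Hypothetical opt-in check: warn only when the user asked for it through
// a command line switch or an environment variable.
function bufferWarningsEnabled(argv, env) {
  return argv.indexOf('--example-buffer-warnings') !== -1 ||
    env.EXAMPLE_BUFFER_WARNINGS === '1';
}
```

Flipping the default in a later major release would then only mean changing what this check returns when neither the switch nor the variable is present.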

Again, I know this plan does not solve the problem immediately, and I know it introduces annoying stderr output that is potentially breaking. I'm not trying to solve all the problems and I'm not trying to make everyone happy. I'm trying to find a solution that would be workable.

feross commented 7 years ago

@seishun It seems you misunderstand the DoS issue.

Thanks for explaining to me an issue that I discovered and reported. 🙄

It happens when there is a chain of calls between the user and new Buffer(), where no one validates the input because they assume someone else will. As a result, new Buffer(num) can get called instead of new Buffer(string). This wouldn't happen with Buffer.alloc().

Yes, the confusion between Buffer(num) and Buffer(string) is, in fact, the reason we created the new APIs, but it had nothing to do with DoS. You're inventing reasons ex post facto.

If you go back and read the original issue, you'll see that the rationale for the new APIs was uninitialized memory disclosure, a problem at least 100x worse than DoS.

If we can prevent uninitialized memory disclosure, which @addaleax's proposal does, then we've solved the original issue that prompted the creation of these new Buffer APIs in the first place. This would let us keep Buffer() around for a bit longer, giving the ecosystem more time to migrate.

feross commented 7 years ago

@jasnell Looks like we commented at the same time. I could get behind your proposal, though I think it still deprecates Buffer too aggressively. v9 is ~8 months away.

With the memory disclosure issue solved by random byte filling, what is the urgency to deprecate? I think we could afford to wait longer to give the ecosystem more time to migrate.

jasnell commented 7 years ago

Then perhaps this: the CTC can review the progress of migrating the ecosystem away from Buffer() and new Buffer() before cutting the 9.0.0 release in October. The CTC could decide to enable the deprecation message then or not. By 10.0.0 next year, however, the deprecation warning would definitely be switched on by default.

yoshuawuyts commented 7 years ago

@jasnell having Node core API become a moving target is an excellent way to alienate the module authors that provide value to the platform. Don't deprecate primitives. Ever.

ChALkeR commented 7 years ago

@jasnell, that differs from what I proposed by zerofilling with a random number, which I am still unsure about for the reasons stated in https://github.com/nodejs/node/issues/9531#issuecomment-285625233. Also if it lands to 8.0, what would happen to LTS? Will it not get randomfill?

Note that @addaleax expressed an opinion (in the table, if I understood it correctly) that for deprecation at any point in near future (i.e. one year won't be enough) we will get too much pressure and will have to revert.

I personally don't agree with that, though.

@feross No, the fact that you personally ignore the DoS issue doesn't make it disappear — I expect a significant number of setups to be vulnerable to DoS because of the Buffer(num)/Buffer(arg) mixup in some package deep in the dependency chains. Yes, that mixup is the issue you reported, but the fact that you didn't mention DoS there doesn't mean that it is out of scope.

Btw, I am not sure how you can be against https://github.com/nodejs/node/issues/9531#issuecomment-283295518 / https://github.com/nodejs/node/issues/9531#issuecomment-283246696 but in favor of https://github.com/nodejs/node/issues/9531#issuecomment-285772835 which is mostly the same (except for the randomfill in 8.0). Or do I misunderstand something?

As for «increasingly ridiculous and so impractical»: I did acknowledge your expressive language here (no joking here, it's also an input source), but it would help more if you cited the exact statement (preferably the exact part) that you disagree with.

seishun commented 7 years ago

Yes, the confusion between Buffer(num) and Buffer(string) is, in fact, the reason we created the new APIs, but it had nothing to do with DoS. You're inventing reasons ex post facto.

In addition to @ChALkeR's comment above, I'd like to point out that DoS was mentioned in https://github.com/ChALkeR/notes/blob/master/Lets-fix-Buffer-API.md, which was a major basis for the new APIs.

And it seems you're ignoring my and @jasnell's requests for a more civil attitude.

I think we could afford to wait longer to give the ecosystem more time to migrate.

I'd agree if there was evidence that the usage of new Buffer() in the ecosystem is going to drop on its own without runtime deprecation. So far I don't see it, linting isn't going to make people update their dependencies, and I really doubt many people would willingly enable an optional warning. I'd be happy to be proven wrong though, and I think it's something worth discussing.

isaacs commented 7 years ago

My proposal: don't deprecate new Buffer(n) or Buffer(n). Don't change it to an ES class. Turn on zero-filling by default.

Deprecation is a significant cost.

If the goal is to make the community safer, zero-filling satisfies that.

If the goal is to nudge the community towards faster APIs, zero-filling satisfies that for the ones who care, and the ones who don't care shouldn't be forced to care about it.

If the goal is to make Buffer more extensible, well, it's already plenty extensible, and I don't see how "Buffer can be subclassed" makes Node.js a better platform. In fact, it's probably a bad idea. The API should gently communicate that Buffer probably shouldn't be subclassed, and leaving it as a factory function accomplishes that.

Anyone who wants to subclass it, though, already can today with Node 7, and it's hacky enough that they're likely to avoid it.

If the goal is to warn folks off of a DoS from creating a very large buffer, it could print a warning when Buffer(n) is called with a suitably large number so that they can debug it when there are problems. It's a lot better than memory disclosure.
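A rough sketch of what "zero-fill, and warn on a suitably large number" could look like in user land. The wrapper `allocWithWarning`, the threshold value, and the warning name are all hypothetical; this is not how Node.js core implements Buffer(n).

```javascript
'use strict';

// Hypothetical threshold; the proposal above does not specify a value.
const LARGE_ALLOCATION_BYTES = 64 * 1024 * 1024;

// Hypothetical wrapper: always returns zero-filled memory and emits a
// process warning when the requested size reaches the threshold.
function allocWithWarning(size, threshold = LARGE_ALLOCATION_BYTES) {
  if (size >= threshold) {
    process.emitWarning(
      `allocating a ${size}-byte buffer`,
      'LargeBufferWarning' // hypothetical warning name
    );
  }
  // Buffer.alloc() zero-fills, so no previous memory contents are exposed.
  return Buffer.alloc(size);
}
```

The warning does not stop the allocation; it only leaves a trace that can be used to debug unexpected memory growth.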

If the goal is to make people in the ecosystem stop using new Buffer(), well... why?

ChALkeR commented 7 years ago

@jasnell, thinking about it again, https://github.com/nodejs/node/issues/9531#issuecomment-285772835 looks good to me. If we have a concrete short path to deprecation, random-fill is acceptable. I'm still not sure whether it should be backported, though; your plan doesn't include backporting. Perhaps that's for the best: that way, the impact of the «users start relying on randomfill» concern is reduced to the possible minimum, and 8.0 will be safe from uninitialized memory leaks.

It's still not negligible, though: even now, module authors often just don't see case 3 from https://github.com/nodejs/node/issues/9531#issuecomment-285625233 as an issue in the module.

trevnorris commented 7 years ago

Sorry about being absent from this discussion for the last month or so. Can someone clarify if all the "runtime deprecation" options I was shown on the "Node.js Buffer options" spreadsheet mean complete deprecation of the Buffer constructor?

I've voiced this several times in the past, but my opinion is that if any changes are to occur (outside of only zero-filling) then Buffer parameters should become the same as or a super-set of Uint8Array. The requiring of new is also important. Is this reflected in any of those options?

To reiterate, new Buffer(number) and similar should never be runtime deprecated.

If the goal is to make Buffer more extensible, well, it's already plenty extensible, and I don't see how "Buffer can be subclassed" makes Node.js a better platform. In fact, it's probably a bad idea. The API should gently communicate that Buffer probably shouldn't be subclassed, and leaving it as a factory function accomplishes that.

That's a strong opinion @isaacs, but I don't see any reasoning or support for it. In the past I've extended Buffer with additional functionality that made the API much easier to work with. The key is being able to call functions on the instance, while also being able to access its data via array index. This is difficult to replicate, and it would make things much easier if I could simply use class Foo extends Buffer.

addaleax commented 7 years ago

Can someone clarify if all the "runtime deprecation" options I was shown on the "Node.js Buffer options" spreadsheet mean complete deprecation of the Buffer constructor?

Yeah, it does.

ChALkeR commented 7 years ago

@trevnorris, note that the table has a «to the technically possible extent» sentence, which means that it shouldn't break code that doesn't use Buffer(arg) explicitly. #7152 and #11808 have a work-around for that, and everything works as far as I am informed. That was done by @seishun, I believe. Could you provide a testcase that would be broken by any of those PRs? It's better to move the discussion to the PRs, though.

Trott commented 7 years ago

Upgrading this one from ctc-review to ctc-agenda. We need to make a decision about what is and isn't going to happen in version 8.0.0.

Trott commented 7 years ago

Summing up where things are now, at least as I see them, and reading a lot into the spreadsheet @ChALkeR set up and that all CTC members were invited to fill out:

While not unanimous, there seems to be a consensus that we should do something--that ignoring the issue is not a wise option.
Deprecating in version 8.0.0 has considerable opposition and scant support.
Zero-filling has more support than opposition at this time. That said, there are almost as many people who are neutral about it as there are people that are for it. So a lot will depend on how those folks end up voting.
Scheduling a deprecation has significant support and slightly less opposition. Again, a lot of neutral folks on that one, so how that goes will depend on how the undecideds end up voting.
Opt-in deprecate is expected to easily land in version 8.0.0. It has two people on the record as being opposed. One of them is opposed to anything other than updating the docs. If I recall correctly, the other had trouble remembering why they indicated opposition and it may have been an error.
For some folks, support or opposition to zero-fill depends on whether there is a commitment to run-time deprecating (for example, announcing that run-time deprecation will happen in version N.0.0) and whether it will be backported to LTS lines. Therefore, it may make sense to try to come to a decision on whether or not to schedule a deprecation before trying to decide whether or not to zero-fill.
Random-fill seems to have less support than zero-fill but is still a viable contender. Most of the things said about zero-fill

Trott commented 7 years ago

I think this has been resolved, at least for Node.js 8.0.0. We will want to revisit this in 6 months (if not sooner!) before the Node.js 9.0.0 release.

For now, the CTC has decided that:

Node.js 8.0.0 will contain a flag allowing people to opt-in for a runtime deprecation message for Buffer constructor usage. (Will people actually use it? We'll find out.)
Node.js 8.0.0 will zero-fill buffers created with the Buffer constructor. This behavior, at least for the time being, will not be backported to earlier versions.
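For code that wants to stop depending on the constructor regardless of these decisions, a minimal migration sketch using the Buffer.alloc()/Buffer.from() APIs mentioned earlier in the thread. The variable names are illustrative only.

```javascript
'use strict';

// Zero-filled allocation: the explicit counterpart of what the Buffer
// constructor returns for a numeric argument under the decision above.
const zeroed = Buffer.alloc(8);

// Explicitly uninitialized allocation, for callers that overwrite every
// byte themselves and want to skip the fill.
const scratch = Buffer.allocUnsafe(8);
scratch.fill(0xff);

// Conversion from existing data: replaces new Buffer(string, encoding).
const text = Buffer.from('abcdef', 'utf8');
```

Unlike the constructor, each of these calls states its intent, so a number can never be mistaken for data or the other way around.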

There were no decisions that were going to please everyone. This is where things sit for now. As mentioned above, we'll surely be re-visiting this in the not-too-distant future to assess how things are working or not working.

I should also mention that there is an effort to get a rule into ESLint that will flag Buffer constructor usage. I would characterize the state of that proposal as likely to be adopted by ESLint, but not a sure thing at this time.

I'm going to close this issue, but feel free to re-open or comment if you think that's not the right thing to do at this time. Thanks!

feross commented 7 years ago

FYI, I just released standard 10.0.0 which treats usage of deprecated Node.js APIs as a lint error. So we now have thousands of users (once they update) who will see warnings about Buffer() being deprecated in their tests and in their CI pipelines.

From the 10.0.0 changelog entry:

Disallow using deprecated Node.js APIs

Ensures that code always runs without warnings on the latest versions of Node.js

Ensures that safe Buffer methods (Buffer.from(), Buffer.alloc()) are used instead of Buffer()

It's hard to know exactly how many people use standard, but our shareable eslint config is downloaded 670K times per month, so I hope this change will have some noticeable effect in the usage numbers. We'll see.

I think it would be great if we could lean on community tooling, like standard and others, to help make these kinds of deprecations less painful in the future. @ChALkeR, it would be great if you could keep an eye on usage of Buffer() to see how much it changes over the next 1-3 months.

Trott commented 7 years ago

Also on the tooling front: ESLint will be shipping with a no-buffer-constructor (that may not be the name of the rule, I'm just using that as a shorthand right now) rule in the foreseeable future. See https://github.com/eslint/eslint/issues/5614#issuecomment-291742518 and thank @not-an-aardvark and @jasnell and everyone else who got us to this point. (I don't know if the rule will have enormous impact or modest impact, but we don't know if we don't try!)

seishun commented 7 years ago

With the Node.js 9.0.0 release looming closer, I think it's time to revisit this. @ChALkeR could you evaluate how much the usage of Buffer() has changed in the last 3 months?

ChALkeR commented 7 years ago

@seishun Thanks for the reminder! I hope to do that in a few days. =)

seishun commented 7 years ago

@ChALkeR pinging once again...
