nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org
Other
105.21k stars 28.5k forks source link

Tracking Issue for Runtime Deprecation of Buffer constructor #19079

Closed Trott closed 6 years ago

Trott commented 6 years ago

TSC approved runtime-deprecation of Buffer constructor for 10.0.0.

Here's the tracking list. @nodejs/tsc: Feel free to edit this to add more items to the checklist. I'm just getting it started, really.

What have I missed?

@nodejs/community-committee @nodejs/buffer

FWIW, here's how the TSC voted:

Trott commented 6 years ago

One thing we'll need is a concise statement as to why we're doing this instead of leaving things where they are (with new Buffer() now zero-filling and a documentation deprecation for it). I have a feeling we'll see a lot of comments like this: https://twitter.com/shelleypowers/status/969568430105006080. If we can't answer them quickly and concisely, it could be bad. I'll add an item to the checklist for this.

bnb commented 6 years ago

@Trott would you like CommComm's help in crafting the statement? I think we'd be generally happy to help on messaging around this / reviewing a blog post if y'all want to write one.

shelleyp commented 6 years ago

I think providing a good, concise statement is important because you're basically going back on an implicit promise outlined in https://medium.com/@jasnell/node-js-buffer-api-changes-3c21f1048f97, as well as the API docs for the last two years.

Trott commented 6 years ago

@Trott would you like CommComm's help in crafting the statement? I think we'd be generally happy to help on messaging around this / reviewing a blog post if y'all want to write one.

@bnb More than that, I'd like someone to lay out the technical reasons for doing this. I'd kind of like to collect bullet point reasons for doing this from @ChALkeR and @seishun on the one hand, and maybe from some of the used-to-be-no-never-ever-do-this-but-have-moved-a-bit-in-the-other-direction people (@addaleax and @evanlucas would be my choices there), and kind of try to come up with a statement from all of that. Then I can run it by @nodejs/community-committee and @nodejs/tsc.

Trott commented 6 years ago

you're basically going back on an implicit promise outlined in https://medium.com/@jasnell/node-js-buffer-api-changes-3c21f1048f97

@shelleyp Part of me wants to point out that that's not (as far as I know) an official communication from Node.js but the personal blog of Some Person Who Happens To Be On The Project Technical Steering Committee. You're not wrong to treat the statements in there with weight. At the same time, it's hard for the project to be bound by statements from its individuals. The members of the TSC disagree on things all the time, as do people in the larger project.

Again, though, that issue of communication is the project's problem to solve. You're not wrong. (Maybe we can ask people to use a disclaimer similar to the ones you see that indicate someone's opinion is their own and not their employer. Seems like something that would be hard to enforce and probably ineffective to me, but I don't know what else to suggest.)

shelleyp commented 6 years ago

It's less that James Snell was a person on the TSC, but was seen as a pretty definitive authority on Node at the time. And he spoke with a great deal of finality about the future of Buffer constructor.

But regardless, Buffer constructor was such a fundamental element of earlier Node releases. Technology support groups typically hesitate or do anything they can to avoid 'breaking' backward compatibility.

Having said that, security is one impetus for breaking backward compatibility.

Still, I can see some poor soul somewhere having created a large complex Node application, who played by the rules and didn't use Buffer constructor, and all of a sudden she's hit with this warning. And she'll look at the app's module dependencies and they're fine, but it's some slub of a module buried layers deep that is perfectly safe...but uses Buffer constructor.

All of a sudden, Node seems far less stable than it once was. Especially when she starts tweeting about this blankity-blank warning that popped up that she can't do anything about.

I'm just trying to give you an outsider perspective, having been through these types of events with other technologies.

So if you all decide to do this, OK. But warn folks ahead of time, strongly highlight this change for 10.0 for those migrating, and provide an explanation, and even provide steps a person can take--either to remove the warning or to reduce the dependency on the module that generated the warning.

Node has had some issues in the last few years (more organizational than technical). It needs to present rock solid stability to the world.

My 2 cents worth

jasnell commented 6 years ago

It's less that James Snell was a person on the TSC, but was seen as a pretty definitive authority on Node at the time. And he spoke with a great deal of finality about the future of Buffer constructor.

Not so much "was". Still here.

So if you all decide to do this, OK. But warn folks ahead of time,

We have been actively doing so.

...strongly highlight this change for 10.0 for those migrating, and provide an explanation, and even provide steps a person can take--either to remove the warning or to reduce the dependency on the module that generated the warning.

As the person who is handling the 10.0.0 release, and the person who will be writing up the notable changes for that release, I intend full well to clearly highlight the new runtime deprecation for new Buffer(), and communication via other channels such as Twitter have already been used to communicate the change.

Other members of the TSC have been actively opening pull requests on ecosystem modules to update code.

bnb commented 6 years ago

@shelleyp I can't commit to anything for the TSC, but I have little doubt that the project will do everything in its power to communicate this as effectively as possible and push for stability of the ecosystem.

Judging by what's happened with prior releases, I imagine that the project will publish a release blog post for v10 and that this specific change will be front and center.

As a member of the Node.js Collection Initiative, I am happy to make myself available as a resource to the TSC help create any/all communication about this change - and any other major changes - in v10 and future releases. If you specifically have any concerns on things that need to be addressed before the release, sharing those would be extremely helpful for us so we can communicate as effectively as possible, addressing the needs of the community.

ChALkeR commented 6 years ago

@shelleyp You are describing an example of an affected person in details — I understand your concern, but to truly determine the best path forward, we need to understand how many people on average will be negatively affected by this (and how) and how many people are expected to be affected if we don't make this change (and how).

The sad thing is that the documentation-only deprecation doesn't help. The number of packages using the old unsafe API only increases over time (yes, we actually have done measurements), and overall impact on the ecosystem caused by modules using the old API also increases over time (yes, we have dome measurements of that too, in downloads/month).

We are trying hard to track down the old Buffer API usage in popular packages to minimize the negative effect that you are speaking about. I have a list of those sorted by downloads/month, a significant number from the top list are already fixed or have have PRs opened, others are still on the roadmap.

A relatively large number of those turned out to be actual or potential security problems (varying from API footguns to leaking secrets to remote unauthorized attacker in a highly popular package), and some turned out to be undefined behaviour problems, like moxystudio/node-cross-spawn#94.

If we don't deprecate — the problem would only intensify over time, given the current trend, and someday real people are going to actually loose their sensitive data.

shelleyp commented 6 years ago

As I mentioned, security is one good reason to deprecate established functionality. And you have deprecated it, but in the past, you didn't penalize the use—you documented the issues in the API and they were discussed in various articles, etc.

So I'm curious: why the decision to turn this from a documentation into a runtime deprecation? What's happening now that wasn't happening two years ago?

You note that you discovered the modules using the Buffer constructor and many uses are being corrected. But then you say the use is increasing over time.

Have you all thought that a better option might be a tool that people can run in their own system that highlights all incorrect uses of the Buffer constructor so that people can take steps to correct the implementation? Or swap out modules?

Or even identifying Node modules that are using the Buffer constructor, and maybe work with npm to note that in the documentation for the module?

Anyway, I've made my point and don't want to belabor the issue. I understand the concerns and have documented my own. Will be interesting to see what happens.

Trott commented 6 years ago

Adding to the TSC agenda so we can hopefully get some of these things moving.

thefourtheye commented 6 years ago

Have you all thought that a better option might be a tool that people can run in their own system that highlights all incorrect uses of the Buffer constructor so that people can take steps to correct the implementation?

I recently saw one of @joyeecheung's works, CodeMod for deprecated Node.js APIs. I haven't tried that myself, but I believe it works for Buffer constructor, as per this tweet.

joyeecheung commented 6 years ago

@thefourtheye The codemods are only useful to a certain extent - it does not detect runtime types, only static types, so if the user pass in a variable, we can at best look at the names and guess the types (e.g. size in new Buffer(size) should be a number so it may be transformed to Buffer.from(size), I have not implemented those though).

For code with tests, maybe a better solution is to write something that monkey-patches the Buffer constructor, collect all the traces to find the unique ones, and analyze the actual arguments passed during the tests to provide suggestions for migration. But that's probably much more work to do..

targos commented 6 years ago

We should probably advertise ESLint's no-buffer-constructor rule.

addaleax commented 6 years ago

By the way: I have been using Node with NODE_PENDING_DEPRECATION=1 in my .bashrc for over 3 weeks now, and I don’t think I have seen a single CLI application that doesn’t emit the warning with it.

So I would really appreciate if we at least excluded TTYs from a default runtime warning, as suggested in https://github.com/nodejs/node/pull/15346#issuecomment-368990001.

Also, I would like to strongly encourage all @nodejs/tsc members to also set that flag on their development machines. It should be helpful in converting code that uses deprecated mechanisms, but also just to get a feel for how overwhelming this change is for our ecosystem.

ChALkeR commented 6 years ago

@addaleax What do you think about printing those from the app itself always, and from node_modules — at n% chance, starting with 1% and increasing over time?

The message for the former could look something like this:

Buffer() constructor is deprecated for security reasons, and appears to be used directly.
See more info at https://example.com/, and see https://example.org/ for the porting guide.

The message for the latter could look something like this:

Buffer() constructor is deprecated for security reasons, but is used by a dependency.
See more info at https://example.com/

To always observe this warning, launch with `node --pending-deprecation` or with `NODE_PENDING_DEPRECATION=1` env variable.
To opt-out from seeing this and other warnings, launch with `node --no-deprecation` or with … env variable (not recommended on dev setups).

That way, developers of popular modules would be informed even at n=1 (p=1%) but end users would be affected 100 times less at first.

To inform less popular modules, n should depend either on release or on date, and I would argue that with the presence of opt-out depending on date is the less bad choice, as we don't want people to have additional reasons not to upgrade to a new Node.js version inside a single major branch.

mcollina commented 6 years ago

I like that approach.

seishun commented 6 years ago

Deprecation warnings are already emitted only once per process. Non-deterministic warnings would just add confusion.

mcollina commented 6 years ago

Have we thought about deprecating only new Buffer(num)?

jasnell commented 6 years ago

@addaleax ... Curious... Do you know if the warnings are primarily being emitted by node child processes?

jasnell commented 6 years ago

If so.. we could always make --no-warnings a default for child processes and only emit from the parent

ChALkeR commented 6 years ago

@mcollina Yes, we have discussed that multiple times afaik, and no, that won't work — when arg is user-controlled and typically is a string, Buffer(arg) still poses a problem but won't print a warning that way.

We could specifically white-list Buffer(0) (as if only 0 is passed in it almost always means that it's hard-coded literal Buffer(0)), but that's only a small percentage of usage and won't affect things much. Also, could miss dangereous code.

Also we could white-list Buffer(string, encoding), but on older Node.js versions (4.x LTS) that's still dangereous code, so that should better to be eradicated too. Also, my analysis suggests that whitelisting that won't significnatly change the amount of observed warnings, as almost all popular modules that do that also do Buffer(arg).

So, the Buffer constructor should better be deprecated entirely. The question is how to present the warning to still reach every package out there while trying to minify users disturbance.

For example, ignoring node_modules entirely (as was discussed some time earlier) won't help — many package authors don't run the tests themselves anymore, and the only way the issue in their package could be fixed is by someone noticing it from the outside. Also, developers (and users) should be notified of potentially dangereous code in their deps.

addaleax commented 6 years ago

@addaleax ... Curious... Do you know if the warnings are primarily being emitted by node child processes?

Not that I’d be aware of, and not that I’d expect. Would that make a difference? The code base is still the one of the tool, child process or not…

Also, I don’t find it too curious… we’re runtime-deprecating a very widely used API, so I’d say it’s what one would expect to happen.

jasnell commented 6 years ago

Sorry, my use of punctuation was confusing. I wasn't saying your results were curious, I was saying that I was curious if the deprecation warnings were being emitted from child processes (as in a parent launcher script launching child processes) ... in which case, if we pass --no-deprecation by default to child processes, we can potentially hide most of the noise. If that's not the typical case, however, nevermind :-)

Trott commented 6 years ago

By the way: I have been using Node with NODE_PENDING_DEPRECATION=1 in my .bashrc for over 3 weeks now, and I don’t think I have seen a single CLI application that doesn’t emit the warning with it.

That includes node-core-utils / git node land....

ChALkeR commented 6 years ago

A list of popular packages that still need PRs.

Note: I will update this every now and then!

@nodejs/collaborators

If you file (or find) a PR (or are in the process of doing so) for any of those — feel free to label that with a checkmark. Make sure to link that PR here (by placing a link to this issue, https://github.com/nodejs/node/issues/19079, somewhere in that PR description or comment).

This list is not complete, I will add new entries after looking closer on those.

Preliminary upgrade guide is at Porting-Buffer.md (@addaleax has fixes to that that I didn't include yet, but the main part is there). Filing PRs to those packages would help greately.

benjamingr commented 6 years ago

I'll post issues on these so that the package maintainers get a notification - most of these are pretty easy fixes.

ChALkeR commented 6 years ago

@benjamingr Those packages don't strictly break. They just print a deprecation warning. At max one common one per process.


Upd: @addaleax (and anyone else seeing this comment and questioning why is it here): the issue titles filed to the repos just above were incorrect initially and mentioned «breaking» instead of a warning.

benjamingr commented 6 years ago

@ChALkeR good point, I'll change the titles of these

seishun commented 6 years ago

@ChALkeR I would drop variants 2 and 3. Listing them means tacitly approving support for outdated Node.js versions, which implies tacitly approving their use. I think we should take a stronger stance against the usage of outdated and unsupported Node.js versions. Helping maintainers keep support for them might make their life easier in the short term, but it's bad for the ecosystem in the long term.

mcollina commented 6 years ago

@ChALkeR I would drop variants 2 and 3. Listing them means tacitly approving support for outdated Node.js versions, which implies tacitly approving their use. I think we should take a stronger stance against the usage of outdated and unsupported Node.js versions. Helping maintainers keep support for them might make their life easier in the short term, but it's bad for the ecosystem in the long term.

I disagree. Dropping support for old Node.js releases has a snowball effect of semver-majors along the dependency chain. This introduces less security in the ecosystem, because old versions are left behind and those massive download numbers are on old major versions. If the change is done in a new semver-major (dropping a Node line is semver-major), it would not be picked up automatically.

YuryStrozhevsky commented 6 years ago

Guys, please type a correct link to a package you are specifying and exclude any links to my ASN1.js package - its has no dependencies from Buffer at all.

benjamingr commented 6 years ago

That was my fault @YuryStrozhevsky - sorry!

(Also as a side note, that in general we prefer to not use gendered pronouns in Node - we're not all guys here)

ChALkeR commented 6 years ago

@seishun Thought about that, that is a valid point. But there is a catch — for libraries, dropping Node.js version support is a semver-major change. And semver-major changes are propagating over the dependency chains significantly slower than minor or patch ones.

So, for libraries that are willing to land this change only in a semver-major version, I would recommend to drop support for old Node.js versions in a new version, but to land a polyfill on older branches so that users don't have to perform semver-major upgrades to receive the fix.

I tried to emphase that in my text, but perhaps that needs further refinement. Care to file a PR?


Side note: GitHub reactions are poorly designed.

_@seishun, I disagree with your opinion there (so I would label that as :heavy_minus_sign: if that was available), but I appreciate the concern and I would like to see more engagement like that (so I would label that as :+1: if that was distinguishable from :heavy_plussign: here on GitHub).

Sorry, that was me whining again about those.

ChALkeR commented 6 years ago

@benjamingr @nodejs/collaborators Also, to my experience, PRs deal with this kind of issues a lot faster, so if anyone is willing to fix things in any of those packages — please do that. I filed a bunch of those already, you can see examples linked above to this issue. Will continue doing that, but any help would be appreciated.

mafintosh commented 6 years ago

Tiny rant: I understand where this is coming from but this is causing me a lot of work to be honest. I have lots of modules that uses the Buffer constructor with significant usage not on the list (https://www.npmjs.com/package/csv-parser for example). I would much prefer to be spending my time doing something else, instead of having to clone 600+ repos (and those are just my own) and checking the source code of all of them.

mafintosh commented 6 years ago

More constructive feedback from a highly invested npm author:

  1. Where is the tools I can run to find which modules I have that needs updating?
  2. Reading the guide it recommends dropping support (major bump). That'll get loads of modules stuck in forever deprecation mode (including many of my own where I'm still on an old major) since userland uses ^ a bunch. More work for me.
  3. Node 10 is going out in April? So I have a ~month to get up to date with this while balancing work/oss/freetime/conferences (Easter is coming up).
  4. What do I do about modules that are using an old major of say tar-stream? Now I need to spend work getting those up to date using the latest tar-stream major to avoid node core deprecation warnings.

(please read the above questions is the most constructive tone possible :))

jkrems commented 6 years ago

@mafintosh I definitely share your frustration. In our own internal code we resorted to ugly workarounds like the below:

function legacyAllocBuffer(size) {
  return new Buffer(size);
}
const allocBuffer =
  typeof Buffer.alloc === 'function' ? Buffer.alloc : legacyAllocBuffer;

I do wish it would be easier to cleanly handle the migration in libraries. And personally I'm not convinced that the deprecation is worth the ecosystem pain.

ChALkeR commented 6 years ago

@mafintosh

Where is the tools I can run to find which modules I have that needs updating?

Perhaps I will be able to share a list of some sort a bit later. In https://github.com/nodejs/node/issues/19079#issuecomment-374536006 above there is a small excerpt of some of the most popular modules from that list where I have not opened PRs yet.

What do I do about modules that are using an old major of say tar-stream? Now I need to spend work getting those up to date using the latest tar-stream major to avoid node core deprecation warnings.

I have a deps database that will give me specific popular modules that need have their dependencies updated. It would require some more work and engagement from me, but I have the core part done.

jasnell commented 6 years ago

This entire issue has been going on for several years now without a clear resolution. The one thing that's painfully obvious is that runtime deprecation and CLI tools just simply do not get along and we need a different strategy. We have mechanisms such as --no-deprecation, --no-warnings, and --redirect-warnings={file} that would allow the deprecations to be hidden but CLI tool producers are not using them for whatever reason, leaving a massive burden on module developers to either update their code, or a massive burden on Node.js core to maintain legacy code that is known to be unsafe.

We need to step back and take a different approach here. Rather than targeting module developers, perhaps we need to be focusing more on CLI tool developers. Encourage them to make use of the mechanisms we have already provided for reducing the noise of deprecation warnings when not running within a dev environment.

ChALkeR commented 6 years ago

Where is the tools I can run to find which modules I have that needs updating?

I am going to prepare lists of packages per-owner that are using the deprecated Buffer API, with exact lines cited.

I will store that privately, but anyone interested would be able to receive a list of grep over all their packages from me (or other members who would be willing to share that).


Again (to anyone coming here by the direct link to this comment): the porting guide is temporary hosted over there.

benjamingr commented 6 years ago

Thanks for your feedback @mafintosh! Just to solicit some more - if you were given an absolute choice given the constraints (we want to deprecate new Buffer) and that the buffer constructor has been deprecated for a while how would you proceed?

mafintosh commented 6 years ago

@benjamingr the main issue for me given the constraints is the amount of work and time i have to put into this given the release timeline for Node 10. I don't have resources to find all the modules I need to update (@ChALkeR has been very helpful here though so let's see), and much less update the non trivial ones, where I rely on an older version of some package that used the Buffer constructor. I'm privileged that I get to spend contracting time updating some of the more popular ones but have close to no free time to spend updating other deps where there is no security issues present.

The frustration comes from getting years of free and hard oss work deprecated with a months notice (going from docs deprecation to runtime). I don't have any good solution tbh.

mafintosh commented 6 years ago

Actually what I would have done would have been to runtime deprecate the constructor in any code not in node_modules. Then users writing new code would immediately be warned not to do it. Heck I'd even make it throw.

lpinca commented 6 years ago

The frustration comes from getting years of free and hard oss work deprecated with a months notice (going from docs deprecation to runtime).

I totally understand the frustration but this is not entirely correct. #7152 was opened almost 2 years ago and it has been discussed ad nauseam.

ChALkeR commented 6 years ago

@mafintosh Deprecating with the exception for node_modules could have been an intermediate solution some time earlier, but now it would mean just delaying most of the ecosystem fixes by another half a year at least, as most packages won't see that notice ever until their users complain. Also, users should be informed of dangereous code present in their dependency chain.

mafintosh commented 6 years ago

@lpinca that pr was closed a year ago? I heard about this being implemented in this issue.

lpinca commented 6 years ago

It was reopened as https://github.com/nodejs/node/pull/15346.

mafintosh commented 6 years ago

I don't watch node core. If I had to keep up with node core I wouldn't have time to write the 650+ modules that puts me in trouble now 😁.

ChALkeR commented 6 years ago

I have prepared lists of deprecated Buffer API usage per-author, with exact lines in npm packages sorted by their downloads count.

Those are hosted in a private repo for reasons, anyone who wants theirs — ping me =). Preferrable way of sharing those for me is over Gitter — easy to verify who is who that way (and won't pollute my inbox).

@nodejs/tsc also has access (if I go missing).