Closed addaleax closed 7 years ago
@seishun Here's the PoC:
'use strict';
const addon = require('./build/Release/addon');
class FastFoo extends Uint8Array { }
FastFoo.prototype.constructor = Foo;
Foo.prototype = FastFoo.prototype;
function Foo(arg) {
if (typeof new.target === 'function' && new.target !== Foo)
return addon.fromString(arg, new.target.prototype);
return new FastFoo(...arguments);
}
Object.setPrototypeOf(Foo, Uint8Array.prototype);
Object.setPrototypeOf(Foo.prototype, Uint8Array.prototype);
Foo.prototype.toString = function toString() {
return Buffer.from(this.buffer).toString();
};
// User's extending class.
class Bar extends Foo {
constructor() {
super(...arguments);
}
toStringHex() {
return Buffer.from(this.buffer).toString('hex');
}
}
const b = new Bar('a');
console.log(b);
console.log(b.toString());
console.log(b.toStringHex());
This is written as a native module, where fromString()
is:
void FromString(const FunctionCallbackInfo<Value>& args) {
Isolate* isolate = args.GetIsolate();
Local<Context> context = isolate->GetCurrentContext();
assert(args[0]->IsString() && "first argument must be a string");
assert(args[1]->IsObject() && "second argument must be a object");
Local<Object> buf = node::Buffer::New(isolate, args[0].As<String>()).ToLocalChecked();
char* data = node::Buffer::Data(buf);
size_t len = node::Buffer::Length(buf);
char* dup = new char[len];
strncpy(dup, data, len);
Local<ArrayBuffer> ab = ArrayBuffer::New(
isolate, dup, len, ArrayBufferCreationMode::kInternalized);
if (dup == nullptr)
ab->Neuter();
Local<Uint8Array> ui = Uint8Array::New(ab, 0, len);
ui->SetPrototype(context, args[1].As<Object>()).IsJust();
args.GetReturnValue().Set(ui);
}
I think this would allow extending from Buffer
correctly.
@ChALkeR
Buffer
had that capability. Since this can't possibly be a security issue ATM b/c it's not possible, and b/c the case is more specialized I think it would be reasonable to allow that. Though it might feel strange to allow the full set of parameters for the extended Buffer
, but not for the normal constructor. That's my only concern.@seishun add this to binding.gyp
in the same folders as those two:
{
"targets": [{
"target_name": "addon",
"sources": [ "main.cc" ]
}]
}
also wrap the native code with this:
#include <v8.h>
#include <node.h>
#include <node_buffer.h>
using namespace v8;
// place c++ code from above here
NODE_MODULE(addon, init)
Then build with:
$ node-gyp rebuild
If you're using a custom build of node then do this:
$ /path/to/node/node $(which node-gyp) rebuild --nodedir=/path/to/node
Let me know if that works for you.
@seishun mind gist'ing your code? or hit me on IRC and we'll work through it. (edit: see below)
@seishun i'm an idiot. First rename FromJust
to FromString
and then the end of the .cc
file should be:
void init(Handle<Object> exports) {
NODE_SET_METHOD(exports, "fromString", FromString);
}
NODE_MODULE(addon, init)
Sorry about that.
@trevnorris It seems it might work (although you'd have to add cases for all Uint8Array argument combinations to the C++ code), but:
@trevnorris @seishun wait… wait… wait… is this actually all that we need for proper subclassability:
class ExtensibleBuffer extends Buffer {
constructor(...args) {
super(...args);
Object.setPrototypeOf(this, new.target.prototype);
}
}
?
@seishun
- Are you 100% sure that it won't have any edge cases?
Nope. There are no apparent issues, but I'm sure the community would find something.
- Do you know what performance impact this will have compared to a regular ES6 class?
In node v6 switching new FastFoo(...arguments)
to new FastFoo(args0, args1, args2)
(I'm sure it's faster in newer V8) shows that any overhead is in the margin of error.
- I remember you mentioning that you wish to simplify the internal implementation of Buffer (and I totally share that sentiment). Adding such hacks seems to go in the opposite direction.
I have two goals. One is to make internal implementation simpler. The other is to make Buffer extendable. Since simplifying internals isn't in the short-term I figured it would be beneficial to go ahead and make it extendable so users can being to write their code expecting those simplifications to occur. This change would be future-proof of sorts.
@addaleax
wait… wait… wait… is this actually all that we need for proper subclassability:
More or less. Honestly I didn't realize this until writing the above example.
My proposal: leave Buffer alone. Gently nudge people to stop using it altogether. Add encodings for core methods such as fs.readFile() and streams: 'uint8', 'int16', 'arraybuffer' etc which provide unmodified Uint8Array etc objects.
The fact that node has a Buffer
at all is a historical accident because people needed to deal with binary data and v8 at that point did not yet have array buffer objects. Longer term, we should make all of core support native array buffers and leave Buffer
alone as a legacy API that we can gradually move away from. I don't see the point of extending Uint8Array when we already have Uint8Array. Core methods could return plain objects directly.
So, just to avoid any confusion here: All Buffer
objects are already Uint8Array
s, the only difference is an extended prototype.
Longer term, we should make all of core support native array buffers
:+1: Fwiw I’ve started to look for methods that can accept generic Uint8Array
s as input on one core module and am planning to do the same for other modules. The native bindings layer doesn’t even see any difference between Buffer
s and Uint8Array
s, so this work pretty well everywhere.
Regarding output from core methods, I am not sure adding more (pseudo-?)encodings to core methods is worth it, since converting between typed arrays is already pretty easy as it is. Also, I’m not sure what impact this kind of change would have on streams, so I will at least leave my own hands off that.
@substack AFAIK we are not currently moving forward with the decision to deprecate anything. @nodejs/ctc can I get a confirmation on that
On Mon, Dec 26, 2016, 4:39 PM James Halliday notifications@github.com wrote:
Upgrade to 7.2.1. This was reverted in node core: #9529 https://github.com/nodejs/node/pull/9529 Adding new won't get us anywhere because THAT will be deprecated in 8.0
— You are receiving this because you are on a team that was mentioned. Reply to this email directly, view it on GitHub https://github.com/nodejs/node/issues/9531#issuecomment-269243457, or mute the thread https://github.com/notifications/unsubscribe-auth/AAecVwJeMiUYKxqFp7v_SjtBixD0sYzOks5rMDQtgaJpZM4Kt-70 .
AFAIK we are not currently moving forward with the decision to deprecate anything
I don’t think we have consensus on anything right now, so, yeah. Basically.
Looking at the API in larger terms. I'd like to see us support all typed arrays, and ArrayBuffer
. One place specifically for the latter is when I want to read in a large file into memory all at once. This would be possible if I could pass an ArrayBuffer
to fs.read()
.
As for what happens to Buffer
. It has a lot of heavily optimized methods, and other goodies that make it much easier to work with data in node. So Buffer
itself won't be going anywhere. I would like to see that it require the use of new
eventually, but if that's not possible then eh. We can use the example snippet above to get proper support with class
.
A decision is going to have to be made about what (if anything) to change about new Buffer()
in 8.0.0 from a secure-programming perspective.
Given the...uh...lack of consensus, this decision is likely to have to be made by the CTC. (There's an additional question of what elements of any changes to backport to LTS lines, but that may not have to be decided by the CTC. The LTS folks may be able to sort that out once a decision is made for 8.0.0.)
This issue is long, and there are lots of external resources that supply info too. I guess it is probably good to at least list out the reasonable options (and omit ones no one wants) to at least have an idea what needs deciding.
With that in mind, I think these are basically the choices:
new Buffer()
new Buffer()
new Buffer()
Am I missing any other options?
I seem to recall @jasnell had an idea of how to do zero-fill without the perf impact that held us back from doing it previously.
I do not but I know that at one point @bnoordhuis has mentioned doing some form of initialization-on-demand where chunks of the allocated memory would be zero-filled incrementally or on read as necessary. I'm not sure if there was ever any effort put into investigating the feasibility of the idea, however.
At this point I am -1 on making any significant changes in the 7.x branch.
In 8.x, I would like to switch to zero-fill-by-default for Buffer(n)
and new Buffer(n)
. The associated performance hit would likely help encourage developers to move over to the new constructors faster.
In 9.x, I would like to see the runtime deprecation for both new Buffer()
and Buffer()
-without-new. That said, Buffer()
-without-new should continue to work.
@jasnell Can you explain why you're opposed to changes in 7.0.0 but want changes in 8.0.0? Not trying to persuade you otherwise or anything. Just curious what your reasons are. Is it just a general sense that 7.0.0 is coming along too soon? Or is there something more concrete going on?
We're only just over 3 months out from 8.0.0, 7.x is not going to be LTS, and I'd like to give more time for the ecosystem to prepare. In short, I see no reason to rush it.
Wow...long day or something... I was thinking 7.x was upcoming, but uh, yeah, obviously that's not the case. Updating my previous comment on what-the-options-are to say 8.0.0.
I support shipping zero-fill-by-default in 8.x. Ideally, we will backport zero-fill-by-default to supported LTS lines, so that new code written assuming Buffer(n)
zero-fills will run securely on older versions of Node.
Also, if we ship zero-fill-by-default, the primary security issue with the Buffer()
constructor (the information leak) is fixed, so I don't think a run-time deprecation in 9.x is necessary.
Docs-only deprecation has been sufficient to discourage new use of new code from using Buffer(n)
in my anecdotal experience. The docs are very, very clear that Buffer()
is to be avoided. And even if it does get used, we have zero-filling so no information leak is possible.
@jasnell If you support runtime-deprecating the Buffer constructor, then why do you propose to do it in the next major version after introducing zero-filling? Zero-filling should be introduced no earlier than the runtime deprecation, due to the security concerns discussed before.
Also, introducing a performance hit is not the nicest way to encourage developers to move over to the new constructors. It will likely result in one of two outcomes for a given developer:
new Buffer
. I bet they would prefer a deprecation warning that would inform them about the performance hit immediately after upgrading, coupled with the --trace-deprecation
flag that would help them find the exact lines they need to update.I hate to add to the ctc-review
pile but we need to make a decision about what is going to happen to the Buffer constructor in Node.js 8.0.0. Options appear to be:
As with the ES modules, none of the options are great so I guess we should provide a lot of time. If the modules discussions are any indicator, the multiple-mediocre-options discussions can take a lot of time to go over everything carefully.
Adding ctc-review label... @nodejs/ctc
@Trott I thought about that again, and I don't think that we should be bounded by those three options.
SlowBuffer
now (#8182), but that raises another issue — unsuspecting users of those packages and tools get those warnings, and think that something is broken, which it's most probably not. In most (>50%) cases the Buffer usage is fine, the original issue comes from the fact that it's not fine in some percentage of the cases and the huge overall usage, combining into a lot of potentially unsafe places. But each distinct case is more likely to be fine than not, and the users don't want to be spammed with those warnings.Buffer(10).fill(0)
which would make some users to rewrite that to just Buffer(10)
.So, how about this: target runtime deprecation (to the most possible extent) in 9.0 (or later), but announce that with 8.0 release, introducing a new flag in 8.0, e.g. --buffer-deprecation
that would enable that deprecation early so that users who want to see those warnings will get them and report the issues to affected packages, giving them 6 months to migrate without haste.
Thoughts?
I like the idea of an opt-in flag. That looks like the --deprectate-soon
proposition from @jasnell which is currently closed.
I don't think a flag will do any good.
Few people will know it exists, fewer still will care enough to enable it.
The crucial part here is a separate section in the 8.0.0 release notes that states that the old methods will be runtime-deprecated soon and urges to stop using those, explains the background for that, and presents the flag as the helper tool to find the code before an actual runtime deprecations emerges that starts affecting all users.
I expect some users who build upon affected libraries to be running their code with that flag and reporting issues to affected libraries. That ususally works.
It's clear that some module authors don't care about docs-only deprecation and reject PRs fixing such usage.
We don't need to cover all the cases before the runtime deprecation lands, we need to just significantly reduce the impact of such a runtime warning for unsuspecting users and give module authors time to migrate.
The crucial part here is a separate section in the 8.0.0 release notes that states that the old methods will be runtime-deprecated soon and urges to stop using those, explains the background for that
The 6.0.0 release notes already mentioned that the constructor is deprecated, and the docs give a detailed explanation why. Would it make a huge difference if the release notes warned that there is going to be a warning in 6 months, and a tool was provided to find code that is going to cause the warning? Maybe, but it seems unlikely to me. People generally don't care about what happens to their code far in the future, especially if it's just a warning.
Even if the majority of users don't care about it, I think such a flag would be beneficial. I would personally use it on my applications to spot libraries that use deprecated APIs and open issues about it.
I agree that it would be beneficial, although the relevant PR didn't get much praise.
I disagree that the runtime-deprecation should be postponed in hopes that the flag will make a difference.
I think the issues with my proposed PR were mostly around the naming of the flags. The general concept seemed to have support and I can easily revisit it with a new PR that addresses those shortcomings. That said, in the case of Buffer
, my thinking is that we should be more aggressive about it. We now have mechanisms to silence warnings if necessary (env var, --redirect-warnings
and --no-warnings
flags) for those who really do not want to be bothered by such things. This is why we have a deprecation process to begin with.
This issue continues to go around-and-around in circles.
Runtime deprecation is not a good idea for all the reasons previously discussed. We even rolled back the runtime deprecation in the middle of the v7 line because it was way too disruptive. Before trying the exact same thing again, can we discuss why we think it will be any different this time?
@ChALkeR You continue to repeat the idea that zero-filling buffers will cause new security issues, but you haven't responded to my suggestion in this comment:
I support shipping zero-fill-by-default in 8.x. Ideally, we will backport zero-fill-by-default to supported LTS lines, so that new code written assuming
Buffer(n)
zero-fills will run securely on older versions of Node.Also, if we ship zero-fill-by-default, the primary security issue with the
Buffer()
constructor (the information leak) is fixed, so I don't think a run-time deprecation in 9.x is necessary.
Adding zero-filling, and backporting it to all LTS releases causes no ecosystem disruption, and introduces no new security issues.
We even rolled back the runtime deprecation in the middle of the v7 line because it was way too disruptive. Before trying the exact same thing again, can we discuss why we think it will be any different this time?
No, it wasn't reverted because it was too disruptive. It was reverted because it wasn't sufficiently justified. This time, it will be.
@ChALkeR You continue to repeat the idea that zero-filling buffers will cause new security issues, but you haven't responded to my suggestion in this comment:
He has, though. https://github.com/nodejs/node/issues/9531#issuecomment-262003359
@seishun, @jasnell I will support runtime deprecation for 8.0, given that we won't revert that again to nowhere. But I don't think that it would be accepted by everyone, and I think that it might cause too much distruption for the end users atm (the reason why we had do revert even the buffer-without-new runtime deprecation, which is just just a part of what we should actually deprecate).
@seishun No, it wasn't reverted because it was too disruptive. It was reverted because it wasn't sufficiently justified. This time, it will be.
"Justifying" a huge breaking change won't make said breaking change any less disruptive to the ecosystem. The Web is WAY bigger than Node.js and it somehow manages to avoid making breaking changes. Node.js can and should do the same.
This thread is quite long and I am kind of lost so I have a few questions.
@mikeal ... as has been pointed out, zero fill by default carries risks for users on older versions if modules come to depend on assuming zero-filling is the default. We need to come up with a good way of nudging users to the new safer constructor APIs without (a) incurring additional risk and (b) incurring the wrath of developers due to overly noisy deprecation warnings.
For what it's worth, the next version of standard
will treat uses of the deprecated Buffer()
constructor as an error. I'm sure that maintainers of other popular style guides like Airbnb and Google could be convinced to add the same rule. This way, module developers are warned about the issue, without annoying regular users who can't even do anything to fix it.
zero fill by default carries risks for users on older versions if modules come to depend on assuming zero-filling is the default
Yes, but aren't you considering entirely breaking this API for those users anyway? Whatever behavior they might rely on here is sort of irrelevant if the alternative is breaking them anyway.
Yes, but aren't you considering entirely breaking this API for those users anyway?
No, runtime deprecation doesn't break anything. It displays a warning on first use.
who can't even do anything to fix it.
They can submit an issue or a PR or set NODE_NO_WARNINGS=1
.
No, runtime deprecation doesn't break anything. It displays a warning on first use.
Spewing unexpected text into the stderr of a Node CLI program most certainly does break things. Pretending otherwise is irresponsible.
They can submit an issue or a PR or set NODE_NO_WARNINGS=1.
If the deprecated usage is in a deeply-nested dependency, then PRs may need to also be submitted to every package in the chain of dependencies all the way to the top package.
It's common for packages to depend on a package that is one or two major versions behind. In that case, a big refactor may be required just to update to the latest dependency and make the warning go away.
And, NODE_NO_WARNINGS=1
is hardly a realistic solution for end-users.
Spewing unexpected text into the stderr of a Node CLI program most certainly does break things. Pretending otherwise is irresponsible.
Yes, it can cause breakage in certain cases (just like any other deprecation warning), but I think that's not what @mikeal referred to as "entirely breaking this API".
And, NODE_NO_WARNINGS=1 is hardly a realistic solution for end-users.
Why not? Could you clarify who you're referring to as "end-users"?
I think this is part of the problem:
"Deprecation is no big deal":
"Deprecation would be extremely disruptive":
There is a real cost that the deprecation advocates simply don't appreciate.
We all contribute to Node in different ways. I'm not implying that one type of contribution is more valuable than another. However, I would appreciate some acknowledgement of the real pain this change is likely to cause to others members of the community, instead of constant dismissals of our concerns.
To be absolutely certain: I never said that deprecation is "no big deal", so please do not characterize my position in that way. For me it's not about coming up with a good solution, it's about coming to an agreement around the least bad approach that will be effective at moving the ecosystem in the direction we need them to go without incurring additional risk.
To be absolutely certain: I never said that deprecation is "no big deal", so please do not characterize my position in that way.
Of course. The categories are probably better named "deprecation advocates" and "deprecation detractors".
A few things we need to understand in order to have a production conversation about tradeoffs.
I think the problem we're running into here is that some people think that this sort of a break is a grey area, that it effectively is not as extreme as "real" breaking changes. That is not true, and we cannot afford to pretend it is true.
Many people in this conversation understand this and may still think this break is worth it. We can have a productive conversation about the tradeoffs from that point forward.
What we can't do is continue to argue about the severity of this kind of deprecation. It's a breaking change, it breaks a bunch of stuff in the ecosystem, it will keep people from upgrading, and we need to admit that and plan appropriately if this is still something people believe is worth it.
it's about coming to an agreement around the least bad approach that will be effective at moving the ecosystem in the direction we need them to go without incurring additional risk.
Agreed.
I think that what @feross and a few others are saying is that the breaks due to runtime deprecation are actually greater than the breaks we'll see from zero-filling the API we want to deprecate. Both are a breaking change, but one is a bit more future proof and probably less destructive.
Of course. The categories are probably better named "deprecation advocates" and "deprecation detractors".
And just to be fair, everyone is advocating a breaking change to the API.
If we take the approach of zero-filling by default, then we would need to see a concerted effort on the part of the ecosystem to help protect users from the associated risks. For instance, modules that do assume zero-fill by default may want to proactively check that they are running on versions of Node.js that do zero-fill by default, and warn users accordingly when that requirement is not met. Version sniffing sucks but it would be the only way of knowing for certain that users are not being put at risk.
Aggressive linting would also be helpful. I had proposed to ESlint adding a rule around using the new construction methods but they declined given that the rule is very Node.js specific. It would be good to get some traction around that tho and I'm happy to see that standard
is moving in that direction.
Eventually I would like to see a runtime deprecation when using Buffer()
or new Buffer()
but that does not need to be any time soon so long as we see positive movement in the right direction here.
If we take the approach of zero-filling by default, then we would need to see a concerted effort on the part of the ecosystem to help protect users from the associated risks
So, we have @feross committed to adding a rule to standardjs. We can also reach out to AirBnB about adding the same to their widely adopted styleguide and tools.
We could reach out to Visual Studio Code and Atom to see if they can add similar warnings to their editor's default Node.js support.
We can also reach out to Snyk to see if they can add a warning when code and even dependencies do the same.
In today’s CTC meeting we discussed reverting the
DeprecationWarning
for callingBuffer
withoutnew
that was introduced inv7
(PR up here), and it became clear that we need to come up with a long-term plan on what exactly we want to achieve, how to do that and improve our messaging about it, both before and after our actions. I’ll try to sum up what exactly we are talking about; obviously, I am somewhat biased, having been involved in plenty of the previous discussion here. (This has still gotten pretty long btw, so I hope a lot of people will find the information in here useful enough to warrant a Wall of Text.)The
Buffer
constructor has the usability flaw that it accepts input with different type signatures, sonew Buffer('abcdef')
andnew Buffer(100)
will both return valid buffers, and in the latter case, theBuffer
will contain 100 bytes of unitialized memory. This is a security problem for two reasons:Buffer
constructor where a string is expected but a number is actually passed, uninitialized memory will be returned:Passing the value
100
here will return a slice of memory that may contain garbage, but generally can contain any value previously stored in memory, including credentials, source code, and much more. @ChALkeR has a pretty good write-up of this: https://github.com/ChALkeR/notes/blob/master/Buffer-knows-everything.mdAgain, @ChALkeR has a very-good write-up on these security issues at https://github.com/ChALkeR/notes/blob/master/Lets-fix-Buffer-API.md. It predates the current
Buffer.alloc()
/Buffer.from()
situation, but it contains a helpful FAQ with answers to questions like “Why not just makeBuffer(number)
zero-fill everything by default?”.So far, in Node v6.0.0 the safer
Buffer.alloc()
/Buffer.from()
API was introduced and later backported to the v5.10.0 and v4.5.0 releases. Additionally, v6.0.0 came with a documentation-only deprecation of the oldBuffer()
API.In June, https://github.com/nodejs/node/pull/7152 was opened, which seeks to deprecate the old
Buffer()
API using a runtime deprecation, i.e. printing a single warning per Node process whenBuffer()
ornew Buffer()
is executed for the first time. Currently, that PR is still open. A reduced version of it, https://github.com/nodejs/node/pull/8169, was landed as a semver-major change in v7.0.0, that emits and displaysDeprecationWarning
s for uses ofBuffer()
only, but excludes uses ofnew Buffer()
.I had summarized the goals and possible actions before that decision was made in https://github.com/nodejs/node/pull/7152#issuecomment-241355246 ¹; And @jasnell has written a then-current long-term plan in https://github.com/nodejs/node/pull/7152#issuecomment-240753218 that would include runtime deprecations of
new Buffer()
in v8.0.0 and later actual breaking changes to theBuffer
constructor.The reason for this distinction was trying to keep the possibility of making
Buffer
a proper ES6 class at some point in the far, far future open, which would mean that callingnew Buffer()
may always work. (Effects of turningBuffer
into a class would be proper subclassability and breakingBuffer()
withoutnew
. It is, however, completely possible to add a separate class to the API that would behave like the currentBuffer
implementation does, only with these differences.)As a result of that deprecation for
Buffer()
withoutnew
in v7.0.0, significant pushback from well-known members of the community ensued, both in the threads on https://github.com/nodejs/node/pull/7152 and https://github.com/nodejs/node/pull/8169. On the one hand, it became clear that we failed in our messaging to make clear that the primary motivation for that change was helping our users avoid serious security issues; on the other hand, the added deprecation warning seemed to be incongruent with the expectations of stability and backwards compatibility that module authors and consumers have, as far as Node core is concerned.As a result of this, the CTC decided to consider reverting the deprecation warning, possibly temporarily, and the corresponding PR is in https://github.com/nodejs/node/pull/9529. The decision on that has yet to be made, but the desire has been expressed to reach a decision soon to limit the number of v7.x versions with possibly incongruent behaviour.
From following the discussions, it is obvious that the path forward is a contentious issue; right now, the opinions range from never introducing a runtime deprecation for any version of the
Buffer
constructor, to applying one for all uses of it at the next semver-major release in v8.0.0.The strongest and most frequently expressed argument for fully runtime-deprecating the
Buffer
constructor soon remains that users may not be aware that parts of their application use an unsafe API and should be warned about that.On the other side, the warning itself is perceived as a very disruptive change to the ecosystem, suggesting that it is definitely worth exploring alternative ways to reduce the usage of both
Buffer()
andnew Buffer()
./cc @nodejs/collaborators
¹ It may or may not be obvious from the way I articulate my thoughts here – I try to stick to stating facts – but in hindsight, I regret writing it this way.