anonrig commented 1 year ago

History

Starting from November 2019, we migrated core modules to primordials
- Ref: https://github.com/nodejs/node/issues/30697
On February 2022, TSC took a vote and came to a conclusion to Use primordials in the error path only if there are no significant perf regression.
- Ref: https://github.com/nodejs/TSC/blob/main/votes/2022-02-21-0.json

Proposal

Due to the performance and several other concerns I propose removing all primordials from the Node.js project. Remember that this is a living document and I'll try to update the following list upon feedback from the community and the TSC.

Pros

It ensures that Node.js internals are not tampered or acceptable to prototype pollutions.

Cons

Any prototype pollution in Node.js core is not accepted as a vulnerability in our security/bug bounty program.
- We use primordials with the purpose of security, but does not include it in our bounty program, which contradicts the notion of security in Node.js
It creates a friction for optimizing performance, and requires the contributor to benchmark the code with and without the primordials, causing in reality to regress over the performance.
- One solution for this is to have a performance regression catcher like the benchmarking.nodejs.org but IMHO that's fighting fire with fire.
V8 doesn't always optimize the primordials efficiently and causes performance regressions, and what Turbofan optimizes are subject to change. This requires constant benchmarking for each V8 release.
- We have a section that mentions known performance issues but we never follow them. For example, if you search for ArrayPrototypePush in Node.js core, you'll find a lot of examples and usages of primordials with known peformance issues. Reference to the documentation: https://github.com/nodejs/node/blob/HEAD/doc/contributing/primordials.md#primordials-with-known-performance-issues.
- There are several issues around primordials in Node.js performance repository. Reference: https://github.com/nodejs/performance/issues/70
It creates a higher bar for contribution to Node.js. The primordials.md document is 775 lines on top of our existing documentation, and we don't follow them even though it has explicit examples of what NOT to use - https://github.com/nodejs/node/blob/main/doc/contributing/primordials.md
- The issue of disallowing certain primordials can be solved with linter rules, but IMHO that's fighting fire with fire.
There is an existing proposal in TC39 called proposal-symbol-proto for mitigating prototype pollution.
- Ref: https://github.com/tc39/proposal-symbol-proto

Suggestion

I recommend the removal of primordials from the Node.js project. I strongly believe that it causes more harm then the intended behavior effecting the future of the project causing performance regressions and friction for new contributors.

Appreciate, if you can leave a 👍 or 👎 to this issue.

cc @nodejs/tsc @gireeshpunathil @mcollina @ljharb @BridgeAR

RafaelGSS commented 1 year ago

Can we have a draft to see exactly how much performance improvement are we talking about?

aduh95 commented 1 year ago

I don't think this proposal is reasonable, primordials are a useful tool for reliability of the software, and most of the time have no impact on performance. Also as you mentioned, the TSC has already voted for the use of primordials in the error path, so your proposal should be compatible with that vote.

Any prototype pollution in Node.js core is not accepted as a vulnerability in our security/bug bounty program.

That's not a con, it's just a fact.

We use primordials with the purpose of security, but does not include it in our bounty program, which contradicts the notion of security in Node.js

We do not use primordials with the purpose of security.

We have a section that mentions known performance issues but we never follow them.

It's really unfair to state "we never follow them", in reality those changes tend to be watched carefully and not land if perf regression is shown.

For example, if you search for ArrayPrototypePush in Node.js core, you'll find a lot of examples and usages of primordials with known peformance issues.

Well what does it tell you? Not every part of lib/ is on a hot path, sometimes reliability is more important than performance.

The issue of disallowing certain primordials can be solved with linter rules, but IMHO that's fighting fire with fire.

It's already in place, we have the avoid-prototype-pollution rule that forbid the use of some primordials.

There is an existing proposal in TC39 called proposal-symbol-proto for mitigating prototype pollution.

Ref: https://github.com/tc39/proposal-symbol-proto

That's not a con either.

benjamingr commented 1 year ago

I feel strongly that we should keep primordials when it is required for web standards (for example EventTarget should be reasonably tamper proof because not doing so would violate spec semantics).

I also feel strongly we should keep primordials in "critical" error paths that must never throw.

I am +1 on removing primordials everywhere else, multiple people mentioned they are a burden to contribution and they make the code less readable, debuggable and fun to work on, even if performance was not an issue. They create friction in PRs, cause CI reruns and often needless churn. To be clear - it's not that they're not valuable it's that their value is greatly outweighed by their burden IMO.

It's important to echo @aduh95 's point - primordials are about reliability and are not a security mechanism, it does not provide tamper proofing nor does it sandbox anything.

@aduh95 the fact a TSC vote happened doesn't mean the TSC can't change its mind? Personally I'm a lot less pro-primordials than I was 2 years ago given feedback people several people gave me about how it impacts their contribution experience. Also the TSC itself changes in composition.

ljharb commented 1 year ago

A JS platform that falls apart when someone does a normal JavaScript action would be an embarrassment.

Primordials may not be the best or only way to solve this problem, but simply not solving it is something I would hope is deemed wildly unacceptable.

aduh95 commented 1 year ago

@aduh95 the fact a TSC vote happened doesn't mean the TSC can't change its mind? Personally I'm a lot less pro-primordials than I was 2 years ago given feedback people several people gave me about how it impacts their contribution experience. Also the TSC itself changes in composition.

I'm not sure, this has never happened before AFAICT. I still think Yagiz should try to draft a proposal that's compatible with it, or at least that does not overlap with it, so the discussion can be more productive. Depending on the issue of that proposal, you could consider overriding the previous vote, but now's not the time IMO.

ronag commented 1 year ago

A JS platform that falls apart when someone does a normal JavaScript action would be an embarrassment.

This is more of an opinion than a statement of fact. IMHO even though there are clear benefits, the cons outweigh the pros.

@anonrig I don't think the current consensus of only using primordial for the error path is too bad.

benjamingr commented 1 year ago

I'm not sure, this has never happened before AFAICT

I think trying things and learning from them is a strength, the TSC was super opposed to promises at one point for example to the point people were actively mocking me for advocating them (maybe it was the CTC? it was a long time ago when that was still a thing :)). I change my mind around a lot of things personally though admittedly most of those things weren't a TSC vote.

I think it would be productive to focus on arguments for and against usage of primordials (generally and in specific cases) though to be clear if my PoV doesn't reach consensus (primordials in critical error paths and web standards and not elsewhere) I'd be fine with that.

A JS platform that falls apart when someone does a normal JavaScript action would be an embarrassment.

I don't think that's a productive statement especially given none of the userland packages use them. So I don't think we're giving a meaningful guarantee with them anyway. Additionally I don't like the language "would be an embarrassment" since "to whom?" and also it assumes other peoples' feelings about stuff.

Primordials may not be the best or only way to solve this problem, but simply not solving it is something I would hope is deemed wildly unacceptable.

The issue isn't "primordials solve nothing so let's remove them" it's "the new problems they create in our code and the friction they add are maybe not worth the thing they're solving for us".

ronag commented 1 year ago

@benjamingr +1 on that

ljharb commented 1 year ago

@benjamingr all of my userland packages do, in fact, use them. I apologize for the choice of language; how better could i indicate how serious I think this is?

benjamingr commented 1 year ago

@ljharb

all of my userland packages do, in fact, use them.

First of all - 😮 😅 Second of all - how? What if someone changes a builtin prototype before your packages' code is loaded? Third of all - I suspect that for most popular packages that is pretty rare.

how better could i indicate how serious I think this is?

"Hey, I feel strongly that we should keep using primordials. They are beneficial because they prevent issues like A,B or C which are important to users and people have run into (e.g here are some issues people opened complaining about it D, E or F). I think the project would lose from removing them."

Basically just objective unambiguous facts about why you think your opinion is correct would have the best chances of impacting the peoples' (and the TSC's) opinion, ideally with numbers.

ljharb commented 1 year ago

What if someone changes a builtin prototype before your packages' code is loaded?

the nature of javascript is such that you can't defend against that, but, if your code is first-run, you can cache everything you need.

how?

I use https://npmjs.com/call-bind and https://npmjs.com/get-intrinsic to achieve it.

pretty rare

I agree! but that doesn't mean that node shouldn't be robust against that. Lots of use cases for node have zero dependencies - especially with the test runner, where deleting builtins is actually a more common thing to simulate different environments and test an implementation against it, and robustness in that case is important

"…"

Thanks, I appreciate that feedback and will try to incorporate it.

aduh95 commented 1 year ago

@benjamingr it's all fine if TSC members change their mind, I do that all the time too, highly recommend it, past Antoine was a complete idiot most of the time. Still the TSC decision needs to be somewhat final otherwise we cannot move forward. I don't know if the TSC charter says anything about it, but anyway, it's going to create off-topic discussions to try to override an existing vote, I think we can discuss primordials while accepting the outcome of the previous vote.

My personal opinion on primordials is that it provides more upside than downsides; I don't know if it's because I lack empathy or something, but I don't think it creates a lot of friction in PRs, and it doesn't seem to me that it's a burden to contribution. If anything, I think it makes the code more fun and debuggable, but I've learned that I might be in the minority here.

benjamingr commented 1 year ago

@ljharb

the nature of javascript is such that you can't defend against that, but, if your code is first-run, you can cache everything you need.

Yes but if you load "Package A" that changes a prototype then "Package B" that comes following it will get a polluted prototype. This can actually be useful (e.g. for instrumentation) and it would make certain use cases impossible (e.g. people writing instrumentation "deeply" hooking a server library).

I do think it would be very helpful to be able to provide this guarantee (though for me it's only meaningful if it's a guarantee for actual user code which is likely to use many dependencies). It would be ideal for the project to promote a proposal to address this at the ECMAScript level (which is where I believe this naturally belongs) rather than attempt to ship "semi correct" approaches that provide very fragile guarantees.

@aduh95

@benjamingr it's all fine if TSC members change their mind, I do that all the time too, highly recommend it, past Antoine was a complete idiot most of the time. Still the TSC decision needs to be somewhat final otherwise we cannot move forward. I don't know if the TSC charter says anything about it, but anyway, it's going to create off-topic discussions to try to override an existing vote, I think we can discuss primordials while accepting the outcome of the previous vote.

The TSC vote about primordials was a year and a half ago, quite a respectable amount of time in software I think? I'm personally in favor of keeping primordials in error paths (and was not part of that vote) but I think it's fine to raise the issue again and see if the opinion of folks changed.

My personal opinion on primordials is that it provides more upside than downsides; I don't know if it's because I lack empathy or something, but I don't think it creates a lot of friction in PRs, and it doesn't seem to me that it's a burden to contribution. If anything, I think it makes the code more fun and debuggable, but I've learned that I might be in the minority here.

I acknowledge that for some people (like yourself and probably others) they don't create a burden to contribution. I don't understand how they make stuff more fun or debuggable though :D

I've had several contributors complain about it to me in the past. I personally find them a bit tedious (a bit more so recently) but I've reviewed, approved and helped land a bunch of primordials changes before that perspective shift. Even before that it was "something I put up with to contribute" like GYP, different/weird debugging/testing flows etc.

ljharb commented 1 year ago

@benjamingr yes, that's why the first thing you load in an application, to be safe, should be the code that sets up the environment - shims/polyfills, things that can be cached like get-intrinsics, etc.

I 100% believe this should be handled in TC39 itself, but I don't think node should give up this important robustness property in the meantime.

JakobJingleheimer commented 1 year ago

I would love to not need primordials (not least because they're confounding to new contributors—if Antoine wasn't here, we'd be kinda screwed). However, they provide a very important function that I think we can't just decide to stop caring about. If we had some way to not need them, I'm all ears!

GeoffreyBooth commented 1 year ago

I don’t think the current consensus of only using primordial for the error path is too bad.

This is the rule we’re supposed to be following? I thought I was supposed to use primordials everywhere, for any method for which we had a primordial available. You’re telling me I could have limited primordials to just error paths this whole time?

mcollina commented 1 year ago

Any prototype pollution in Node.js core is not accepted as a vulnerability in our security/bug bounty program.

This is incorrect. Any prototype pollution is accepted as a vulnerability by our threat model, i.e. causing a modification of a prototype as a side effect of calling a function in Node.js core. However, internals that are not hardened against a prototype pollution that already happened are not vulnerabilities, because they are unexploitable. Hopefully this distinction is clear.

isaacs commented 1 year ago

Primordials are a "least bad" solution, and even then, only because there are so few solutions available for the problem they're intended to solve.

Reading through this, and some of the other discussions around primordials, and having done some work to in the efforts to get minipass and glob into core, I think a few things should be uncontroversial. (If you disagree, or think I'm overstating something or making an unfair generalization, I'm curious about it! That's not my intent.)

Primordials present a significant barrier to contribution for many JavaScript authors, even some who are very experienced, skilled, and could provide value to the Node project.
It would usually be bad if userland programs could modify intrinsic objects and cause the Node platform to behave incorrectly. (There are some cases where it's surprising that modifying an intrinsic used by Node isn't reflected in the platform behavior, but on balance, we'd all usually prefer "you can't change it" over "you can change it by accident".)
Primordials are not a security feature per se, but they do provide some protection against some classes of security issues, so the security benefit isn't nothing.
The documentation around how, why, and when to use them, and the situations where those rules can be bent or broken, are subtle and few people fully understand them (especially new contributors). Eg, when it says they shouldn't be used on "hot paths", or that sometimes reliability is more important than performance, ok, how is that determined? How hot is "hot"?

I don't know of any alternative way to achieve the goals, but there are things that could maybe be explored to reduce the costs, which I've heard suggested before:

Develop a transpiler that turns "normal" JS into primordial-ized JS with a build tool. No small task, but something that could be done entirely on node's side.
Work with the V8 team (or even TC39) to develop an API or language feature to defend against intrinsic object mutation. Many more stakeholders, unbounded time to delivery.
Increase coverage of benchmark testing, establish objective documented criteria for exactly how hot-path something must be to justify using primordials, and enforce this with CI tooling, so at least primordial usage is clear and minimized as much as possible.
Accept that if someone breaks their platform, then their platform will be broken, and just stop using primordials. This doesn't provide the value that primordials provides, but omg, so cheap and easy, zero maintenance at all.

Primordials are a low-value local-maximum solution; just good enough to keep better solutions from being explored, but not actually very good. None of the better solutions are particularly easy or satisfying, either, so the moat around it is deep.

ljharb commented 1 year ago

I'd also like to note that I have been pursuing multiple proposals, over years, and with years to come, with the goal of making primordial-esque safety much more ergonomic for JS developers - and I've used (and will be using) node's primordials (and their UX difficulties) as a big part of the motivation.

In other words, I don't think this primordials approach is going to be necessary in node forever, but I think it should remain until there's a better option.

aduh95 commented 1 year ago

@isaacs I mostly agree with your summary and think it represents the opinion of the majority fairly. If I may, I'd like to note a few things regarding your suggestions:

Develop a transpiler that turns "normal" JS into primordial-ized JS with a build tool. No small task, but something that could be done entirely on node's side.

I don't think it's achievable, otherwise we would have done it. It might possible with some stricter variant of TypeScript, but inferring from "normal" JS what primordials to use is not possible.

Work with the V8 team (or even TC39) to develop an API or language feature to defend against intrinsic object mutation. Many more stakeholders, unbounded time to delivery.

Regarding TC39, if you read through https://github.com/tc39/proposal-call-this/issues/8 it seems they won't be interesting in offering new syntax for a problem that's deemed too niche. Reaching out to V8 to improve the performance of primordials sounds good, if there's an issue open on their issue tracker; I don't know if they would be interested in developing and maintain an API if they don't have use for it – it would be pretty cool to have though.

Increase coverage of benchmark testing, establish objective documented criteria for exactly how hot-path something must be to justify using primordials, and enforce this with CI tooling, so at least primordial usage is clear and minimized as much as possible.

It seems a bit contradictory: either we want primordials usage maximized as much as possible, and we use benchmark CIs to detect when it's not a good idea; either we want it minimized as much as possible, and benchmarks are not very important. But in any case, I'm all for clarification.

GeoffreyBooth commented 1 year ago

It seems a bit contradictory: either we want primordials usage maximized as much as possible, and we use benchmark CIs to detect when it’s not a good idea; either we want it minimized as much as possible, and benchmarks are not very important.

Either primordials are necessary for security or they’re not. If there are certain parts of Node where we’re like, “nah, don’t bother with primordials here because this code is too important,” then why use them anywhere? Only some parts of Node are worth protecting from intrusion, and not others?

anonrig commented 1 year ago

I agree with @GeoffreyBooth 100%.

ljharb commented 1 year ago

I totally agree that all parts should be covered by primordials.

Qard commented 1 year ago

It's worth considering that the cost of primordials is uneven.

A lot of the perf impact is in the prototype methods that get wrapped in a closure so it can look like a regular function that shifts it's "this" value into the first parameter position. Would the perf be better if prototype primordials instead just were the original function and had to be called with ReflectApply in-place?

The other one is the Safe* types--do we know the perf difference between building those versus just using the primordial functions on the plain types? I suppose with those types we want the objects to remain tamper proof after handing them to user code? Can we achieve that correctly by freezing the objects we expose instead?

Linkgoron commented 1 year ago

Either primordials are necessary for security or they’re not. If there are certain parts of Node where we’re like, “nah, don’t bother with primordials here because this code is too important,” then why use them anywhere? Only some parts of Node are worth protecting from intrusion, and not others?

Yes, of course some parts are more critical or sensitive than others. You might want to protect something like EventTarget, or process.exit or default constructors, but other performance-critical parts would be better off being more performant or readable instead. The project trusts the expertise of the team members to weigh the pros/cons in a lot of situations. Trade-off are used in a lot of parts of code (in general), and primordials (which I very much dislike. Their pros are IMO overblown and cons downplayed) are no different.

ronag commented 1 year ago

Just my two cents. While both performance and avoiding prototype pollution are important. I think contributor accessibility and ergonomics are very important. People work on Node mostly for free because they find it fun and interesting. If we don't value that we will turn away developers and stagnate. That being said, I'm sure that there are developers working on Node because they think primordials is an important and interesting topic. But I'm not sure that's where the majority lies.

While we haven't yet done a post-mortem on what motivated Bun over working on improving Node, I personally believe that If we don't keep Node interesting and accessible for contributors we will end up losing valuable resources and work to alternatives like Bun, Deno etc...

mcollina commented 1 year ago

Summarizing:

Some primordials (as we have implemented them), cause a significant addition in overhead, making a 100% coverage too costly in terms of performance.
Even a 100% coverage hardening would not allow us to give more guarantees inside our threat model, because the user code (including dependencies) could still be vulnerable.
Very few collaborators like using primordials.

RaisinTen commented 1 year ago

Practically, I think it is unlikely that we will be able to remove all primordials. It might be better if we solve this problem on a case-by-case basis because there are places where it makes sense to use primordials and there are places where it doesn't.

If we find a part of the code that is slow because of primordials, I would suggest removing uses of primordials from there and present the benchmark results. If the reviewers are convinced that the performance improvement is worth more than the robustness provided by the primordials, the change is going to land.

I agree that the usage of primordials does make it hard for contributors but I don't think it's something that would prevent folks from contributing. Anyone who is not familiar with primordials can submit a PR that doesn't use primordials at all and collaborators can guide them to use primordials appropriately.

targos commented 1 year ago

I agree that the usage of primordials does make it hard for contributors but I don't think it's something that would prevent folks from contributing. Anyone who is not familiar with primordials can submit a PR that doesn't use primordials at all and collaborators can guide them to use primordials appropriately.

+1 to that. From my PoV, our C++ code is much more difficult to grasp for new contributors (and for me as well), and I would never suggest to remove it.

gireeshpunathil commented 1 year ago

should we take a step back and assert:

is primordial for security or not
if not, what it is for
if it is for security, do we prioritise it over new contributor entry barrier
if it is for something else, do we prioritise it over entry barrier

Linkgoron commented 1 year ago

I agree that the usage of primordials does make it hard for contributors but I don't think it's something that would prevent folks from contributing. Anyone who is not familiar with primordials can submit a PR that doesn't use primordials at all and collaborators can guide them to use primordials appropriately.

+1 to that. From my PoV, our C++ code is much more difficult to grasp for new contributors (and for me as well), and I would never suggest to remove it.

I don't think that primordials are hard to grasp, as much as they make writing code very different than what you would normally write for the same functionality. The code just doesn't really look like JavaScript at all, and is much more procedural. In the following example(s), we can also see that primordials don't replace "everything" (e.g. isDirectory), and IMO it's also unclear when something is "just" a method and when it is a "primordial" (e.g. gracefulreaddir). Syntax highlighting is, arguably, better without them as well. I will say, though, that static methods are not as bad as class/prototype methods.

example 1

example 2

Practically, I think it is unlikely that we will be able to remove all primordials. It might be better if we solve this problem on a case-by-case basis because there are places where it makes sense to use primordials and there are places where it doesn't.

I think that this is the common consensus, at least.

I agree that the usage of primordials does make it hard for contributors but I don't think it's something that would prevent folks from contributing. Anyone who is not familiar with primordials can submit a PR that doesn't use primordials at all and collaborators can guide them to use primordials appropriately.

IMO the code looks much less familiar as a new person that reads the code and it's harder to reason about the code that you're reading. In addition I think that back and forth churn from someone thinking that they "solved" the issue, and then having to put in primordials in places would be a major block for a lot of follow-up contributions. This is, however, just a belief and I don't have any statistics to actually support this.

targos commented 1 year ago

In addition I think that back and forth churn from someone thinking that they "solved" the issue, and then having to put in primordials in places would be a major block for a lot of follow-up contributions. This is, however, just a belief and I don't have any statistics to actually support this.

To help with this, I would be in favor of stopping to ask contributors to refactor their code with primordials (except when the linter flags them, which I think is only for constructors/static methods at the moment). This can always be done later by collaborators.

ronag commented 1 year ago

Anyone who is not familiar with primordials can submit a PR that doesn't use primordials at all and collaborators can guide them to use primordials appropriately.

Existing code with primordial is difficult to read and understand.
Some people are not interested in putting in the effort to write primordial-based code even if they are guided.

It's not that nobody can work with primordials. It's that some don't want to.

I consider myself in this category. It makes the code unreadable and feels more like what I do at work than something fun and relaxing to do in my spare time.

joyeecheung commented 1 year ago

To provide some context, https://github.com/nodejs/node/issues/18795 was where it all began, and apparently we thought doing something like primordials was cool at the time and did not anticipate the effect it will have on the code base (I say this as the person who opened the pandora box and created the first primordials in the code base in https://github.com/nodejs/node/pull/25816... yes one day I was tired enough about primordials and I wanted to see who added it and...oh, it was me 👻 To be fair I think I was just not too fond of the repeated code throughout the code base to defend against user-land tampering and wanted to at least create a central lib for all the similar utils, but surely I wasn't hoping to see this code base to turn out like this when I did it).

IMO the performance concern is less of an issue, as mentioned in https://github.com/nodejs/node/issues/18795 even back in 2018 Turbofan can handle the kind of currying that primordials do well enough. I also inspected some generated graphs out of curiousity in the past and was (pleasantly) surprised by how well Turbofan optimizes many of the use cases of primordials. Given how Turbofan works of course it is possible that it may not optimize certain type of input well enough, but I don't think it's fair to label primordials as slower all together, it is more like a case-by-case thing from my previous research, and as usual I think performance claims should be backed up by benchmark numbers (preferably through not-so-micro-benchmarks, I remember in my research I had to outsmart Turbofan and make it actually generate the graph I wanted to inspect instead of just treating that minimal use of primordials as dead code ;))

My concern about primordials is mostly about readability. Like all code bases in the world Node.js has its own dialect of the languages it uses based on internal constructs and coding styles. I work on both the C++ land and JS land in Node.js and having done this for years I think I have at least got used to Node.js's C++ and JS dialects. I would say for someone who writes C++ regularly the C++ dialect of Node.js isn't actually that difficult to grasp after working with it for a while. But for somehow who writes JS regularly the JS dialect of Node.js is still quite difficult to digest - not impossible but it always seem very unnatural. Surely we don't use STL in Node.js's C++ all the time but when we don't, we use typed constructs that is very different from the STL and you can easily keep the them separate in your brain. In the JS land however the primordials are based on the actual standard library but then also have a very different style, and it is this weird place in the middle that makes them difficult for my brain to digest at least.

In terms of developer experience I would be happy to see them gone, or at least not enforced as heavily as we do now. Or as @targos suggested, we could stop forcing contributors to use them everywhere. I do not view primordials as a proper security mechanism, I see them as a way to enhance user experience (i.e. you don't want to delete Array.prototype.something and then blow up the REPL). Sure, not having 100% coverage means Node.js may still break under certain input, but as all user experience improvements it can always just be a best-effort thing.

And I'll re-iterate my opinion back in 2021 https://github.com/nodejs/TSC/issues/1104#issuecomment-996358282, I think a better alternative is to use C++ more internally. There are many factors affecting the barrier to contribute and the choice of language is just one of them but not all. And the argument of using JS lowering the contribution barrier is being compromised by the use of primordials as they make code difficult to read even for experienced JS developers. Then why not just try to do more (at least computations) in C++, where we just get robustness against user-land tampering + smaller binary size + hidden unwanted internals in stack traces + (sometimes) performance improvements + type checking, for free? I think all those combined already outweighs the barrier to get C++ contributions. I personally don't find getting JS reviews any easier than it is to get C++ reviews when I touch some niche part of the code base anyway.

ronag commented 1 year ago

Practically, I think it is unlikely that we will be able to remove all primordials. It might be better if we solve this problem on a case-by-case basis because there are places where it makes sense to use primordials and there are places where it doesn't.

I mean practically it's not that difficult to remove primordials unless I'm missing something. I do agree that practically it's unlikely we will get a consensus to do it. IMO. This it more of a political issue than a practical coding issue.

JakobJingleheimer commented 1 year ago

Anyone who is not familiar with primordials can submit a PR that doesn't use primordials at all and collaborators can guide them to use primordials appropriately.

For sure. But from the contributor's perspective (or at least my own experience in days of yore and what I've sensed from the couple recent new esm contribs), this is ruul annoying both because it's the unexpected hill at the end of a race and because it adds a pretty non-trivial cognitive load of gotchas as well as creates duplexity of day-to-day coding vs arcane node coding (can't use for…of, etc). The seemingly nebulous (to newcomers and even some veterans) criteria for when and how primordials et al are needed (and even just what is available) imposes some impostor syndrome, which exacerbates how intimidating contributing to a huge project like node.js is for newcomers.

I would argue that the poor DX primordials imposes is a more significant concern than performance.

targos commented 1 year ago

Then why not just try to do more (at least computations) in C++, where we just get robustness against user-land tampering + smaller binary size + hidden unwanted internals in stack traces + (sometimes) performance improvements + type checking, for free?

To me, the big advantage of internal JS implementations is that when there is a bug in the runtime, it's much easier to track down and debug:

Bug reports usually come with a more meaningful stack trace, instead of only the last JS frame, or just a C++ crash, which requires back and forth with the user even to know where the problem happens
You can use Chromium devtools to debug internal JS code

Of course this is coming from a JS developer. I guess someone more familiar with C++ will say that C++ bugs are easier to debug for them 🤷🏻

Qard commented 1 year ago

Anyone who is not familiar with primordials can submit a PR that doesn't use primordials at all and collaborators can guide them to use primordials appropriately.

Another interesting point: we could just be not so strict about it and let contributors make PRs without them. Then, as we deem it valuable to have them, we can make low-hanging-fruit issues to add them to specific places or use those as tasks for our various code and learn type events. 🙂

joyeecheung commented 1 year ago

Bug reports usually come with a more meaningful stack trace, instead of only the last JS frame, or just a C++ crash, which requires back and forth with the user even to know where the problem happens

I do not think that's always true - look at our reliability reports, for example (e.g. check out pummel/test-heapdump-shadow-realm from https://github.com/nodejs/reliability/issues/658), normally when the process crashes in C++ the signal handler should unwind the stack and print the stack trace to stderr. And because we compile with -Os by default we have meaningful stack trace for C++ in release builds too. The reports that contain only addresses in the stack trace are likely to be compiled by a third-party with different options or from a system that is configured to make the signal handler not work as well. By default crashes in C++ should result in a meaningful stack trace in most systems. We could also print the JS stack trace too if necessary (node-report does that when it's enabled).

You can use Chromium devtools to debug internal JS code

I am not sure if that's a pro for JS or not, there are also a wonderful list of tools one can use to debug C++ too, including powerful ones like rr that supports record and replays, and there are no equivalents for V8's JS (not in active development anyway) AFAIK.

isaacs commented 1 year ago

I think it's clear that "perf vs security" isn't the relevant question here. The perf cost is rarely relevant with modern v8, and the security benefit is incomplete and thus mostly irrelevant.

The issue is mutated intrinsics breaking the platform, vs the loss of contributions.

There's a lot of talk about "some contributors" here, so let me just state it plainly with a real example: I don't contribute much to node core almost entirely because primordials are a pain to work with. And I'm definitely not a "new" contributor.

aduh95 commented 1 year ago

Practically, I think it is unlikely that we will be able to remove all primordials. It might be better if we solve this problem on a case-by-case basis because there are places where it makes sense to use primordials and there are places where it doesn't.

It was also the conclusion of the discussion last time we had it. We never followed up on the vote, but the idea was to try to define what core modules were not necessary to be covered with primordials. The issue is that it's going to be very hard to come up with a list that will have consensus, and also all builtin modules rely on one another, so if decide primordials must be enforce for module A, it makes it hard to remove them in module B unless we can sure that module A will never depend on it.

There's a lot of talk about "some contributors" here, so let me just state it plainly with a real example: I don't contribute much to node core almost entirely because primordials are a pain to work with. And I'm definitely not a "new" contributor.

OK thanks for sharing. I wonder if we could make some kind of statistical analytics of the contributions on the Node.js codebase before and after the switch to primordials, to see if there's a measurable trend.

anonrig commented 1 year ago

Is the energy we spend on discussing, looking for regressions, implementing, and expecting from new collaborators to learn primordials really worth having them?

"Death by a thousand cut" - primordials is a part of it. V8 can optimize certain primordials right now correctly right now. This doesn't mean that it will continue to do that with the next major release, and we don't have the resources & benchmarks to validate it on every release.

If we do want to have the security of primordials, I think we should implement those functions in C++, not keep primordials.

aduh95 commented 1 year ago

If we do want to have the security of primordials, I think we should implement those functions in C++, not keep primordials.

That's easier said than done, it's not something the TSC can decide, it can only happen if contributions are made to make it happen. Primordials were introduced in the codebase by contributors, not by a TSC decision; similarly, they can be removed by contributors. It's not up to the TSC – unless Node.js collaborators cannot get to a consensus when reviewing a PR, only then a TSC vote can resolve the situation. That being said, I'm not convinced moving to C++ will lower the bar for contribution. It would definitely do a better job than primordials at protecting Node.js internals against global mutations.

This doesn't mean that it will continue to do that with the next major release, and we don't have the resources & benchmarks to validate it on every release.

That could be true for anything, that doesn't sound like a good argument.

GeoffreyBooth commented 1 year ago

We seem to have consensus that it’s harder to work in primordials than not. At least most people feel that way, and while some may say primordials make no difference I doubt anyone would argue that primordials make contributing easier. So it’s just a question of whether the cost to developer productivity (and indirectly, to contributions generally) is worth the benefit.

In my view, if we don’t use primordials everywhere—and it seems like we don’t, or can’t—then we’re not getting any security benefit. A fence with a hole in it isn’t much better than no fence at all. So I don’t see what benefit we’re getting out of primordials. If we thought that there was a potential future where we could get to 100% primordials usage—a fence with no holes—then at least there’s a decision to be made as to whether the security benefit is worth the productivity cost. But if that’s not on the horizon, then it seems like we’re paying a big cost and getting nothing in return.

aduh95 commented 1 year ago

@GeoffreyBooth maybe you missed it, but as it was said several time in this thread, primordials are not a security feature. If Node.js code were using 100% primordials, there would still be holes, because that's how the ECMAScript spec is written. Primordials are a tool, sometimes useful, sometimes not. “if we can't have it everywhere, let's have it nowhere“ is not a good take, the reality is more nuanced than that.

GeoffreyBooth commented 1 year ago

@GeoffreyBooth maybe you missed it, but as it was said several time in this thread, primordials are not a security feature.

I mean, their purpose is to prevent user code from overriding prototype methods and gaining access to Node internals, right? For what reason would we do that other than security?

Or put another way, what benefit are we getting out of primordials? Regardless of whether it’s worth the cost; what is the gain?

aduh95 commented 1 year ago

I recommend reading Joyee's comment to get context and informed critic of primordials: https://github.com/nodejs/TSC/issues/1438#issuecomment-1715314182

If you want a TL;DR, the short answer to your question is

I do not view primordials as a proper security mechanism, I see them as a way to enhance user experience (i.e. you don't want to delete Array.prototype.something and then blow up the REPL). Sure, not having 100% coverage means Node.js may still break under certain input, but as all user experience improvements it can always just be a best-effort thing.

GeoffreyBooth commented 1 year ago

you don’t want to delete Array.prototype.something

I don’t really see “As a user, I want to be able to delete methods from prototypes and still have everything function correctly” as a use case that we need to support. This feels like an extremely minimal gain that few users have ever benefited from; was there an issue opened before primordials were added where anyone complained about Node’s pre-primordials behavior in this regard?

Even if there are a few users out there who appreciate this new behavior, I think the cost of continuing to use primordials is much too costly to us in terms of time of effort and decreased contributions to justify maintaining that UX.

joyeecheung commented 1 year ago

was there an issue opened before primordials were added where anyone complained about Node’s pre-primordials behavior in this regard?

I don't think there's an easy keyword you can use but https://github.com/nodejs/node/issues/18795 can give you some clues e.g. https://github.com/nodejs/node/issues/12956 , which leads to a pretty convincing point (at least for me): if I can do it in Chrome, why can't I do it in Node.js? Is it because Node.js is not as robust as Chrome? (I guess it's okay to just give up and say, "yeah we are not as robust as Chrome", but just to give some history about why this thing exist).

benjamingr commented 1 year ago

@GeoffreyBooth here is a robustness argument:

Take the promise unhandled rejection error handling code for example.
Say the user used a library that incorrectly polyfilled Error.prototype.stack, let's say to add more content there
The implementation has an edge case that throws an error when populating that property.
The user runs into the edge case, a promise is rejected and ends up in the "possibly unhandled rejections" list.
Without primordials in the error path, if the error path happens to create an error and that throws the user gets a super confusing error without a stack trace (and at the wrong time, preventing the user from handling the error!)
With them they get an error indicating where the actual failure was.

The example is contrived but I've seen that in real code, debugging and error flows are places we should strongly favor developer experience over performance.

Additionally in web standards we literally have to use them in order to meet the web standard, we can decide to be non-compliant but I think without extreme justification that's a hard pill to swallow.

nodejs / TSC

Removing `primordials` from Node.js project #1438

History

Proposal

Pros

Cons

Suggestion