nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org
Other
105.18k stars 28.49k forks source link

Migration of core modules to primordials #30697

Open targos opened 4 years ago

targos commented 4 years ago

This is a tracking issue for the migration of core modules to use builtins from the primordials object.

At the moment, for performance-sensitive code, only static methods and global functions should be migrated. V8 8.0 might have the optimization that we need to migrate prototype methods.

aduh95 commented 4 years ago

Is there somewhere I can find out about primordials and why/how it is better than using global builtins? Maybe a recording of the meeting when this decision was taken? Is the primordials object documented somewhere?

amitmtrn commented 4 years ago

@aduh95 from my understanding it made in order to avoid user mutation in core libraries https://github.com/nodejs/node/blob/141a6e34eed05577edf43ad085a74f330d0559cb/lib/internal/per_context/primordials.js#L5-L7

BridgeAR commented 3 years ago

We discussed this a few times in a TSC meeting but so far there's no clear indication where we want to go with primordials as there are different opinions about them.

So far I struggle seeing a good argument using them. There are a couple of downsides though. We ran into multiple performance issues and we had to revert multiple PRs because of that. The code is significantly less readable and potentially hinder further contributions as people might struggle using them. The main reason I struggle is that I can't see a use case where it's really beneficial to use them. Node.js alone might change all it's code to use primordials and it won't crash anymore in case things get monkey patched in a way that is not supported but all other code would still have that problem. Pretty much all applications use a lot of external libraries and thus, external code that would not be protected by primordials. As long as that's not the case, there's pretty much no benefit by protecting Node.js alone. The only place that would really benefit from this is the REPL. But this is not a strong enough reason for me to change all of our code.

We also have to invest a lot of time to review and maintain this code and this is something we should also take into account. Thus, I suggest to take a step back and to stop changing more code to use primordials as long as we do not have a guarantee that there's a) no performance penalty, b) no decrease in code readability.

jasnell commented 3 years ago

I really haven't ever been bought in 100% on the primordials stuff. I get what it's trying to do, and I appreciate the work that's gone in on it -- and I really don't want anyone who has put that effort in to feel like it's been wasted at all. However, the issues @BridgeAR highlights are very real. Use of primordials does have a non-trivial impact on performance and it makes the code less readable. While it addresses one kind of potential security and stability issue, it leaves more critical ones on the table. I wish I had a great alternative solution, but I don't. All I know is that I really dislike having to use primordials right now.

targos commented 3 years ago

I also feel the same and it's why I personally never pushed for using primordials for prototype methods.

aduh95 commented 3 years ago

The main reason I struggle is that I can't see a use case where it's really beneficial to use them. Node.js alone might change all it's code to use primordials and it won't crash anymore in case things get monkey patched in a way that is not supported but all other code would still have that problem.

Contrary to "all other code", Node.js API provides APIs that should be temper-proof IMHO. For example, as a user I don't expect a console.log call to rely on mutable prototype method, and I might want to do something like:

const { bind } = Function.prototype;
Function.prototype.bind = function bind() {
  console.log(this, arguments);
  return Reflect.apply(bind, this, arguments);
}

That is currently not possible because our Console implementation happens to rely %Function.prototype%.bind and the process crashes with a stack overflow exception. If you try to use this code in a browser, everything still works and each .bind calls creates a log in the DevTool console.

I want to argue that some of the Node.js API is not like any other library code, Node.js is a JavaScript runtime, I don't think it is acceptable to crash the whole process when parsing perfectly valid ECMAScript. I feel quite strongly about that. I think we need to find what part of the API deserve to be temper-proof – e.g., maybe it's OK for http to rely on primordials, it's not for repl, inspector, console, worker_threads, etc.

The code is significantly less readable and potentially hinder further contributions as people might struggle using them.

I assume you mean switching from the "Java-like syntax" (<array>.map()) to the "PHP-like syntax" (array_map()). I wish we add the pipeline operator proposal:

ArrayPrototypeMap(
  ArrayPrototypeFilter(array, () => true),
  () => {}
);

array |> ArrayPrototypeFilter(() => true)
      |> ArrayPrototypeMap(() => {});

Unless we are ready to add a transpiler step to our build chain, I'm afraid this is out of the question until the proposal moves forward. My opinion on this the readability concern should come after robustness of core APIs, but I understand I may be alone on this one.

Performance issues are a valid concern that I share. I agree we need to do more benchmark runs before landing primordials related PRs, and we should prefer perf over stability.

Thus, I suggest to take a step back and to stop changing more code to use primordials as long as we do not have a guarantee that there's a) no performance penalty, b) no decrease in code readability.

I don't agree: if we stop working on primordials, we won't know when there's no performance penalty in using them. Also what kind of guarantee would you expect? How would you mesure an increase in code readability?

joyeecheung commented 3 years ago

I now think enforcing the use of primordials with a lint rule went a bit too far - the more I interact with code now full of primordials, the more I feel that many of the primordial usage aren't really necessary, probably because I primarily deal with bootstrap code these days and most of them are run before the user has a chance to monkey patch anything.

tniessen commented 3 years ago

Use of primordials does have a non-trivial impact on performance and it makes the code less readable. While it addresses one kind of potential security and stability issue, it leaves more critical ones on the table. I wish I had a great alternative solution, but I don't. All I know is that I really dislike having to use primordials right now.

+1. It's happened more than once that I had to update PRs just to switch to primordials, and not only is this additional work, it also does make the code much less readable.

I believe that most new contributors start working on Node.js because most of it is written in JavaScript, the same language that Node.js users rely on. Primordials are causing larger differences between JavaScript within Node.js core and JavaScript outside of Node.js core, and this is a burden not only for maintainers, but also for new contributors.

I'd still love to see how other V8 consumers do this. I believe that Chromium implements most of their APIs in C++, but maybe their team does have a different method.

tniessen commented 3 years ago

Also, I am not sure what the scope for primordials is. For example, should this be prevented by core?

const path = require('path');

Object.prototype.base = 'important file';

const file = path.format({ dir: 'C:\\Windows', name: 'foo', ext: 'txt' });

// Writing to file will overwrite 'C:\\Windows\\important file'.
Trott commented 3 years ago

This was discussed very briefly in the TSC meeting today. @nodejs/tsc Please review and comment!

joyeecheung commented 3 years ago

I think what we need now is to settle down on the criteria for usage of primordials, and write it down in the style guide. At a high level we could choose between:

  1. Use primordials by default, unless in these circumstances: performance-critical paths, etc.
  2. Write JS as usual by default, and only use primordials in these circumstances: in REPL, etc.

Would this be an actionable next-step for the discussion?

targos commented 3 years ago

Also, I am not sure what the scope for primordials is. For example, should this be prevented by core?

const path = require('path');

Object.prototype.base = 'important file';

const file = path.format({ dir: 'C:\\Windows', name: 'foo', ext: 'txt' });

// Writing to file will overwrite 'C:\\Windows\\important file'.

In my opinion, yes, this should be prevented. But this is not exactly related to primordials. It's more a question about impact of "prototype pollution" on core APIs. I think we made sure that child_process APIs are not affected by this for example.

aduh95 commented 3 years ago

Also, I am not sure what the scope for primordials is.

I agree that the scope is something we need to define, as currently it's quite fuzzy indeed.

For example, should this be prevented by core?

I think in this particular example, it would depend the code is from userland or from core:

FWIW, it seems to be the philosophy of the ECMAScript spec already:

Object.prototype.configurable = false;
Object.prototype.enumerable = false;
Object.prototype.writable = false;
Object.defineProperty(globalThis, 'myProperty', { value: 4 });

console.log(delete globalThis.myProperty); // false <=> myProperty is not configurable
console.log(Object.keys(globalThis).includes('myProperty')); // false <=> myProperty is not enumerable.
myProperty = 6; console.log(myProperty); // 4 <=> myProperty is not writable.

Another way of looking at it would be that, since the path core module is not part of any official spec, we get to decide what are its limitations, and we can say that one of them is: it expects JS built-ins not to be mutated from userland.

If I had to define which APIs "must" be temper-proof (so the minimal scope for primordials, if you will), it'd be:

For the rest of core, temper-proof code would be a nice-to-have IMHO, but it's fair enough if that's not something Node.js is interested to get to.

tniessen commented 3 years ago

We would typically use ObjectCreate(null) in this case

Since primordials are actively trying to allow users to modify built-in JavaScript objects, I suppose such changes would probably be semver-major.

More questions:

// In core (alternatively using ObjectCreate):
function coreFunction() { return { foo: 10 }; }

// User code:
Object.prototype.bar = 20;
const obj = coreFunction();
// Should obj.bar exist here?
// In core (alternatively using something less readable):
function coreFunction() { return () => 10; }

// User code:
Function.prototype.bar = 20;
const fn = coreFunction();
// Should coreFunction.bar exist here?
// Should fn.bar exist here?
// In core (alternatively using something less readable):
async function coreFunction() { return 10; }

// User code:
Promise.prototype.bar = 20;
const promise = coreFunction();
// Should promise.bar exist here?
aduh95 commented 3 years ago

Since primordials are actively trying to allow users to modify built-in JavaScript objects, I suppose such changes would probably be semver-major.

Not necessarily, we usually land those as semver patches currently. AFAICT Node.js doesn't pretend all the objects its core functions return is an instance of Object. If you let me correct you, I'd say primordials are trying to allow users to modify built-in JS objects without modifying the behavior of (some?) internal core components.

For all your questions, it depends if the return value of coreFunction is exposed to user-land at any point. IMHO:

ljharb commented 3 years ago

It depends on the intention. Yes, if i pass an options object and it gets inherited properties, it should pick up anything that’s been added to Object.prototype, because that’s the semantics of an object in JS. Any object returned from core would follow the same semantics, and should largely act as if user code constructed it (just like language and web behavior).

However, platform code, from the perspective of the user, is not JavaScript, and it should be impossible for me to cause it to break, or expose internal impl details, simply because i know it’s written in JS.

Trott commented 2 years ago

I'm going to (try to) set up a meeting for people to figure out reasonable paths forward on this stuff. I'll definitely include @aduh95 @bridgear @ChALkeR and @tniessen. If you want to be included as well, please email me. (My email is in my GitHub profile.) Thanks.

gireeshpunathil commented 2 years ago

we met yesterday to discuss this, and here is the summary. @BridgeAR @tniessen @ChALkeR @joyeecheung @aduh95 - pls add / correct if I missed or mis-represented anything:

gireeshpunathil commented 2 years ago

doodle for next week's sitting: pls express your availability: https://doodle.com/poll/xnzeg2d52xsf3h29

mcollina commented 2 years ago

I would like to add the incompatibility of Jest with our approach to primordials. Jest replaces all the globals for each test run. This results in node core errors failing a instanceof Error check.

I would like to add a one more option to the list: undo the migrations that are known to cause performance issues.

targos commented 2 years ago

Isn't the issue with Jest that they run the tests in a vm.Context ?

tniessen commented 2 years ago

I'd like to add to @gireeshpunathil's summary a few thoughts:

bmeck commented 2 years ago

@tniessen we had a lot of discussion on the incremental migration around monkey patching in the past. See things like https://docs.google.com/document/d/1h__FmXsEWRuNrzAV_l3Iw9i_z8fCXSokGfBiW8-nDNg/edit#heading=h.v9j0bjkph3tv

So it isn't just about if we want/don't want to allow monkey patching, but how do we even have a security model if no part of core can be considered safe/reliable/robust.

tniessen commented 2 years ago

So it isn't just about if we want/don't want to allow monkey patching, but how do we even have a security model if no part of core can be considered safe/reliable/robust.

I read that part of the document you referenced:

There should be a way for trusted resources to be loaded ahead of the main app and given the capability of mutating primordials, after which point primordials will be frozen and the main application will be loaded.

Based on this idea and your comment, it sounds like this is in support of using primordials everywhere in Node.js core? Please correct me if that's not accurate.

bmeck commented 2 years ago

@tniessen it doesn't have to be primordials, but some way of being able to trust node core not to be infected by 3rd party dependencies is the goal. primordials prevent infection from the JS runtime, but not from cross module linkage (like things that replace fs.readFile from lib/fs.js to manipulate module loading logic in lib/internal/modules/cjs/loader.js)

tniessen commented 2 years ago

it doesn't have to be primordials

@bmeck Oh, sure. I wish we did have an alternative, but so far, I don't think we have found one.

but not from cross module linkage

Right, I believe that's mirroring my previous comment:

What justification is there to "protect" Node.js against modifications of builtin prototypes, but not against modifications of Node.js core modules?

bmeck commented 2 years ago

What justification is there to "protect" Node.js against modifications of builtin prototypes, but not against modifications of Node.js core modules?

The document linked goes over this as a progressive step after the JS runtime vector is stemmed. We actually do this in various places I work on in core. See things like https://github.com/nodejs/node/pull/39513 , which would make entire features like https://nodejs.org/api/policy.html non-viable for ESM w/o the extra steps you need to do to contain the leakage in CJS (such as holding onto safe references [most CJS implementation stuff is using destructured safe references these days! yay]).

tniessen commented 2 years ago

as a progressive step after the JS runtime vector is stemmed

I am not against doing that at all. I read dozens of comments related to primordials this week and none of them mentioned in this part, I think. So it's good to know that the plan is for primordials to be extended beyond the JS runtime.

ExE-Boss commented 2 years ago

IMO, at the very least, static methods where the primordials contains a SameObject reference to the function should always use the primordials from a local const reference, e.g.: changing Reflect.apply and the like to:

const {
    ReflectApply,
} = primordials;

Because the former has to first resolve the mutable Reflect property from the global object, and then resolve the mutable apply property from the Reflect object, whereas the latter only has to resolve the ReflectApply const from the local declarative environment.

gireeshpunathil commented 2 years ago

the doodle converged into Aug 12th 4 PM UTC. I sent an invitation to everyone who signed up. If any one else want to join, pls send me a mail, and I will add you.

targos commented 2 years ago

We should probably talk about https://github.com/nodejs/node/pull/38635

tniessen commented 2 years ago

Thank you @targos, I was not even aware of that PR. It seems that it is mostly a guide to using primordials and does not seem to address the goals and intentions behind primordials beyond "makes the code more reliable."

gireeshpunathil commented 2 years ago

We had a 90 minutes meeting yesterday with a lot of fruitful discussions, but without convergence on a specific recommendation on what to do with primordials. However, we all agree that the scopping for primordials (what we think are important in terms of security, performance, robustness, readability) is a good next step, and then meet again to derive consesus.

Condensing the discussion into few discrete view points. Feel free to correct me wherever I might have mistaken / misrepresented statements.

(i) @ljharb et. al

(ii) @bmeck et. al

(iii) @BridgeAR et. al

(iv) @mcollina et. al

@ChALkeR has come up with a spreadsheet to help define the scope for primordials

https://docs.google.com/spreadsheets/d/1XuTUhXiMaEmeCQyznM1Nad_wCMKGyXLM3HCK08C2m9w/edit#gid=0

request all the interested parties to fill it up and contribute / improve. Also, let us continue the discussin in this issue, inline.

Scope for the next meeting:

ronag commented 2 years ago

project's entry barrier is raising - newcomers are less motivated to contribute due to complex construct and less of JS

I'd just like to emphasis this point since I feel it often doesn't get assigned the importance I believe it is due. This is one of those values that might not be important to current contributors (which are the ones that make decisions) and is therefore maybe less prioritized. However, people do move on and it is important for the health and future of the project that we have a steady influx of new contributors.

ronag commented 2 years ago

@gireeshpunathil Is anyone welcome to add to that spreadsheet or is it just intended for those that are more actively involved in these discussions?

gireeshpunathil commented 2 years ago

@ronag - pls go ahead!

ljharb commented 2 years ago

The other point I'd add to my list is, that "primordials" isn't the biggest barrier to contribution, and while those are incredibly important to address, there's far more impactful things to focus on than primordials, like overall coding style, and non-JS code.

bmeck commented 2 years ago

I think even if primordials aren't the most impactful, they are a barrier. I agree we could reduce other barriers by doing things like introducing a formatter for core, but primordials would still be a barrier since they are more than just formatting and cause a different dialect of JS to be used. Core does essentially write non-idiomatic JS and we shouldn't be trying to frame writing core in JS as an absolute win without downsides. Non-JS code in core doesn't have to have that barrier and can write code in other languages in a more idiomatic manner for example and completely remove some of that tension.

bmeck commented 2 years ago

I've made a few thoughts on this in a document and will leave it here if others wish to read it before any more meetings: Behavior Guarantees In Node

gireeshpunathil commented 2 years ago

to take this to a forward, I setup a doodle poll for a 3rd meeting, please poll. https://doodle.com/poll/i644hq3qsei87gw5

My expectation through this meeting is to be able to refine and potentially reduce the 4 arguments in https://github.com/nodejs/node/issues/30697#issuecomment-898211247 to may be 1 or even 2, in which case present those to the TSC subsequently for a vote.

gireeshpunathil commented 2 years ago

the only one slot that works for every one is Sep 13th 3 PM UTC. Going to schedule one for that time.

gireeshpunathil commented 2 years ago

I have invited all the previous participants to the meeting, apart from those who responded to the doodle - @mcollina @bmeck @targos @tniessen - pls see if you are able to make it.

mcollina commented 2 years ago

I'll do my beet to attend

gireeshpunathil commented 2 years ago

the meeting got disconnected at 40 mins. thanks everybody for the great discussion!

bmeck commented 2 years ago

We tried to talk about alternatives in the last meeting and I did a lot of digging. https://github.com/tc39/proposal-extensions/issues/11 might be of interest to an alternative to primordials if we can get a few tweaks / additional features. https://github.com/tc39/notes/blob/8711614630f631cb51dfb803caa087bedfc051a3/meetings/2017-11/nov-30.md#9iiia-block-params-to-stage-1 is also a much older form but seems to be less realistic / likely.

ljharb commented 2 years ago

There will be a newer proposal on the next agenda that would replace both of those, that indeed would work quite well for improving the ergonomics of invoking primordials.

js-choi commented 2 years ago

For reference, the proposal to which @ljharb refers to is js-choi/proposal-bind-this-operator. I do indeed plan to present it at the TC39 October plenary for Stage 1, as an alternative to both prior proposals. Feedback and examples regarding Node’s internal use cases are welcome in js-choi/proposal-bind-this-operator#5 and js-choi/proposal-bind-this-operator#7.

tniessen commented 2 years ago

Sorry, I wasn't able to attend the last meeting. Based on my understanding of these comments, these new proposals are in support of primordials and do not change the way they work but only change the syntax of using them. Is that correct?

ljharb commented 2 years ago

@tniessen correct - the bind-this operator would allow arr->ArrayPrototypeSlice(x, y), where ArrayPrototypeSlice is not bound (which is the source of most of the performance concerns), instead of ArrayPrototypeSlice(arr, x, y) where ArrayPrototypeSlice is bound.

gireeshpunathil commented 2 years ago

@ljharb , @bmeck , @BridgeAR - is it possible for you to attend the next TSC sitting (October 7th, 11AM EDT) and pitch in your viewpoints as well as answer any questions the team may have? We wish to arrive at an optimal decision on this, and your insights and assertions are greatly appreciated! Please let me know.