CITGM abuse - Githubissues

vkurchatkin commented 7 years ago

Hi everybody.

I'd like to bring up an issue of potential abuse of current CITGM process. As it is, it allows maintainers of included packages to abuse Node core and internal APIs without the risk of future breakages, thus strong-arming Node core to maintain compatibility with their code.

I don't think that there is a precedent here, but is seems that author of https://github.com/standard-things/esm is planning to do just that. His code contains usages of the following internal APIs (list is not comprehensive):

new Module
Module.wrap
Module._compile
Module._extensions
Module._resolveFilename
Module._cache
Module._nodeModulePaths
path._makeLong
process.binding
- fs
- getStatValues
- stat
- internalModuleReadFile
- internalModuleStat
- natives
- util
- arrow_message_private_symbol
- decorated_private_symbol
- decorateErrorStack
- setHiddenValue
process._tickCallback
process._preload_modules
REPLServer.prototype.createContext

While this is fine for a standalone obscure package that nobody uses, he intends to use in his other package, lodash, which is one of the most dependent upon packages on npm and also a part of CITGM testing. Furthermore, it seems that the only reason he wants to do this is to exploit CITGM and to force Node core to support these APIs indefinitely.

Here is a direct quote from here:

And from twitter:

All of this obviously being done in bad faith. That's not to mention that @jdalton has blocked me on Twitter immediately after that, preventing even a possibility of constructive discussion.

I'm not sure what should or can be done here, but I thought it was important to bring this issue to attention of CTC. We already had a crisis with graceful-fs and npm in the past, and this one has a potential to be much worse, so we should do everything possible to prevent it, until it's too late.

mcollina commented 7 years ago

I think we should work with module authors to provide the APIs they need. I think some of those cases can't be implemented without private properties usage. If those things needs to be part of the public API contract, I would recommend @jdalton and others to open issues on core instead of unilateral actions. We are working to solve these issues, let's work together.

If it happens on purpose, then I'm +1 in removing preeemptively any package that does that. CITGM is there to avoid breakage of things that are already in the wild, and it requires the author cooperation. If there is no author cooperation, then the module can be removed (even immediately).

The lodash community is massive, and I recommend everyone not to use it as a threat.

jdalton commented 7 years ago

Nothing is being done in bad faith.

Here's the full Twitter thread where I directed vkurchatkin to our issue tracker to help. To which he repeated his drive by Twitter comments and then put me on blast here 😕

I've released @std/esm with a looow pre-1.0 version number and forecasted my rough plans on purpose, to allow time for things to shake out. I'm also working with folks like @bmeck to ensure the @std/esm loader aligns with Node's ESM plans and doesn't step on toes.

The approach taken by @std/esm is similar to other packages like babel-register, reify, or flow-remove-types.

I think we should work with module authors to provide the APIs they need. I think some of those cases can't be implemented without private properties usage.

This would be fantastic!

If those things needs to be part of the public API contract, I would recommend @jdalton and others to open issues on core instead of unilateral actions. We are working to solve these issues, let's work together.

Yes, awesome!

If there is no author cooperation, then the module can be removed (even immediately).

I think I've proven over the last 5 years with Lodash, and other efforts, that I'm all for cooperation and working together to improve thing. If there's a more proper way to do something I'm up for it 😃

jasnell commented 7 years ago

This would be fantastic

In order to proceed down that path, we would:

A) a better idea of what new public APIs would be necessary

B) a commitment on the part of module developers to move towards using those new APIs rather than the internal private APIs.

refack commented 7 years ago

IMHO there are two separate issues here:

@std/esm
Usage of private APIs in popular modules.

RE (1) I've actually suggested to @bmeck that we should declare @std/esm as the transition tool to ESM for node versions that don't support them natively (maybe even package it like we do npm). So IMHO this is a special case.

RE (2) AFAIK there is a famous precedent, graceful-fs, which goes to show that CitGM or not, changing private APIs might have significant repercussions. IMHO CitGM is just a tool for us to make better educated decisions, and allows proactive collaboration with package maintainers.

A possible long term goal might be package certification, where the certifying body could also assess a package's exposure to private and experimental API.

vkurchatkin commented 7 years ago

IMHO there are two separate issues here:

Actually, my primary concern is the third one: CITGM abuse. CITGM gives package authors a certain amount of power over Node core. We need a mechanism for preventing abuse of this power and holding them accountable.

jdalton commented 7 years ago

CITGM gives package authors a certain amount of power over Node core. We need a mechanism for preventing abuse of this power and holding them accountable.

CITGM just gives a heads up to breaking swaths of the ecosystem and allows coordination before hand to reduce issues where possible. CITGM is powered by and for the ecosystem. It's not the whip of the ecosystem. Node's on the path to unexposing internals for its ESM branch and I think that's a good way forward to reducing its compat concerns. For CJS, I'm all for solidifying in some way the existing paths.

refack commented 7 years ago

CITGM gives package authors a certain amount of power over Node core. We need a mechanism for preventing abuse of this power and holding them accountable.

I think it's the popularity that gives them the power anyway. And since we follow the "lean core" paradigm IMHO that will continue to be. So as I see it cooperation is the best way to keep the ecosystem working smoothly. So I look at CitGM as the opposite, i.e. as a communication channel. It allows us to be proactive when we see a package break and communicate that to the author sooner rather than later. I agree that if the breakage is due to internal APIs use, responsibility for fixing should be pushed to the package. In that sense we do hold the authors accountable.

refack commented 7 years ago

We need a mechanism for preventing abuse of this power and holding them accountable.

IMHO the best way would be an extension to CitGM that audits and publishes quality matrixes for the packages (certification lite), where internal API use should be one of the parameters.

jdalton commented 7 years ago

@refack

RE (1) I've actually suggested to @bmeck that we should declare @std/esm as the transition tool to ESM for node versions that don't support them natively (maybe even package it like we do npm). So IMHO this is a special case.

I brought this up too (at your suggestion). The @std/esm loader has already proven useful in finding under-specified behavior in the ESM proposal and helping inform users of Node's intended ESM plans. There may be, at the very least, opportunities for cross blog posts on things like the differences between it and Node's native ESM support.

bnoordhuis commented 7 years ago

Nothing is being done in bad faith.

It is however a fool thing to do. The way e.g. https://github.com/standard-things/esm/blob/master/src/fs/read-file.js uses internalModuleReadFile() with the excuse of it being faster is just asking to get burned sooner or later - internals can and do change without notice, that's why they are internal.

You know the development process well enough to know that and since you intend to ram this through by having lodash take a dependency, I can see why @vkurchatkin thinks it's in bad faith. You are not an idiot so what are you then?

What do you expect to come from this? The path of least effort for us is to simply drop lodash from citgm and add a warning to the release notes saying "lodash is known to be incompatible with newer node.js versions, use library X instead."

So. What's your end game?

jdalton commented 7 years ago

@bnoordhuis

uses internalModuleReadFile() with the excuse of it being faster is just asking to get burned sooner or later - internals can and do change without notice, that's why they are internal.

That's why its use is wrapped in catches and fallbacks.

The path of least effort for us is to simply drop lodash from citgm and add a warning to the release notes saying "lodash is known to be incompatible with newer node.js versions, use library X instead."

Yikes. Lodash and @std/esm follow semver so folks can always use their version ranges to ensure bug fixes make their way to them.

So. What's your end game?

My end game is to have Node way ESM in Node 4+ so that I and others can use it. My effort also placates devs who would otherwise be hostile towards Node's ESM plans and allows them a route instead of say forking Node (which has been floated by).

jdalton commented 7 years ago

I would ask for a little patience and for folks to try not to project negative intent.

From my point of view I had a random dev troll me on twitter, claim to have solutions, and despite repeated requests to help, choose instead to put my character on blast. Then have another dev, whose had trouble with inclusivity in the past, take pot shots at my code and more than subtly threaten my projects.

I've tried to make this thing work while working with folks from Node land, the community, npm, tc39, etc. I have a strong history of supporting Node, its ecosystem, and its users. I'd ask for a little respect and credit here that I'm not some hostile entity.

Of course I'm up for submitting @std/esm to CITGM through proper channels, things are in a bit of chicken and egg state at the moment. However, I'm thankful for @vkurchatkin bringing the topic if that means we can work to solidify hooks and make things better.

Updated Crossed-out personal comments.

jasnell commented 7 years ago

I'd have to agree with @jdalton that the comments in this thread haven't been overly helpful. Let's please keep it more constructive.

vkurchatkin commented 7 years ago

From my point of view I had a random dev troll me on twitter, claim to have solutions, and despite repeated requests choose instead to put my character on blast. Then have another dev, whose had trouble with inclusivity in the past, take pot shots at my code and more than subtly threaten my projects.

Sorry, but this comment just shows one more time how hostile you are.

I never claimed "to have solutions", I've expressed my belief that solutions exist. Even if I had, I don't owe you them. It's your responsibility to find them.

However, I'm thankful for @vkurchatkin bringing the topic if that means we can work to solidify hooks and make things better.

I don't believe that is the case. After all, blocking me in twitter doesn't seem to be a sign of gratitude, but rather a sign of acting in bad faith.

That said, I think we should focus on discussing general policy, not this particular situation.

bnoordhuis commented 7 years ago

@jdalton I was giving you the benefit of the doubt but your last comment leaves me no choice but to assume that you are simply looking for a fight. I don't think it's in our best interest for us to engage with you anymore.

jdalton commented 7 years ago

@vkurchatkin @bnoordhuis I'm sorry if I've upset either of you. I don't want beefs and don't think they're appropriate or helpful.

I'm up for working with those willing: @jasnell, @refack, @bmeck, @MylesBorins, etc. ❤️

MylesBorins commented 7 years ago

This thread doesn't really have anything actionable imho

Decisions regarding releases are made on a case by case basis by the release team

Citgm results are analyzed, and when appropriate modules are removed

We have never had to stop a release for a broken module, we have delayed a release because that breakage showed serious ecosystem damage.

I'd like to suggest we close this thread. The esm discussion is important, but not entirely relevant based on the initial request. Have faith that the release team will make informed decisions. If we have concerns that certain modules are heading in a bad direction we should work with the maintainers to avoid that from happening.

Edit: sorry for closing, not my intent

gr2m commented 7 years ago

Vladimir very nicely laid out a list of internal modules that @std/esm is using currently. I would suggest we complete this list and then give John-David and others time to comment each one of these.

Any suggestions on how we can split up the seperate private API used by @std/esm into separate discussions and work them off one-by-one? I feel everyone wants to come to a consensus, but please give people some time :)

jdalton commented 7 years ago

@gr2m

I would suggest we complete this list and then give John-David and others time to comment each one of these.

I've replied the initial list here.

Any suggestions on how we can split up the seperate private API used by @std/esm into separate discussions and work them off one-by-one? I feel everyone wants to come to a consensus, but please give people some time :)

I'm up for this too but maybe it's more appropriate for a separate issue than this one. That way we can be on more even footing and not coming from a defensive place.

mcollina commented 7 years ago

Given history, I think we should encourage module authors (and I put myself in this group) to file feature requests rather than using internals. As a group, we need fo listen and maybe implement those features, or provide a way to do the same thing.

On the other end, I do not think lodash is the right place to do this level of experimentation because of the major exposure. However, it is not my module, we are free to disagree and I will support you anyhow in implementing the needed APIs.

jdalton commented 7 years ago

@mcollina

I will support you anyhow in implementing the needed APIs.

Thank you! At the moment the @std/esm package looks to work for Node 4, 5, 6, 7, 8, 9-nightly, 9-ESM-PR, and NodeChakra flavors. @bmeck is planning to do another round of ESM loader behavior reviews in a week or so and I'm wrapping up fixes and unit tests from his initial review. I'd love the possibility of switching to more proper APIs for Node versions that support them!

bnoordhuis commented 7 years ago

I'm sorry if I've upset either of you.

@jdalton Is that a passive aggressive "screw you" or a genuine apology? Hard to tell with text.

jdalton commented 7 years ago

@bnoordhuis I genuinely don't want you upset and would really appreciate a thorough and thoughtful code review, or anything you can contribute (test cases, various gotchas, tips, etc.) in the appropriate channel.

bnoordhuis commented 7 years ago

@jdalton Happy to hear that. I would like you to apologize explicitly for your inappropriate comment, though. I harbor no hard feelings to anyone who can admit to their mistakes.

jdalton commented 7 years ago

@bnoordhuis I'm sorry if my comment crossed a line. It was unnecessary. It was my intent to show the situation from my point of view, ask for patience, and deescalate tensions. Thank you for being open to continued discussion.

bnoordhuis commented 7 years ago

@jdalton Thank you, I appreciate that.

refack commented 7 years ago

I totally agree with @MylesBorins:

Citgm results are analyzed, and when appropriate modules are removed

We have never had to stop a release for a broken module, we have delayed a release because that breakage showed serious ecosystem damage.

And in this case I truly believe there has been no bad faith from @jdalton's side, only a misunderstanding.

ChALkeR commented 7 years ago

@jdalton Perhaps this is not the correct place, but my opinion is that making lodash (or anything else with significant impact) depend on those undocumented APIs mentioned here would be very harmful.

What would happen when someone else uses the same undocumented APIs?

E.g someones badly written business logic in the top end application? What would happen if monkey-patching those would suddenly hit those setups through a chain of dependencies?

Simple example — babel-node and others like that (yes, that is also not a good thing, but at least it stays at the very top and does not suddenly get into your deps).
Or, even worse — another (theoretical) esm fallback module. What would happen, if app uses A and B, A uses C, C uses lodash, lodash uses «@std/esm», and B uses «@example/esm-fallback», and those conflict?
A simpler question, but even more likely — what would happen if different versions of lodash would be loaded through dependency chains and those use different versions of «@std/esm»? Would those always be compatible between implementations?

There were noticeable conflicts causes by different versions of a popular module which simply monkey-patches String already in dependency chains. And this is touching a lot more of undocumented stuff and monkey-patches those.

Monkey-patching and touching a lot of private APIs is going to violate a whole lot of stability gurantees, and CITGTM won't even solve it for you.

What can be discussed here is which of those APIs do you want to be made public, documented, and made safe to be used. I doubt that the whole set for @std/esm to work would be possible, though.

ChALkeR commented 7 years ago

@jasnell, @nodejs/ctc Btw, this is a pretty good list of private APIs which should not be used. I think that we should investigate which of those are not widely used yet and runtime-deprecate those asap, before someone else tries to write another esm fallback on top of those and things like that hit popular npm packages, risking to break the ecosystem.

I will do the investigation part.

MylesBorins commented 7 years ago

Can we close this issue and open a new one specifically about the apis?

On Aug 20, 2017 2:07 PM, "Сковорода Никита Андреевич" < notifications@github.com> wrote:

@jasnell https://github.com/jasnell, @nodejs/ctc https://github.com/orgs/nodejs/teams/ctc Btw, this is a pretty good list of private APIs which should not be used. I think that we should investigate which of those are not widely used yet and runtime-deprecate those asap, before someone else tries to write another esm fallback on top of those and things like that hit popular npm packages, risking to break the ecosystem.

I will do the investigation part.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/nodejs/CTC/issues/164#issuecomment-323612038, or mute the thread https://github.com/notifications/unsubscribe-auth/AAecV7NVNd7mRf1VYOTS0uAHTNjH26OYks5saJ_4gaJpZM4O8phC .

ChALkeR commented 7 years ago

@MylesBorins +1 for a new one about the APIs, but as for closing this one — is the original issue about CIGTM misuse completely resolved?

jdalton commented 7 years ago

@ChALkeR

Perhaps this is not the correct place, but my opinion is that making lodash (or anything else with significant impact) depend on those undocumented APIs mentioned here would be very harmful.

The cat is already out of the bag in terms of CJS pseudo private API use. Many projects use them. For example, over the last 13 months Meteor has had 10s of thousands of apps relying on a similar approach for Node. That said, efforts are being made to tighten up ESM branches. I've worked to avoid stepping on toes, support versioning, opt-ins, etc.

FWIW I really dig the proposel of making some of the APIs more consumer friendly to reduce risks (https://github.com/nodejs/CTC/issues/164#issuecomment-323596902 and https://github.com/nodejs/CTC/issues/164#issuecomment-323608445).

ljharb commented 7 years ago

If depending on undocumented APIs is harmful, t seems prudent to strive to remove them - the idea that accessible things can be considered "private", whether documented or undocumented, is how we get into these situations.

ChALkeR commented 7 years ago

The cat is already out of the bag in terms of CJS pseudo private API use. Many projects use them. For example, over the last 13 months Meteor has had 10s of thousands of apps relying on a similar approach in Node.

This is a classic example of red herring, btw. Please, do not make it look like an argument.

Thanks for hinting the Meteor issue, though. Also, that increases the probability of things getting broken with @std/esm — the more things monkey-patch Module (and do other bad things like that), the higher are the chances that those will end up breaking one another in the end users code.

jdalton commented 7 years ago

@ChALkeR

Thanks for hinting the Meteor issue, though. Also, that increases the probability of things getting broken with @std/esm

@benjamn and I are actually working together since our projects are so similar. I'm super open to working with other projects, as well, to iron out compat issues if/when they come.

This is a classic example of red herring, btw. Please, do not make it look like an argument.

Oh boy, apologies. I'm not gonna get into that tit for tat debate style of communication. If you dig it, no worries, it's just not for me.

@ljharb I think having a loader like @std/esm help with the transition to the Node way of ESM is pretty great. While using pseudo private APIs isn't ideal, it does work for Node 4-9. If the concern is with Node 10 and beyond then a transition to more official or stable APIs and approaches can totally be done.

ljharb commented 7 years ago

@jdalton I totally agree! By "situation" I meant having concerns about breakage - my implication is that the existence of gray areas that are both accessible but not intentionally public is the problem. That their existence allows for the esm loader to exist is nice, but doesn't justify imo continuing to support that gray area going forward.

mcollina commented 7 years ago

@jdalton why do you want to change a fantastic no-dependency module like lodash to include a module that monkey-patches the module system of the Node.js runtime? This is part of the reason this thread become heated, and what I still do not understand.

The esm module is actually a nice idea for applications, and I think we should recommend it if anyone wants to try modules. Still, it monkey patches internals, and after it's loaded the behavior of Node.js changes.

bnoordhuis commented 7 years ago

As a cautionary tale vis-a-vis this comment:

That's why its use is wrapped in catches and fallbacks.

That misses the point. See https://github.com/faye/websocket-driver-node/issues/21 for an example where using node.js internals went fatally wrong in a very obscure manner.

You don't want to unleash that kind of data corrupting bugs on your unwitting users.

jdalton commented 7 years ago

@ljharb

That their existence allows for the esm loader to exist is nice, but doesn't justify imo continuing to support that gray area going forward.

Ya, I don't think anyone wants to muck with pseudo internals. It's not my first choice for sure. If there was a less brittle way to support things like babel-register or @std/esm that would be great!

@mcollina

why do you want to change a fantastic no-dependency module like lodash to include a module that monkey-patches the module system of the Node.js runtime

I'd like to start using ESM in a Node-way for Node 4+ in Lodash.

@bnoordhuis

You don't want to unleash that kind of data corrupting bugs on your unwitting users.

I'm open to suggestions or better ways to tackle things. This issue thread is probably not the place though. For suggestions or improvements please head over to https://github.com/standard-things/esm/issues/66 or feel free to open another issue/PR there.

ChALkeR commented 7 years ago

Oh boy, apologies. I'm not gonna get into that tit for tat debate style of communication. If you dig it, no worries, it's just not for me.

@jdalton Sorry, I did not want to make that sound harsh, also, that wasn't personal. Let me rephrase.

What I wanted to say is that while that information about Meteor is relevant to the issue being discussed, other modules doing bad things in no way justify making new and existing widely-used modules doing more bad things which are more likely to get broken and break the ecosystem.

I was (and still am) under an impression that you were using that example to justify @std/esm and lodash using a lot of private APIs and monkey-patching, and in my opinion the Meteor example does not justify that even by a slightest bit.

jdalton commented 7 years ago

@ChALkeR

Sorry, I did not want to make that sound harsh, also, that wasn't personal.

It's cool! I'm OK continuing to feel things out. I understand there's some risks but if Node 4-9 is doable and we have the start of making the situation better in Node 10+ then that's a win worth pursuing IMO.

Should next steps be for those interested to head over to @std/esm to audit / review it? That would get the ball rolling on making improvements, getting off pseudo private APIs where possible, and finding patterns that may be codified for more consumer friendly, less brittle tie-ins in future Node versions.

bnoordhuis commented 7 years ago

I think it would be easier if you filed issues where you outline why and how you use internal function X. The discussion on if and how to make something public will have to happen here anyhow, no use in splitting that across two issue trackers.

jdalton commented 7 years ago

@bnoordhuis

I think it would be easier if you filed issues where you outline why and how you use internal function X.

Cool, will do! I know Node core tends to stay out of user-land packages, letting them rise or fall on their own. However, for those concerned I'd appreciate any help you can provide on the repo side. (audit, review, test cases, node rules, etc.)

TBH I'm super exhausted from this ordeal (not how I planned to spend my Sunday). For me this thread has run its course. If you want to discuss pseudo private API use in @std/esm or help remove or improve its use then go to this thread. If you want to talk about CITGM stuff I'd ask that you start a new thread as this one has substantially derailed into other territory ❤️

vkurchatkin commented 7 years ago

@jdalton just one more thing: you keep repeating "pseudo private API", as if it justifies its usage, because it's not truly private. It doesn't. In fact, there is no such thing, it's just "private API".

jdalton commented 7 years ago

@vkurchatkin Noted. Whatever you want to call it. If you're up for helping that rocks!

jdalton commented 7 years ago

I'm unsubscribing from the issue now. Please feel free to ping me in email, or repo issues where needed. Thank you!

Trott commented 7 years ago

I could be wrong about this, but I always kinda thought there was still a decent-sized faction in Node.js governance that believed we should take the "if it's exposed, it's public API" approach. I know some of the more vocal proponents of that (@isaacs, @chrisdickinson) are no longer involved in Node.js governance, but surely there are still folks on the CTC who think that should be the default, no?

It's entirely possible that even so, these particular APIs are exceptions to that approach. But I would be surprised if I was the only person on CTC who felt that "if it's exposed, it's public API" was a good general rule that properly puts the interests of end users above the interests of maintainers.

ljharb commented 7 years ago

@vkurchatkin the definition of "private" seems to have multiple interpretations; mine is that the only things that are private are the things that are inaccessible. Labeling something accessible as "private", by this definition, is precisely "pseudo-private".

refack commented 7 years ago

@Trott AFAICT the issue with @std/esm is that of changing global state (in an unsupported, i.e. untested and unmanaged) way, more than of using internal APIs per se. It could be roughly equated with a dependent module using an experimental feature without notifying the end user.

Trott commented 7 years ago

@Trott AFAICT the issue with @std/esm is that of changing global state (in an unsupported, i.e. untested and unmanaged) way, more than of using internal APIs per se. It could be roughly equated with a dependent module using an experimental feature without notifying the end user

So it's the monkey-patching and whatnot that is the crux of the issue?

nodejs / CTC

CITGM abuse #164