tc39 / proposal-function-implementation-hiding

JavaScript language proposal: function implementation hiding
https://ci.tc39.es/preview/tc39/ecma262/pull/1739
MIT License
98 stars 7 forks source link

Opinion: this will cause lot of "mystery bugs" in Sentry-style applications #50

Open Jabher opened 2 years ago

Jabher commented 2 years ago

Hello. I've checked all issues but did not find one - please correct me if I'm wrong.

I assume this is expected to be a common pragma in library apps: lodash, react, angular, etc. However, we are all not perfect, and sometimes edge-cases that are breaking things are happening - both because of library bugs and browser bugs, even browser plugin bugs. This actually happens quite often, especially speaking not about top-tier libraries and frameworks (and even they too have bugs sometimes).

Moreover, if speaking about my app, I can usually cut them out (but not always - I have to embed e.g. analytics as 3rd party and unable to embed and control this code), but speaking about browser extensions I can definitely expect zero control over there, and it becomes virtually impossible to understand why it's happening.

I've found most of ones in Sentry at first, because - of course - I'm unable to test my web apps in every browser and with every plugin combination in a world.

By introducing this directive the ability to introspect stack traces will be fully broken for that kind of cases when using error stack report tools like Sentry.

Scenario of making Function.prototype.toString return nothing is totally understandable for me, however, I think that hiding the stack traces in errors will produce lot of bugs that are not solvable by normal tooling instruments, and will cause developers to spend way more time on debugging, which will negatively impact the developers speed all over the globe.

Jack-Works commented 2 years ago

You can use your bundler plugin to remove "hide implementation" in the source code.

Jabher commented 2 years ago

I specifically mentioned quite common several cases here that are not covered by your answer:

Most of them, from my point of view, would be happy to provide as little details amount about them as possible, despising the fact they can have bugs too.

ljharb commented 2 years ago

These libs can choose not to use this pragma, and, you can run them through your build process.

If you’re using third party code that uses the pragma, and it’s causing problems for you, then don’t use that code - the existence of a feature isn’t the problem, it’s the way the author is using it. We don’t remove loops just because you can write an infinite one.

Jabher commented 2 years ago

I think we all know that decisions like "let's use Google Analytics" are done not on developer's level, and switching to another framework because there could be non-debuggable bugs will probably be rejected by management in most of cases. What is even worse, there are chances that whole markets of solutions will be using this approach, e.g. all the analytics solutions would like to do this.

Moreover, this can happen in any moment of time, and developer can't basically control this process.

And third part here is the most terrible one: if user is using 3rd-party extensions (which most of users do), this proposal will leave zero chances to understand that he is getting his app crashing because some user is using e.g. LastPass messing with his code with higher priority, because whole plugin stack trace will be wiped from error stack, and this problems will become literally unexplainable.

ljharb commented 2 years ago

I also think we all know that companies like Google aren't going to make their SDKs undebuggable if it's causing problems for developers.

Jabher commented 2 years ago

I also think we all know that companies like Google aren't going to make their SDKs undebuggable if it's causing problems for developers.

Why are you making this assumption? What is it based upon? And how thing "if it's causing problems it is not going to be done" thing should work? Looks like they have to look in a future to not do this. Moreover, this suggestion is actually meaning "this can cause problems so we should probably not use it", which, extrapolating, can be stated as "no one should use this directive, if it is not intended by the application developer", which is opposite to original approach of this proposal.

Moreover, I have 180deg different point of view about "companies will not do this" about based on my 10+ years in various IT companies, including some of companies who were providing 3rd party code and were struggling to hide, obfuscate and prevent it from being stolen - can you please share the links or talks you are basing your point of view, please?

I do not want to brag and compare our experience, so please, if you have something that proofs your point of view please share. I will try to do same for my point of view, otherwise it will not be productive.

ljharb commented 2 years ago

I’m not saying they’ll anticipate the problems. I’m saying even if they ship code using this pragma, and if it causes problems to do so, they will reverse it when those problems are made known. That’s a lot of “if”s, of course - the entire premise of this issue is “looking into the future” - but if you’re using someone else’s code, you have already chosen to either be in thrall to their choices, or, to modify their source code as needed when a conflict arises.

bathos commented 2 years ago

@Jabher It’s already possible to do this today generically (i.e. without engine-specific tools like prepareStackTrace). You use generators and yield * expressions to “unwind” the call stack at every juncture where “user code” could be invoked. The generators “atomize” operations, yielding representations of the 13 “instruction codes” of JS (i.e., invocations of the object internal methods). These atomic instructions are iterated and executed* at the entrypoint. It works because the execution contexts of the internal functions yielding these instructions are truly not on the execution context stack at points where user code may be invoked even though they get pushed back onto it when resumed via next(result)/throw(err), but it’s very inefficient and impacts how such internals are written significantly. This proposal’s stack-redaction directive would solve an existing inefficiency/complexity problem, but it wouldn’t be introducing a new capability.

As @Jack-Works pointed out it would also make it possible for consumers of such code to choose not to redact by applying reliable code transformations. Creating an “include all code in stacks” transformation targeting code written using the generator-unwinding model today could not be genuinely reliable and could not be generic; it would need to be written for specific target code.

* In a Window env, plucking OIM-calling reflection methods from %Reflect% of a nested browsing context which is immediately discarded also keeps their frames out of JS-observable stack traces in some agents. This leaves JS-authored Web IDL/polyfill/virtualization code precisely aligned with its non-JS-implemented companions with regard to stack observation; no difference is left at all where this last piece can be clicked into place.

Jabher commented 2 years ago

That is exactly what I'm talking about. Developers are, often, optimists. I totally understand this beautiful idea "we can make polyfill that acts like original code, even in stack traces". However... polyfill.io has 800+ closed issues. Core-js has same-scale numbers. I understand that not all of them are bugs, some are improvements or proposals, however, it is definitely not normal to expect that there will be no bugs. And with no idea how to debug - this will be catastrophic. Just imagine that there will be a bug in polyfill on specific OS/browser that nobody is able to see even in stacktraces? And if this proposal will be accepted, this definitely will happen, and not even once, but multiple times - just by law of large numbers, and nobody will even have a clue what is happening, if it is provided code (like Google Analytics).

Moreover, speaking of async-to-generator approach - so, you are basically saying that some companies are using regenerator-ish approach not as a way to support older browsers, but as a way to obfuscate stack traces? Sounds a little bit crazy, but I think that some companies can really act in that manner.

Also, generators approach allow to at least see a generator function and have slightest clue what was happening (and place breakpoints). This approach cuts all chances.

bathos commented 2 years ago

@Jabher I tend to agree that a directive isn’t the best way to solve this (at least outside of a niche within an already-niche set of use cases).

you are basically saying that some companies are using regenerator-ish approach not as a way to support older browsers, but as a way to obfuscate stack traces?

No — it’s unrelated to regenerator or obfuscation, and one of the benefits of the generator model is that it actually does allow internal errors to be exposed with full traces: if an unhandled exception is thrown anywhere “deeper” than the top level instruction-executor layer, that’s an internal implementation error definitionally and so you don't want to “redact” it (nor do you need to; it’s already exposing the right internal info “for free”). At least that’s the case for use cases like mine; some people need these tools because of strict restrictions on information side channels and for them that error metadata really must remain unexposed (to runtime JS, that is, not to higher-level debugging tools; debugger stepping is always an option).

You can achieve that expose-traces-for-internal-errors characteristic with other models, too, but I’m not sure how one would achieve it with the directive. I’ll likely be leveraging the source redaction directive and not the stack redaction directive for this reason, especially now that ShadowRealms appear to be addressing this very effectively (same effect as the one previously described, but without the need for “yield *” atomized instructions). Even so, I think you may be picturing a different set of use cases than those that (I think) this proposal seeks to address. In particular my use of the word “polyfill” may have been misleading as typical polyfills wouldn’t benefit from either directive as far as I can tell.

chen-ye commented 2 years ago

What's the purpose of this pragma if you can just transpile it out?

If the purpose of it is to guard against naughty developers/libraries doing something that clearly seems to be bad practice, I fail to see how something that can be easily transpiled out will stop them.

And if it isn't easy to transpile it out, then it doesn't seem like that's a valid answer for making this compatible with error telemetry services (which, imo, is a perfectly legitimate use-case for introspecting error stacks).

I think it makes sense to make it harder to write bad-behaving code, but not when the proposed solution makes it just as hard to do something legitimate. At best, this will just serve as one more barrier to transpilation/preprocessing/bundling-free development.

ljharb commented 2 years ago

You can transpile out any source code or expose things that are hidden - it all still has a purpose.

Jabher commented 2 years ago

You can transpile out any source code or expose things that are hidden - it all still has a purpose.

I provided multiple examples of code you cannot transpile - from browser plugins to 3rd party libs. What is even worse, browser plugins are loading scripts in separate context and you are physically not able to load your own transpiled version into this context (this is a security measure - e.g. look at Metamask). By blocking the ability to read the stack traces you are breaking the chance to debug interactions between plugins and libs

ljharb commented 2 years ago

And just like those things can already obscure stack traces or hide variables in a closure, if they hide this info from you then you’re not meant to have it.

bathos commented 2 years ago

@Jabher when you say “browser plugins,” are you referring to something other than extensions like in Chrome and Firefox? Their “content script” injection model uses “isolated worlds” (same-agent realms that don’t belong to the same js-reachable object graph). If you’re currently able to observe exceptions from those realms from the ordinary browsing context at all (much less their stacks), that’s a bug in the browser.

Some extensions inject a second time from their content scripts through DOM sinks that permit evaluating code in the corresponding non-isolated-world realm as though it were trusted by its origin*. In these cases, you could currently observe errors/stacks from the secondary injections and yes, such extensions make those errors more opaque with this, but they’re generally pretty opaque already given they’re runtime-eval snippets for proxying stuff, not the content scripts themselves. The most interesting bits are already not reachable unless the extension went out of its way to share them on purpose, so I don’t think much could be lost here.

* Almost always, it decidedly isn’t! This is what lead to falsifying, stripping, and/or inadvertantly adding random new privileges with bad regexp patterns (kaspersky) to the security policies of arbitrary sites becoming an ecosystem norm. The official fiction is that it’s the user doing the trusting, but I really don’t think any language design decision should be based on concessions to a practice that we should be working towards eliminating for the sake of the web’s security model. Safari’s healthier extension ecosystem has shown these things aren’t necessary.

Jabher commented 2 years ago

yes, I was mostly speaking about second case (as this is the most frequent integration case).

Do you understand that this thesis sounds like "OK, you are in a dark room anyway, so let's throw away the last candle"?

Web is already complicated and confusing, growing up as a developer is way harder than 5 or 10 years ago, it is way more confusing in complex manner (not "why 1+1 is 11", but "why I cannot access credentials when I pass origin * to CORS", or "why frame rate drops when I try to animate DOM element with shadows, which leads to lower google score and worsen SEO", or "how to implement PWA when iOS is removing cookies from it every week", or "how to debug 3rd party services which are hijacked by country-level censorship and/or provider placing ad spam instead of content I've provided") and this proposal is making things worse.

Please, lit the fires, not shut them down. Web world is already hard and obscure in many means, including browser wars - on OS level now, not like "in goodies-oldies 90s" when everybody just had proprietary features, including corporates fighting for ROI and MAU, and personal data of each of us, governments trying to prevent their citizens to access the truth.

I'm sure we all here love web, but this proposal can be one of steps that will convert free web application users - like GitHub we're using now - into fully-company-controlled applications, because this can make web development harder. We all here do not know the impact, but we know there will be impact.

10% web development costs raise will cause some companies to build on iOS or Android, not web. Please, rethink this proposal. We need web to be clear, understandable, readable, supportable, not mysterious and obscure.

ljharb commented 2 years ago

This proposal has no bearing on that - people already minify and obfuscate their code. Hiding runtime toString inspection doesn’t make anything worse.

Jabher commented 2 years ago

ummmm... I'm not saying about toString, I'm saying about stack removal here.

ljharb commented 2 years ago

The same is true. Anybody can already .bind or try/catch their functions and obscure the stack trace. You can’t rely on getting that info from someone who doesn’t want to give it to you.

Jabher commented 2 years ago

you are referencing to yourself, so I will reference to myself too:

this thesis sounds like "OK, you are in a dark room anyway, so let's throw away the last candle"

pcjmfranken commented 1 year ago

If you’re using third party code that uses the pragma, and it’s causing problems for you, then don’t use that code - the existence of a feature isn’t the problem, it’s the way the author is using it. We don’t remove loops just because you can write an infinite one.

If developers rely on implementation details or stack traces that they don't control and that's causing problems for them, then they should rethink or remove this reliance.

The dependency having changed up some undocumented internals isn't the problem, it's that some cowboy decided to rely on an internal/private API and failed to properly mitigate the obvious risks.

taylorhakes commented 1 year ago

I would like to echo the concerns about this proposal in this thread.

The goal of this proposal seems to be to protect people from relying in implementation details, but this doesn’t accomplish that goal. The source code is still visible from the original file and declarations can be compiled out. This proposal just adds to confusion when debugging code. Errors in library code are super important to debugging issues. I have done it a lot personally and at work.

There has been this thought that “don’t use the libraries that use the declaration” if you don’t like it. That is way easier said than done. There aren’t always viable alternatives. Why give them a power that is bad.

This proposal gives a lot of power to library authors that I feel strongly makes the web worse. It makes debugging harder, adds extra steps to build processes and doesn’t accomplish the goal of hiding details.

bathos commented 1 year ago

The source code is still visible from the original file

@taylorhakes that’s not a language-level runtime introspection capability. it’s irrelevant to the goals of this proposal.

taylorhakes commented 1 year ago

That is one part of my issue with the proposal. Although I agree with this statement.

that’s not a language-level runtime introspection capability

I don’t agree that it’s irrelevant though. It’s part of the larger picture of relying on implementation details. This proposal only protects against a specific way of relying on implementation details.

ljharb commented 1 year ago

Yes, that specific way is the one that needs protection.

taylorhakes commented 1 year ago

It doesn’t really accomplish the goal though. What is stopping angular or other libraries from using a compile time inspection? This is trying to stop something by only stopping a small part of it, but also has significant downsides. If this proposal didn’t have an effect on Error.stack and most debugging on the web, I would be all for the goal of protecting implementation details.

ljharb commented 1 year ago

Nothing. It's impossible to prevent compile-time stuff, and thus that's not an important goal.

The downsides are only about visibility into code you don't control, and if that bothers you, you can use compile-time tools to remove the directive.

taylorhakes commented 1 year ago

If people just move their implementation relying code to do the same at compile time, this proposal is effectively useless. It does nothing to fix the issue.

The idea that people will use this proposal in some kind of goldilocks way is vey unlikely IMO. It will either be used way too much and everyone will all be adding compile time steps to remove these statements or people will not use it at all.

bathos commented 1 year ago

Perhaps the word “reliance” has thrown people off because some folks read this as “author reliance”?

The proposal concerns only runtime information channels within ECMAScript code, not information available to authors and publishers of ECMAScript code. These are very different concepts, and the proposal not attempting to do something about the latter is not failure to meet its goal because that just ... is not its goal.

If people just move their implementation relying code to do the same at compile time, this proposal is effectively useless. It does nothing to fix the issue.

It ... does, actually (it makes the “owner of the problem” the “problem maker”). But note that this is not the sole use case for the proposal, either. Other examples include code that you control needing to close information side channels available to code you do not control in a shared environment and faithful implementation of host constructs (Web IDL interfaces, etc) using ES code.

taylorhakes commented 1 year ago

It ... does, actually (it makes the “owner of the problem” the “problem maker”)

I don’t agree it does. If I use angular and it tells me to add a build step to make it work, you could I guess blame me as the user. But you could already do that for me using angular with the current runtime code in the first place.

Other examples include code that you control needing to close information side channels available to code you do not control in a shared environment and faithful implementation of host constructs (Web IDL interfaces, etc) using ES code.

I am not suggesting there isn’t a benefit to this feature. The above being one of them. I am suggesting the goal of getting people to not rely on implementation details is not going to work.

Even if it did accomplish all goals suggested, the downside of losing stack traces outweighs all benefits. Yes, people can modify the code as a compile time step, but most people won’t and we will be left with vague stack traces.

ljharb commented 1 year ago

The goal isn't to make it impossible, since as you've pointed out, that's impossible. It's to make the author's intentions clear enough, and doing it hard enough, that anyone trying it is automatically wrong, and it's an automatic demerit against a project violating those intentions. That is good enough for me.

taylorhakes commented 1 year ago

The goal isn't to make it impossible, since as you've pointed out, that's impossible. It's to make the author's intentions clear enough, and doing it hard enough, that anyone trying it is automatically wrong, and it's an automatic demerit against a project violating those intentions. That is good enough for me.

Sure. I feel we are listing all the positives and ignoring the downsides though. If we assume all third party authors have good intentions, this proposal sounds great. Based on my experience, some authors will use this correctly and a lot will just use this proposal liberally under the guise of protecting implementation details. Unfortunately this proposal has no ability to protect against bad usages. I predict this will just add another compile time step for everyone to remove all these statements because some subset use it badly.

CherryDT commented 12 months ago

I want to point out that the ability to do deep introspection as well as creative monkey-patching (which was already impacted by things like symbols and private class fields) is, at least for some people like me, very important because not everyone will be able to work on issues with a similar urgency as oneself. Once you identified a third-party issue, you can then at least implement a workaround in your own code without depending on the third party to fix the problem, and in parallel report the problem to them (hoping for a proper fix eventually, at which time you can remove your workaround).

I had the case many, many times in the past that a problem during development occured due to some third-party code and there was no way to get it resolved with involvment of that third party, at least not as quickly as it was required (within a few hours), and additionally it was already hard to identify the exact cause and it was only possible by using breadcrumbs like the stack trace and Function.prototype.toString() leading to the problematic piece code and allowing me to inject some log statements to figure out what's going wrong.

It's worrying for me that the abilities as a developer to have full control and insight over/into the code that's running in your project and the agency to solve problems happening in your project yourself quickly are getting cut down further and further.

ljharb commented 12 months ago

@CherryDT that ability is already not guaranteed in the language, since closures exist. If you want to change someone else's code, or access something they don't expose, you have to do it at build time, not at runtime. For example, that's what forks are for.

bathos commented 12 months ago

I’d also emphasize that this has nothing to do with introspection through external tooling. Almost all of the most useful introspection capabilities devtools provide for debugging already aren’t available to runtime code. Whether code makes use of the proposed directives or not has no bearing on that layer.