tc39 / proposal-async-context

Async Context for JavaScript
https://tc39.es/proposal-async-context/
Creative Commons Zero v1.0 Universal
583 stars 14 forks source link

Will it not introduce a bad habit? #49

Closed lifaon74 closed 1 year ago

lifaon74 commented 1 year ago

Are you not introducing a "bad design" solution ?

Let me explain: If I'm correct, "Async Context" is close to ZoneJs trying to create a context across functions (async or not). This design, like all the exemples you're providing, works with impure functions (some global variables are updated, and such functions have side-effects). This is usually a bad design, and especially may create "unwanted" behaviors for new developers, resulting in potentially bugged applications, with very hard bug fixing.

In my opinion, shared by many, impure functions should be avoided when another another alternative is possible.

For example:

export type LogFunction = (...args: any[]) => void;

export function run<T>(id: string, cb: (log: LogFunction) => T) {
  return cb((...args: any[]) => {
    console.log(`[${id}]`, ...args);
  });
}
document.body.addEventListener('click', () => {
  const id = new Uuid();

  run(id, (log: LogFunction) => {
    log('starting');

    // Assume helper will invoke doSomething.
    helper(doSomething, log);

    log('done');
  });
});

function doSomething(log: LogFunction) {
  log("did something");
}

For sure, this is just a little more verbose, but it is safer too !

jridgewell commented 1 year ago

First, it is not possible to use closures to fix every problem. The example I presented during the first meeting was about patching console.log() to capture the current server request's id/start-time/etc in a way that developers don't need to be aware of the patch. In your solution, every developer that wishes to log must accept this LogFunction in order to properly log on the server platform. Because most devs (and certainly all node modules) are just going to call console.log(), that requirement prevents us from achieving the goal. Imposing on the dev to use my platform's special logging API just means we're going to lose that business.

Due of the async nature of servers, and the need to handle multiple requests at in parallel, it's not possible to just patch console.log = logFn. The only pattern that can reasonably achieve this is AsyncContext. Same with the OpenTelemetry usecase, and the task scheduling, etc. All of these need to work seamlessly, because the library author isn't in control of the developer code.


Second, this "pass closures everywhere" patten causes performance and memory issues (the allocation of the closure itself is expensive, considerably more so than just invoking a module-level function). Forcing closure-style programming has negative affects on system because of this.

Going back to the console.log() example, we actually tried to close over the console object for every request. That means we had to re-evaluate the developer's entire application within a function:

function handleRequest(req, res) {
  // Developer code will reference this console binding, not the global.
  const console = patchConsole(…);

  [insert all developer code as a bundle at this point]
}

Naturally, reevaluating the developer's application for every request degraded performance sooo much that this solution wasn't acceptable.


Third, I think your definition of function purity doesn't actually apply here. While it's true that is "global state" in the most technical terms, AsyncContext operates more like a function parameter than a globally-mutated variable. Take for example an add function:

function add(a, b) {
  return a + b;
}

add(1, 1);

This is a classic example of a pure function, and its output is derived only from its inputs. But, what if we write it with AsyncContext?

const ctx = new AsyncContext();

function addWithCtx(b) {
  const a = ctx.get();
  return a + b;
}

This certainly appears to reference global state. But, the way AsyncContext is used means that the context isn't really mutated, it's invoked like a function:

ctx.run(1, () => {
  assert.equal(addWithCtx(1), 2);
});

The only way to mutate the context is to invoke .run() in your call stack. And for entire duration of the run, your program can only observe that state. That is functionally the same as if I closed over the a to begin with:

ctx.run(1, () => {
  const a = ctx.get();
  implicitClosure(a);
});

// basic transform to remove ctx use
function implicitClosure(a) {
  assert.equal(add(a, 1), 2);
}

I'd argue that functions using AsyncContext are just as pure as functions only accepting parameters, because their behavior is explicitly controlled by the call stack..

fluid passing style This is kinda complex rewrite of the prior using "[fluid passing style](https://github.com/endojs/endo/blob/506a9685b62e5694a6a47a0efa05742e0c91fa71/packages/eventual-send/src/async-contexts/7-fluid-passing-style-transform.md)" from https://github.com/endojs/endo/pull/142. I find this kinda rewrite to be extremely technical. Or, if you prefer, we can imagine the global context state being passed to every function call (this is the approach the SES folks use to justify it's safety): ```js // Before const ctx = new AsyncContext(); function addWithCtx(b) { const a = ctx.get(); return a + b; } ctx.run(1, () => { assert.equal(addWithCtx(1), 2); }); // After let *ImplicitGlobalContext* = new Map(); const ctx = new AsyncContext(); function addWithGlobalContext(b, *ImplicitGlobalContext*) { const a = implicitGlobalContext.get(ctx); return a + b; } // Code is transformed internally to passing ImplicitGlobalContext everywhere *ImplicitGlobalContext*.set(ctx, 1); (() => { assert.equal(addWithGlobalContext(1, *ImplicitGlobalContext*), 2); })(); ```
lifaon74 commented 1 year ago

First thanks for you clear and detailed answer. I see your points here and I agree in some aspects, however, I'll try to advocate for their limits:

1) console.log

Most of js libraries that logs data accepts somewhere in their constructor/init function a LogFunction or something similar. If none, this is mostly a design issue. In this case, usually, developers do a PR and the problem is solved. Asking to update ecmascript itself seems a little overkill. In my opinion, trying to patch something should be done on the code itself, not on the design of the language.

2) performances

Your assumption requires at least a real demo with a fair comparison between closure vs non closure. In my experience so far, loving functional programming, I may say that vendors have done an excellent job with their engines, and the performances are usually identical. In case of performance issues, this is mostly due to a bad designs.

3) function purity

You got a point. You're right, we may consider that the function are pure in some way. However, they still force us to have a context from a global scope (or at least a scope outside the function itself), which is usually a bad design.

ljharb commented 1 year ago

@lifaon74 sometimes the console log is done 50 layers deep - that's not simply making a PR.

Flarna commented 1 year ago

I think it's similar as for thread local available in a lot other languages. There are a lot variants to use it in a bad way but thread local isn't per se a bad thing.

jridgewell commented 1 year ago

Most of js libraries that logs data accepts somewhere in their constructor/init function a LogFunction or something similar. If none, this is mostly a design issue. In this case, usually, developers do a PR and the problem is solved.

Let's consider a debugging use case for the moment. The server setup we run isn't node, so there's no real console object, and no way to console.log() to see what a value is. In order to add a log call durning debugging, this would require me to plumb a LogFunction through the entire call graph to the one place we're logging.

And logging is just one use case here. For OpenTelemetry to work, you would need to accept a TraceSpan parameter. I don't think a random node module is going to accept my PR to add a span.

In my experience so far, loving functional programming, I may say that vendors have done an excellent job with their engines, and the performances are usually identical. In case of performance issues, this is mostly due to a bad designs.

Having worked with V8 and its engineers for years now, this is absolutely not the case. For a single function allocation that is then used throughout a program, it would be unnoticeable. But as soon as you're allocating closures that persist, and doing so in parallel with other processing, it is considerably slower.

And I have the actual data from us running this last year. We had considerably more CPU usage when using the closed over console, reduced response times, and memory pressure.

However, they still force us to have a context from a global scope (or at least a scope outside the function itself), which is usually a bad design

I think this is a personal preference, and not something that can be reasoned away. I think a log of new language features can be useful, and if overused or done so in a bad way, then they can actually harm the understanding of the program.

Used when appropriate, AsyncContext is necessary for cases that I care about. I would absolutely not recommend you rewrite an add function to use one, though, so I certainly recognize that it's not appropriate everywhere.

legendecas commented 1 year ago

This is usually a bad design, and especially may create "unwanted" behaviors for new developers, resulting in potentially bugged applications, with very hard bug fixing.

I disagree with the assumption that impure functions are bad design. Impure functions are essential parts of the language, like Math.random(), Date.now(), and platform APIs, and are necessary in real-world use cases.

However, they still force us to have a context from a global scope (or at least a scope outside the function itself), which is usually a bad design.

AsyncContext doesn't force people to rewrite their mathematical pure functions to be impure. Closures can already capture variables outside it to make side effects and AsyncContext instances have no difference from other variables in such cases.

bryanrideshark commented 1 year ago

You can write bad code with any technique - But what this proposal does, is add a capability to the runtime which is previously impossible to use.

Javascript doesn't have threads, but it does have the micro-task queue and the task queue. Many languages have thread-local data; C#, Java, and Rust are among them. https://en.wikipedia.org/wiki/Thread-local_storage#Language-specific_implementation

Javascript is the lingua franca of the web - We should not restrict the styles of programming that can be accomplished merely because we think it's a "bad habit". The existence of libraries like Zone.JS, Signals, and other massively adopted techniques indicate there is a problem developers are trying to solve - Having thread-local storage would make it possible to streamline or maybe even eliminate these arcane practices adopted by frameworks today.

I'd rather have sharper tools, rather than having to invent pretzels around them. Thread-local storage is a proven technique, and while it might not suit all programs, the programs that it is suited for would benefit tremendously from it being available.

robbiespeed commented 1 year ago

The example I presented during the first meeting was about patching console.log() to capture the current server request's id/start-time/etc in a way that developers don't need to be aware of the patch

@jridgewell while I understand the desire to be able to patch console.log in such a way since it would allow tracking any logged warnings to a specific origin. In my opinion this is a case of the cons outweighing the pros. Cons being that allowing such behaviour encourages hidden state changes unknown to the user, as well the debug issues that arise from that. Imagine for a second that some bug originates from using AsyncContext tracking down such a bug would be a nightmare especially if it's inside a 3rd party library rather than 1st party code.

It would be much better to encourage explicit context passing, and for libraries to allow loggers to be passed as an optional during function calls. Explicit context passing is already the norm in my experience with server side development. Usually by creating a RequestContext object that gets passed to any inner functions that need it, and a logger is usually attached to the context object. It makes it very clear where the logger and further context is coming from.

While it's true that this sort of hidden state is currently possible in synchronous flows, the fact that AsyncContext doesn't exist is an important barrier to further hidden state and worse ability for developers to grok the flow of information both during debugging and learning a codebase.

@bryanrideshark the JS event loop is not a thread, so it doesn't follow that they should have all the features of threads. Workers would be closer to true threads, and since they have their own global scope, by default they have the equivalent functionality of thread local storage. You mentioned C#, and Rust, both have async tasks and as far as I'm aware no equivalent to AsyncContext.

Flarna commented 1 year ago

You mentioned C#, and Rust, both have async tasks and as far as I'm aware no equivalent to AsyncContext

At least .NET has AsyncLocal. This has even more features like the proposal here (e.g. set a callback to get notified when state changes, set a value without scoping in a function).

robbiespeed commented 1 year ago

I stand corrected, tokio (popular Rust async runtime) also has an equivalent https://docs.rs/tokio/latest/tokio/task/struct.LocalKey.html

legendecas commented 1 year ago

Closing as the question has been well answered.