cloudflare / workerd

The JavaScript / Wasm runtime that powers Cloudflare Workers
https://blog.cloudflare.com/workerd-open-source-workers-runtime/
Apache License 2.0
6.22k stars 301 forks source link

🐛 BUG: Global scope is reused for all worker code, different from cloud environment #1256

Open aboodman opened 1 year ago

aboodman commented 1 year ago

Which Cloudflare product(s) does this pertain to?

Wrangler core

What version(s) of the tool(s) are you using?

3.6.0

What version of Node are you using?

18.3.0

What operating system are you using?

Mac

Describe the Bug

In the production cloudflare environment the execution environment (global scope) for code is sometimes reused, but only across durable objects of the same class.

So for instance if your wrangler.toml declares two Durable Objects with classes Chess and Checkers, instances of Checkers might sometimes be instantiated in a js context that was previously used by a previous instance of Checkers, but they will never be instantiated in the js context previously used by an instance of Chess.

Further, I'm pretty sure that in the production environment two instances of a DO can never be "running" at the same time inside an execution context (by running here I mean receiving messages). But I have an open question about that here:

https://discord.com/channels/595317990191398933/773219443911819284/1153898195886293122

In Wrangler, however, all code is run in the same js context (global scope) concurrently.

Please provide a link to a minimal reproduction

https://github.com/rocicorp/cf-do-js-exec-context

Please provide any relevant error logs

No response

admah commented 1 year ago

Just to confirm, this is when you're running wrangler dev? Do you also see this behavior when running wrangler dev --remote?

aboodman commented 1 year ago

Yes, the report was from wrangler dev. I just checked wrangler dev --remote and it has a third, different behavior.

Here is the output from the production cf environment:

Worker js exec context id: nlxfszyh
DOClass1 foo response: DO id: 9xuuqefw, js context id: fmz67dwn
DOClass2 bar response: DO id: ah16ukh8, js context id: lxw7futo
DOClass1 foo response: DO id: 9xuuqefw, js context id: fmz67dwn
DOClass2 bar response: DO id: ah16ukh8, js context id: lxw7futo

Note that the worker, DOClass1, and DOClass2 each have their own js context.

Here is the output from wrangler dev:

Worker js exec context id: qff63q1g
DOClass1 foo response: DO id: pxmht8t8, js context id: qff63q1g
DOClass2 bar response: DO id: e23urbkq, js context id: qff63q1g
DOClass1 foo response: DO id: pxmht8t8, js context id: qff63q1g
DOClass2 bar response: DO id: e23urbkq, js context id: qff63q1g

The worker, DOClass1, and DOClass2 all share the same context.

And here is the output from wrangler dev --remote:

Worker js exec context id: e3d1rg6q
DOClass1 foo response: DO id: lgwax42, js context id: nrf9h6kh
DOClass2 bar response: DO id: 6zti8luc, js context id: nrf9h6kh
DOClass1 foo response: DO id: lgwax42, js context id: nrf9h6kh
DOClass2 bar response: DO id: 6zti8luc, js context id: nrf9h6kh

The worker has one JS Context, and the two DOs share a JS Context.

aboodman commented 1 year ago

You can generate this output yourself running the repo attached to this bug.

kentonv commented 1 year ago

In the production cloudflare environment the execution environment (global scope) for code is sometimes reused, but only across durable objects of the same class.

So for instance if your wrangler.toml declares two Durable Objects with classes Chess and Checkers, instances of Checkers might sometimes be instantiated in a js context that was previously used by a previous instance of Checkers, but they will never be instantiated in the js context previously used by an instance of Chess.

Hmm, are you sure about that? I would have expected that anything running the same Worker script could potentially run in the same isolate -- not only multiple DO classes but also stateless requests.

Further, I'm pretty sure that in the production environment two instances of a DO can never be "running" at the same time inside an execution context (by running here I mean receiving messages).

In the case that two objects are colocated in the same isolate I'd expect them to run concurrently.

In Wrangler, however, all code is run in the same js context (global scope) concurrently.

This is as designed. Since all Durable Objects land on the same runtime instance (running locally), they are expected to land in the same isolate and JS context. In general, we try to reuse isolates whenever the same worker is running on the same machine. wrangler dev --remote also runs all code for the preview session on a single machine, but I think it may end up creating separate isolates for stateless requests vs. DO requests by accident.

We could perhaps add a setting in workerd which causes it to intentionally avoid reusing isolates in order to simulate running in production. In fact, maybe it should actually randomly choose whether to reuse an isolate or not, since production will sometimes colocate objects into isolates.

aboodman commented 1 year ago

Hmm, are you sure about that? I would have expected that anything running the same Worker script could potentially run in the same isolate -- not only multiple DO classes but also stateless requests.

This statement came from the following discord message:

https://discord.com/channels/595317990191398933/773219443911819284/1152247116056502302

CleanShot 2023-10-02 at 10 26 32@2x

Note, I specifically care about the global scope (ie v8::Context) not isolates. For the purposes of this bug, I do not care if v8::Isolates are widely reused -- I only care if the global scope is.

In the case that two objects are colocated in the same isolate I'd expect them to run concurrently.

Right, but will they run in the same context concurrently? We have not observed this ever happen with our production code.

kentonv commented 1 year ago

@elithrar Are you sure about the statement screenshotted above?

Note, I specifically care about the global scope (ie v8::Context) not isolates.

Understood, but these are 1:1 in our runtime (at least for all purposes being discussed here).