cloudflare / workerd

The JavaScript / Wasm runtime that powers Cloudflare Workers
https://blog.cloudflare.com/workerd-open-source-workers-runtime/
Apache License 2.0
6.15k stars 291 forks source link

🐛 Workers sharing the same modules registry can result in unexpected behaviors #2364

Open dario-piotrowicz opened 3 months ago

dario-piotrowicz commented 3 months ago

As discussed with @jasnell each isolate gets its own global module registry that is shared and reused for the lifetime of the isolate.

Meaning that different worker runs can share the same module registry, in case modules include their own state this means that such state can be shared across different worker runs and this can result in incorrect/unexpected behaviors and even data leakage.

I've seen this creating issues in the wild (https://github.com/cloudflare/next-on-pages/issues/805) and I can imagine this potentially being problematic with any library that relies on top level modules state (but I am not sure how common that is).

Although unrealistic here's an example minimal reproduction of the issue: https://github.com/dario-piotrowicz/workerd-modules-sharing-issue-reproduction/

dario-piotrowicz commented 3 months ago

cc. @IgorMinar @irvinebroque

kentonv commented 3 months ago

Why is this unexpected? Isn't this how it works on all runtimes? A module is only instantiated the first time it is imported, and then reused through the lifetime of the isolate. In Node.js, the one copy of the module would be used for the entire process lifetime.

If you store per-session state in a global you are going to have trouble in any runtime, no?

irvinebroque commented 3 months ago

I think this is an issue of people used to building for web browsers, then trying to build things on the backend.

Ex: consider the example code that Dario shared — if you were to store per-session state in a global, client-side, in a browser, that'd be fine. You're doing so only in the context of a particular user.

Then you develop that same way on the server, without thinking about it, almost just by muscle memory. And you've heard that serverless functions are "stateless", and so you maybe take a mental model that any global or local state you define gets blown away on each invocation, akin to reloading a webpage.

That's where people end up getting confused. (not saying anything about what we should or shouldn't do here, just laying out how I think someone can end up getting mixed up)

kentonv commented 2 months ago

Closing this because it's working as designed. Creating a new isolate for every request would, at present, be way too expensive to be feasible.

(That said, we do actually have some designs on changing that, but it requires deep V8 changes. "Lightweight isolate cloning" is what we call it. Still just an idea at this point.)

jasnell commented 2 months ago

Reopening this as it's still something I'm actively investigating to see if there's anything reasonable we can do here.

dario-piotrowicz commented 2 months ago

PS: sorry for the late reply, I somehow missed the responses 🙇

@kentonv I totally agree with you that this behavior is normal and actually expected. My argument here is that in our platform it can result in unexpected behaviors and, depending on the modules users use this is something that can very likely be out of their control.

I just opened the issue to see if we could improve this as it can be really problematic for developers (who again, might not have viable workarounds for it). I am not sure if something can/should be done here, but I feel like it was (hopefully) worth raising the concern 🙂

PS: I also totally agree with @irvinebroque's comment, but again, my worry is less regarding devs actually doing this but they inheriting unexpected behaviors because of this from libraries.

kentonv commented 2 months ago

Does someone have some kind of proposal for what "we can do here"? We've spent a lot of time discussing this over the years and the only proposal that has ever seemed feasible is lightweight isolate cloning.

jasnell commented 2 months ago

... have some kind of proposal for what "we can do here"? ...

Working on formulating one. Whether I'll be successful in doing so remains to be seen. My current energy is focused on nodejs-compat work but I'll be returning the module registry related stuff shortly and will be actively working on this. In the meantime there's no harm in keeping the issue open and when I've got something more concrete I'll update.

IgorMinar commented 2 months ago

I agree with Kenton here.

Libs should not use module scope or global scope for request/session specific state.

No popular lib for node or other runtimes relies on it as far as I know. But bugs can occur.

It would be awesome if we could provide a per-request state isolation to guard against these bugs, but that is not easy to implement or cheap to operate, so we should consider that as a nice to have improvement and not a hard requirement.

In the meantime when we identify a buggy lib, we should work with its maintainers to fix the bug.

On Thu, Jul 11, 2024 at 3:36 PM James M Snell @.***> wrote:

... have some kind of proposal for what "we can do here"? ...

Working on formulating one. Whether I'll be successful in doing so remains to be seen. My current energy is focused on nodejs-compat work but I'll be returning the module registry related stuff shortly and will be actively working on this. In the meantime there's no harm in keeping the issue open and when I've got something more concrete I'll update.

— Reply to this email directly, view it on GitHub https://github.com/cloudflare/workerd/issues/2364#issuecomment-2222962826, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABUZ2FXHYI7KWPMOXST3YLZL2C7VAVCNFSM6AAAAABKQNBM52VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEMRSHE3DEOBSGY . You are receiving this because you were mentioned.Message ID: @.***>