cloudflare / workers-sdk

⛅️ Home to Wrangler, the CLI for Cloudflare Workers®
https://developers.cloudflare.com/workers/
Apache License 2.0

🚀 Feature Request: add test helper to programmatically hibernate a Durable Object #5423

Open nvie opened 5 months ago

nvie commented 5 months ago

Describe the solution

It'd be great to have a test helper that forces a Durable Object to hibernate in unit tests, so that the next time a fetch() or WebSocket message is received, it re-runs the constructor and restores the WebSockets.

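// Hypothetical new API: hibernateDurableObject does not exist in "cloudflare:test" today
import { SELF, runInDurableObject, hibernateDurableObject } from "cloudflare:test";  // ✨🙏✨
import { expect, test } from "vitest";

test("waking up from hibernation", async () => {
  await SELF.fetch("http://example.com/ws");

  // `stub` is the DurableObjectStub for the object under test (setup omitted)
  await hibernateDurableObject(stub);

  // The second callback argument is the DurableObjectState, which owns getWebSockets()
  const numSockets = await runInDurableObject(stub, (instance, state) => state.getWebSockets().length);
  expect(numSockets).toBe(1);
});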

(Currently, the only option seems to be await sleep(10_000) to trigger hibernation in unit tests?)
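
For reference, the wall-clock workaround mentioned above might look roughly like the sketch below. The thread never defines `sleep`, so it is assumed here to be a plain `setTimeout` wrapper. `HIBERNATION_IDLE_MS`, `waitForHibernation`, and the extra margin are hypothetical names chosen for illustration; only the 10-second figure comes from this discussion.

```typescript
// Resolve after `ms` milliseconds of real wall-clock time.
function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Idle window after which hibernation is expected (10 seconds, per this thread).
const HIBERNATION_IDLE_MS = 10_000;

// Hypothetical helper: wait out the idle window plus a small safety margin.
// Every test that calls this pays the full delay in real time.
async function waitForHibernation(marginMs: number = 500): Promise<void> {
  await sleep(HIBERNATION_IDLE_MS + marginMs);
}
```

In a test this would sit where `hibernateDurableObject(stub)` appears in the proposal, which is why each hibernation test costs at least ten seconds.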

Context: https://discord.com/channels/595317990191398933/1218150105777963101/1222851857601531997

mrbbot commented 5 months ago

For posterity, here is a reproduction of the segfault when hibernating with a 10-second wait: https://github.com/nvie/ws-promise-bug-repro/blob/hibernate/durable-objects/test/illustrate-bug.test.ts#L22-L24.

nvie commented 5 months ago

cc @MellowYarker

This feature request is closely related to the segfault issue (see https://github.com/cloudflare/workerd/issues/1422#issuecomment-2048624355). Fixing the segfault would make testing hibernation possible in the first place.

But the proposed programmatic hibernateDurableObject() API would take it a step further: it would move testing hibernation-related code from theoretically possible to practically feasible. Once that API exists, we no longer have to rely on actual wall-clock time passing in our test suite, which makes it unnecessarily slow. The API that @mrbbot proposed in https://github.com/cloudflare/workerd/issues/904#issuecomment-1709234569 would be really useful to us. (And while at it, maybe it would also be possible to have an explicit way to trigger eviction, e.g. an evictDurableObject() API?)

To us, this is one of our biggest pain points at the moment, and finding a robust and reliable solution for this would be huge. Happy to discuss details if you have any questions! 🙏

MellowYarker commented 5 months ago

@nvie I think it would be a lot easier to make the timeout configurable than to provide a new API for forcing hibernation. You could just set the timeout to 100 ms (with the default being 10 seconds, as in production).

The current hibernation process depends on there being no active requests, and having a request that triggers hibernation would break that model. The workerd implementation tries to mimic what happens in production, and introducing an API that breaks this fundamental assumption would likely make the two implementations diverge and introduce discrepancies or other bugs.

Would you be alright with having the time limit be configurable?
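
To make the trade-off above concrete, here is a toy model of idle-based hibernation with a configurable timeout. It is a sketch under stated assumptions, not workerd's implementation: `IdleHibernationModel`, its methods, and the injected clock are all hypothetical, and the only facts taken from the thread are the 10-second default and the rule that hibernation requires no active requests.

```typescript
// Toy model only: illustrates "hibernate after N ms with no active requests".
class IdleHibernationModel {
  private readonly now: () => number;
  private readonly idleTimeoutMs: number;
  private activeRequests = 0;
  private lastActivityMs: number;
  private hibernated = false;

  // `now` is an injected clock so tests can advance time without waiting.
  // The default timeout mirrors the 10 seconds mentioned in this thread.
  constructor(now: () => number, idleTimeoutMs: number = 10_000) {
    this.now = now;
    this.idleTimeoutMs = idleTimeoutMs;
    this.lastActivityMs = now();
  }

  // An incoming request wakes the object and counts as in-flight work.
  beginRequest(): void {
    this.hibernated = false;
    this.activeRequests++;
  }

  // Finishing a request restarts the idle window.
  endRequest(): void {
    if (this.activeRequests === 0) throw new Error("no active request to end");
    this.activeRequests--;
    this.lastActivityMs = this.now();
  }

  // Periodic check: hibernate only when nothing is in flight and the
  // idle window has fully elapsed. Returns the current hibernation state.
  tick(): boolean {
    const idleFor = this.now() - this.lastActivityMs;
    if (this.activeRequests === 0 && idleFor >= this.idleTimeoutMs) {
      this.hibernated = true;
    }
    return this.hibernated;
  }
}

// Usage with a fake clock and a short 100 ms timeout.
let fakeNow = 0;
const model = new IdleHibernationModel(() => fakeNow, 100);
model.beginRequest();
model.endRequest();
fakeNow += 50;
console.log(model.tick()); // false: only 50 ms idle
fakeNow += 50;
console.log(model.tick()); // true: 100 ms idle, nothing in flight
```

In this model a "hibernate now" request would itself count as in-flight work, which is the conflict described above; shortening the timeout leaves the idle rule intact.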

nvie commented 5 months ago

> Would you be alright with having the time limit be configurable?

Yeah, I can see how that will be easier indeed. Do you think it's possible to make it configurable on a per-test basis somehow? We have tests where we don't want to deal with implicit hibernation and some tests where we deliberately want to trigger/test it. Ideally we can control it and not make the behavior under test dependent on timing too much. Finding the exact right timing will be tricky if it's just one global setting.

How would you recommend going about that?

MellowYarker commented 5 months ago

> Do you think it's possible to make it configurable on a per-test basis somehow?

I definitely see the appeal, though I suspect this isn't feasible right now. @petebacondarwin sorry to bug you, but I see you reviewed some of Brendan's PRs related to local dev testing. Do you know if we can configure specific workers/bindings (or even workerd) for individual tests? If not, do you know who might know?

Part of me thinks it might be possible to set it per DO namespace, though I don't know how just yet.