Closed neelance closed 7 months ago
I did not know that you are exposing AsyncLocalStorage
via @temporalio/workflow
. This is perfect and the proper way to implement it. Thanks!
@neelance I'm curious how reuseV8Context
broke isolation for you.
Could you please provide more details?
Here is the concept:
const someList: string[] = [];
export async function testWorkflow() {
someList.push(workflowInfo().workflowId);
sleep(10_000);
await activities.printList(someList);
}
I think you should be able to observe entries from other workflows when you invoke it multiple times.
I can't reproduce this:
https://github.com/temporalio/sdk-typescript/pull/1356/files
Strange. I just reproduced it like this:
let value = 0;
export async function example(): Promise<number> {
value++;
return value;
}
Every time I run this workflow, I get the next number until I restart the worker. When I disable reuseV8Context
I always get 1
. I wonder what's the difference between our setups...
How are you creating your bundle (ignore if you don't know what I mean) and worker?
Yes, that's exactly what I was just looking at and I think I found it: We're building the workflow bundle on our own and you seem to manipulate the __webpack_module_cache__
, which does not seem to work with our bundle. We're still using Webpack, so I guess there should be some way to make it work...
code = code.replace(
'var __webpack_module_cache__ = {}',
'var __webpack_module_cache__ = globalThis.__webpack_module_cache__'
);
🙈
I see, well we definitely don't support that. You're on your own there 😬
Problem is that using your bundler comes with inconsistencies to how we bundle the rest of our codebase...
And well, you're not really discouraging it either: https://docs.temporal.io/dev-guide/typescript/foundations
bundleWorkflowCode
is a utility that we maintain and support.
It's used internally by the worker if you give it a workflowsPath
option.
What are the inconsistencies I wonder. Is using the webpack hook we provide not enough?
I'll have the docs fixed, don't know who put this advice there.
Is using the webpack hook we provide not enough?
I just tried bundleWorkflowCode
and got it to work. We simplified our Webpack setup in the last months, so that helped. Honestly, I'm still not that happy about having a second bundler. I'll think about it some more...
Thanks for the updates.
Let me know if you have more thoughts on the subject.
Here's an update:
We have switched to using bundleWorkflowCode
. It was easy enough to integrate into our current build setup. Especially the Webpack hook helped.
However, we've still found a slightly different workflow isolation bug, which still happens when using your bundler. We were also able to resolve it with AsyncLocalStorage
, but you probably want to look into it. Unfortunately it is again difficult for me to come up with a reproduction, so I will describe what we saw and maybe you can already see the issue:
We saw some workflows getting stuck with a nondeterminism error. However, I was able to replay these workflows locally on the same code without any issue. I then guessed that they will probably resolve when restarting the workers. We then restarted all workers without any code change and the workflows completed fine as expected. This strongly suggests an isolation issue.
The difference to the case above is that this time it is not an execution of workflow A leaking into workflow B, but one execution of workflow A leaking into a replay (on the worker) of the same workflow A. In our case signals were involved: The signal handler was writing the data it received into a global variable which served as a map (the data is keyed). Some specific condition
then waited for an entry in this map. The nondeterminism was due to code at the beginning of the workflow already reading data from the map that should not yet be available until some later signal in the history of the workflow.
Have I explained it well enough? My best guess is that you are not discarding the Webpack module cache when replaying a workflow.
Thanks for the update and detailed report.
I wonder what is going on there, I don’t see a reason why this would happen.
Does disabling reuse v8 context also prevent workers getting into this state?
We just ran into the issue that v1.9.0 broke our workflow code. The issue is that we were using a global variable (global = at the top-level of the TypeScript file) to collect cleanup steps that were supposed to get executed at the end of the workflow. Prior to v1.9.0 this worked fine, because each workflow run had its own set of global variables.
With
reuseV8Context
however it caused the cleanup steps to leak into other workflow runs. If workflow run X added a cleanup step and went to sleep and then another workflow run Y got executed with the same V8 context, then Y would also invoke the cleanup steps of X.This issue was hard to notice during testing. Fortunately we were able to recover with no long-lasting impact.
It seems like a footgun that it is so easy to break isolation now. One improvement would be to make AsyncLocalStorage available to the workflow code. I wanted to use if back when I implemented the cleanup logic, but it was not available so I opted for the global variable instead.