temporalio / sdk-typescript

Temporal TypeScript SDK

[Bug] RangeError: "length" is outside of buffer bounds #1510

Closed dandavison closed 2 months ago

dandavison commented 2 months ago

With node v22.7.0 the workflow interaction below crashes the worker with the error below in protobufjs. This appears to be the same problem as discussed in https://github.com/protobufjs/protobuf.js/issues/2025, and is fixed by upgrading node to v22.8.0.

  error: RangeError: "length" is outside of buffer bounds
      at Buffer.proto.utf8Write (node:internal/buffer:1066:13)
      at Op.writeStringBuffer [as fn] (/Users/dan/src/temporalio/sdk-typescript/node_modules/protobufjs/src/writer_buffer.js:61:13)
      at BufferWriter.finish (/Users/dan/src/temporalio/sdk-typescript/node_modules/protobufjs/src/writer.js:453:14)
      at Worker.handleActivation (/Users/dan/src/temporalio/sdk-typescript/packages/worker/src/worker.ts:1166:30) {
    code: 'ERR_BUFFER_OUT_OF_BOUNDS'
  }
repro

```typescript
import * as wf from '@temporalio/workflow';
import * as cl from '@temporalio/client';
import * as wo from '@temporalio/worker';

const workflowId = 'wid';
const taskQueue = 'tq';

// Update that adds `arg` to the counter and returns the previous value.
export const fetchAndAdd = wf.defineUpdate<number, [number]>('fetchAndAdd');

export async function workflow(): Promise<void> {
  let count = 0;
  const handler = (arg: number) => {
    const prevCount = count;
    count += arg;
    return prevCount;
  };
  // Rejects negative arguments; the starter below sends -1 on purpose.
  const validator = (arg: number) => {
    if (arg < 0) {
      throw new Error('Argument must not be negative');
    }
  };
  wf.setHandler(fetchAndAdd, handler, { validator });
  await wf.condition(() => count !== 0);
}

async function starter(client: cl.Client): Promise<void> {
  const wfHandle = await client.workflow.start(workflow, {
    taskQueue,
    workflowId,
    workflowIdReusePolicy: cl.WorkflowIdReusePolicy.WORKFLOW_ID_REUSE_POLICY_TERMINATE_IF_RUNNING,
  });
  await wfHandle.executeUpdate(fetchAndAdd, { args: [-1] });
}

async function main(): Promise<void> {
  const worker = await wo.Worker.create({
    workflowsPath: __filename,
    taskQueue,
    bundlerOptions: {
      ignoreModules: ['@temporalio/client', '@temporalio/worker'],
    },
  });
  const connection = await cl.Connection.connect();
  const client = new cl.Client({ connection });
  await worker.runUntil(starter(client));
}

// This file doubles as the workflow bundle entry point, so only start the worker outside workflow context.
if (!wf.inWorkflowContext()) {
  wo.Runtime.install({ logger: new wo.DefaultLogger('WARN') });
  main().catch((err) => {
    console.error(err);
    process.exit(1);
  });
}
```
Full worker error log

```
2024-09-08T12:55:58.594Z [ERROR] Worker failed {
  sdkComponent: 'worker',
  taskQueue: 'tq',
  error: RangeError: "length" is outside of buffer bounds
      at Buffer.proto.utf8Write (node:internal/buffer:1066:13)
      at Op.writeStringBuffer [as fn] (/Users/dan/src/temporalio/sdk-typescript/node_modules/protobufjs/src/writer_buffer.js:61:13)
      at BufferWriter.finish (/Users/dan/src/temporalio/sdk-typescript/node_modules/protobufjs/src/writer.js:453:14)
      at Worker.handleActivation (/Users/dan/src/temporalio/sdk-typescript/packages/worker/src/worker.ts:1166:30) {
    code: 'ERR_BUFFER_OUT_OF_BOUNDS'
  }
}
CombinedWorkerRunError: Worker terminated with fatal error in `runUntil`
    at Worker.runUntil (/Users/dan/src/temporalio/sdk-typescript/packages/worker/src/worker.ts:1608:15)
    at processTicksAndRejections (node:internal/process/task_queues:105:5)
    at async main (/Users/dan/src/temporalio/samples-typescript/scratchpad/scratchpad.ts:48:3) {
  cause: {
    workerError: RangeError: "length" is outside of buffer bounds
        at Buffer.proto.utf8Write (node:internal/buffer:1066:13)
        at Op.writeStringBuffer [as fn] (/Users/dan/src/temporalio/sdk-typescript/node_modules/protobufjs/src/writer_buffer.js:61:13)
        at BufferWriter.finish (/Users/dan/src/temporalio/sdk-typescript/node_modules/protobufjs/src/writer.js:453:14)
        at Worker.handleActivation (/Users/dan/src/temporalio/sdk-typescript/packages/worker/src/worker.ts:1166:30) {
      code: 'ERR_BUFFER_OUT_OF_BOUNDS'
    },
    innerError: ServiceError: Workflow Update failed
        at WorkflowClient.rethrowGrpcError (/Users/dan/src/temporalio/sdk-typescript/packages/client/src/workflow-client.ts:754:13)
        at WorkflowClient.rethrowUpdateGrpcError (/Users/dan/src/temporalio/sdk-typescript/packages/client/src/workflow-client.ts:739:10)
        at WorkflowClient._startUpdateHandler (/Users/dan/src/temporalio/sdk-typescript/packages/client/src/workflow-client.ts:841:12)
        at processTicksAndRejections (node:internal/process/task_queues:105:5)
        at async _startUpdate (/Users/dan/src/temporalio/sdk-typescript/packages/client/src/workflow-client.ts:1113:22)
        at async Object.executeUpdate (/Users/dan/src/temporalio/sdk-typescript/packages/client/src/workflow-client.ts:1190:24)
        at async starter (/Users/dan/src/temporalio/samples-typescript/scratchpad/scratchpad.ts:35:3)
        at async /Users/dan/src/temporalio/sdk-typescript/packages/worker/src/worker.ts:1597:16
        at async Promise.allSettled (index 0)
        at async Worker.runUntil (/Users/dan/src/temporalio/sdk-typescript/packages/worker/src/worker.ts:1604:44) {
      cause: [Error]
    }
  }
}
```
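For anyone who would rather be warned at startup than hit the crash mid-run, here is a minimal sketch of a version guard. It assumes, based only on this report, that v22.7.0 is the sole affected Node release; `AFFECTED_NODE_VERSIONS` and `isAffectedNodeVersion` are hypothetical names and not part of the SDK.

```typescript
// Hypothetical startup guard, not part of the Temporal SDK.
// Assumption from this issue: only Node v22.7.0 triggers the protobufjs RangeError.
const AFFECTED_NODE_VERSIONS = new Set<string>(['22.7.0']);

function isAffectedNodeVersion(version: string = process.versions.node): boolean {
  // process.versions.node has no leading "v"; strip one in case process.version is passed in.
  return AFFECTED_NODE_VERSIONS.has(version.replace(/^v/, ''));
}

if (isAffectedNodeVersion()) {
  console.warn(
    `Node v${process.versions.node} is reported to crash workers with ` +
      'RangeError: "length" is outside of buffer bounds. Consider upgrading to v22.8.0 or later.'
  );
}
```

Calling this before `Worker.create` would surface the problem before any workflow task is processed.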
mjameswh commented 2 months ago

What are you suggesting? Looks like that's a bug in Node itself, and there's really nothing we can do on our side…

dandavison commented 2 months ago

We're keeping this issue in the sdk-typescript repo so that anyone who hits the problem while using the TypeScript SDK will find it and learn that it's a Node issue that can be resolved by using a different Node version (we've already seen reports from users who hit it and assumed it was a problem with sdk-typescript).

Beyond that, it looks like this affects Node v22.7.0 only. I'm not sure what proportion of our users are likely to try that Node version in the future. If this turns out to be a common problem, one option would be to catch the error and output a helpful error message.
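As a rough illustration of the "catch the error and output a helpful message" idea, the sketch below recognises the error by its `code` property, which appears in the log above as `ERR_BUFFER_OUT_OF_BOUNDS`. The helper name `describeBufferBoundsError` and the wording of the hint are hypothetical; nothing like this exists in the SDK as far as this thread shows.

```typescript
// Hypothetical helper, not part of the Temporal SDK.
interface ErrorWithCode extends Error {
  code?: string;
}

// Returns a human-readable hint if `err` looks like the error from this issue, else undefined.
function describeBufferBoundsError(
  err: unknown,
  nodeVersion: string = process.versions.node
): string | undefined {
  if (!(err instanceof RangeError)) return undefined;
  const code = (err as ErrorWithCode).code;
  if (code !== 'ERR_BUFFER_OUT_OF_BOUNDS') return undefined;
  return (
    `Worker failed with ${code} while running on Node v${nodeVersion}. ` +
    'This has been reported as a Node bug in v22.7.0 (fixed in v22.8.0); ' +
    'try a different Node version.'
  );
}

// Usage sketch (assumes `worker` is an existing Worker instance):
//   worker.run().catch((err) => {
//     const hint = describeBufferBoundsError(err);
//     if (hint) console.error(hint);
//     throw err;
//   });
```

Note that with `runUntil`, the log above shows the original error nested under `cause.workerError` of the `CombinedWorkerRunError`, so that is the value to pass to the helper in that case.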

hejbiLLLLL commented 2 months ago

Hi!

I am experiencing this problem whilst using node LTS (v20.17.0) and/or v22.9.0 and temporal v1.1.0 on a Mac (M1). My colleague running the same versions on Linux can't replicate the issue running the exact same code.

Any advice?

EDIT: Solved! We had mismatching versions of the temporal packages:

"@temporalio/activity": "^1.11.2",
"@temporalio/common": "^1.11.2",
"@temporalio/worker": "^1.11.2",
"@temporalio/workflow": "^1.11.2"

Some were 1.11.1 while some were 1.11.2, so bumping them all to the same version solved the issue.
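A quick way to spot this kind of mismatch is to compare the installed versions of everything under `node_modules/@temporalio`. The script below is only a sketch: it assumes a flat `node_modules` layout (nested or hoisted duplicate copies are not inspected), and `groupByVersion` and `readTemporalVersions` are hypothetical helper names.

```typescript
import { existsSync, readdirSync, readFileSync } from 'node:fs';
import { join } from 'node:path';

// Groups package names by their installed version string.
function groupByVersion(installed: Record<string, string>): Map<string, string[]> {
  const byVersion = new Map<string, string[]>();
  for (const name of Object.keys(installed)) {
    const version = installed[name];
    const names = byVersion.get(version) ?? [];
    names.push(name);
    byVersion.set(version, names);
  }
  return byVersion;
}

// Reads the installed version of each package under <projectRoot>/node_modules/@temporalio.
// Assumption: flat layout; returns an empty record if the directory does not exist.
function readTemporalVersions(projectRoot: string): Record<string, string> {
  const scopeDir = join(projectRoot, 'node_modules', '@temporalio');
  const installed: Record<string, string> = {};
  if (!existsSync(scopeDir)) return installed;
  for (const entry of readdirSync(scopeDir)) {
    const manifestPath = join(scopeDir, entry, 'package.json');
    if (!existsSync(manifestPath)) continue;
    const manifest = JSON.parse(readFileSync(manifestPath, 'utf8')) as { version: string };
    installed[`@temporalio/${entry}`] = manifest.version;
  }
  return installed;
}

const installedGroups = groupByVersion(readTemporalVersions(process.cwd()));
if (installedGroups.size > 1) {
  console.error('Mismatched @temporalio package versions:');
  installedGroups.forEach((names, version) => {
    console.error(`  ${version}: ${names.join(', ')}`);
  });
  process.exitCode = 1;
}
```

Run from the project root, this exits with a non-zero status when more than one version is installed, for example some packages at 1.11.1 and others at 1.11.2.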