oven-sh / bun

Incredibly fast JavaScript runtime, bundler, test runner, and package manager – all in one
https://bun.sh

AbortController usage in module scope crashes bun with uncatchable error #9429

Closed: tobowers closed this issue 1 month ago

tobowers commented 5 months ago

What version of Bun is running?

1.0.30 (but tested on 1.0.21 and 1.0.26 as well)

What platform is your computer?

darwin 23.4.0 arm (but crashing on our debian linux server too)

What steps can reproduce the bug?

Using an AbortController can crash Bun in a way that is uncatchable: `AbortError: The operation was aborted.` is logged to the terminal, and the process exits with status code 1.

Here's a very simple example:

```typescript
import { setTimeout as sleep } from "node:timers/promises";

const abortController = new AbortController();

async function test() {
  await sleep(
    2000,
    () => {
      console.log("Timeout");
    },
    { signal: abortController.signal }
  );
}

test().catch((e) => {
  console.error("caught:", e);
});

abortController.abort();
```
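For reference, this is the behavior being asked for: a minimal sketch using plain `node:timers/promises` (no Bun-specific APIs) in which the abort surfaces as an ordinary, catchable rejection rather than a fatal error:

```typescript
// Expected-behavior sketch: aborting the signal should reject the
// pending sleep with an AbortError that a normal catch can handle,
// rather than killing the process.
import { setTimeout as sleep } from "node:timers/promises";

const controller = new AbortController();

const pending = sleep(2000, undefined, { signal: controller.signal })
  .then(() => "completed")
  .catch((e: any) => (e?.name === "AbortError" ? "aborted" : "other"));

controller.abort();

console.log(await pending); // expected: "aborted"
```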

We first hit this in real-world usage with the OpenAI library, aborting while a response is streaming:

```typescript
it("fails", async () => {
  const controller = new AbortController()
  const client = new OpenAI({ fetch: fetch });
  const resp = await client.chat.completions.create(
    {
      model: "gpt-4-0125-preview",
      messages: [
        { role: "system", content: "Answer in very long winded responses" },
        { role: "user", content: "What is the meaning of life?" },
      ],
      stream: true,
    },
    {
      signal: controller.signal,
    }
  )
  setTimeout(() => {
    controller.abort()
  }, 150)
  for await (const res of resp) {
    try {
      console.log(".")
    } catch (err) {
      console.error('err: ', err)
    }
  }
  return new Promise((resolve) => setTimeout(resolve, 150))
})
```

Additional failure cases here with a simple SSE streaming server and a client that gets interrupted. https://github.com/dooart/temp-bun-abort-controller

What is the expected behavior?

Expect these errors to be catchable.

What do you see instead?

bun terminates the process with an exit code of 1

Additional information

No response

anthonykrivonos commented 5 months ago

+1. As a workaround, using import fetch from 'node-fetch'; avoids this issue for plain HTTP requests. Still, this error is unrecoverable and we can't use the OpenAI SDK ourselves for streaming right now.

Edit: We've quelled our issue by adding the fetch key to the OpenAI client (more info here) and setting the value to a locally-imported node-fetch. This remains an issue with the global Bun implementation.
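The workaround described above could be sketched roughly as follows; this assumes the `openai` and `node-fetch` packages are installed and uses the OpenAI client's documented `fetch` option (the API key source here is illustrative):

```typescript
// Workaround sketch: point the OpenAI client's `fetch` option at a
// locally imported node-fetch so requests bypass Bun's global fetch,
// which is where the uncatchable AbortError originates.
import OpenAI from "openai";
import nodeFetch from "node-fetch";

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  // Cast needed because node-fetch's types differ slightly from the
  // global fetch signature the SDK expects.
  fetch: nodeFetch as unknown as typeof fetch,
});
```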

Zig1375 commented 3 months ago

Is there any news on this issue?

Zig1375 commented 1 month ago

Hello… is there any progress here? The issue still persists in version 1.1.20.

cirospaciari commented 1 month ago

I can reproduce this in 1.0.30 but no longer in the latest canary version (1.1.22). Please reopen if it is still happening after updating.

randompixel commented 1 week ago

I've spent the last two weeks trying to work out what is going on with the code here after upgrading to the latest Azure OpenAI SDK. The issue is still occurring in Bun v1.1.26 with @azure/identity:4.5.0-beta2 and openai:4.56.1.

Minimal test case here:

```typescript
// index.ts
import { DefaultAzureCredential, getBearerTokenProvider } from "@azure/identity";
import { AzureOpenAI } from "openai";
import type { ChatCompletionMessageParam } from "openai/src/resources/index.js";

try {
    Bun.serve({
        async fetch() {
            try {
                const azureADTokenProvider = getBearerTokenProvider(
                    new DefaultAzureCredential(),
                    "https://cognitiveservices.azure.com/.default",
                );
                const endpoint = "<your-azure-deployment-url>";
                const deployment = "<your-azure-deploymentId>";
                const apiVersion = "2024-05-01-preview";
                const backend = new AzureOpenAI({ azureADTokenProvider, endpoint, apiVersion, deployment });
                const messages: ChatCompletionMessageParam[] = [
                    { role: "user", content: "Hello" },
                ];

                const response = await backend?.chat.completions.create({ messages });

                return Response.json(response);
            } catch (error) {
                console.error("Local");
                console.error(error);
                return Response.json(error);
            }
        }
    });
} catch (error) {
    console.error("Global");
    console.error(error);
}
```

If you hit F5 a couple of times against the above code, it'll crash the entire server with:

````
21 |  * }
22 |  * ```
23 |  */
24 | class AbortError extends Error {
25 |     constructor(message) {
26 |         super(message);
             ^
AbortError: The operation was aborted.
      at new AbortError (/.../<app>/app/node_modules/@azure/identity/node_modules/@azure/abort-controller/dist/commonjs/AbortError.js:26:9)
      at /.../<app>/app/node_modules/@azure/identity/node_modules/@azure/core-rest-pipeline/dist/commonjs/nodeHttpClient.js:282:25
      at abort (native:1:1)
      at /.../<app>/app/node_modules/@azure/identity/node_modules/@azure/core-rest-pipeline/dist/commonjs/nodeHttpClient.js:134:26
````

Neither the local nor the global catch block is hit in the above code: just a hard crash and zero output on the CLI, leading you to think the server is still running when nothing is being served any more.

Strangely, it doesn't always fail immediately; it acts more like a fast memory leak, where API responses take longer and longer until it fails.

I don't know if this issue is related to https://github.com/Azure/azure-sdk-for-js/issues/29112 or not but it seems impossible for me to fix "out of the box" although I will try the above "custom fetch" workaround.

randompixel commented 1 week ago

Alas, no: the workaround isn't working for me. I tried passing fetch or node-fetch into the AzureOpenAI class, and both still crash on the second request with the above stack trace.

Looking at the stack trace, it seems more related to the fetching of tokens in the identity library than to the actual calls to OpenAI. I have also tried passing an AbortController instance (and the old node package) to that, but it made no difference:

```typescript
const azureADTokenProvider = getBearerTokenProvider(
    new DefaultAzureCredential(),
    "https://cognitiveservices.azure.com/.default",
    {
        abortSignal: new AbortController().signal,
    }
);
```

@cirospaciari can you re-open this or shall I create a new issue?

mjp0 commented 1 week ago

I'm also having major issues with this while trying to cancel streaming with OpenAI. I've wrapped everything in try/catch, but it's truly uncatchable. This crashes Bun completely, and no remedies seem to work. In my opinion, this needs a high-priority fix. I can't put request aborting into production, because when one user aborts their stream it effectively aborts everyone else's too, as the process crashes.

Bun version v1.1.26 (macOS arm64).

```
240 |       try {
241 |         abortController.abort()
                              ^
AbortError: The operation was aborted.
 code: "ABORT_ERR"
```

Jarred-Sumner commented 1 week ago

@randompixel try adding an error callback in Bun.serve.

@mjp0 please open a new issue with a reproduction and we would be happy to help you. The error message there is not quite enough for us to help. Also, you should be able to use process.on("unhandledRejection", cb) or process.on("uncaughtException", cb)
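The process-level suggestion above can be sketched as a last-resort safety net using only standard APIs (nothing Bun-specific; the synthetic rejection below merely stands in for whatever escapes local try/catch):

```typescript
// Last-resort handlers: catch errors that escape every local try/catch
// so a single aborted request cannot take down the whole process.
process.on("uncaughtException", (err) => {
  console.error("uncaughtException:", (err as Error)?.name ?? err);
});
process.on("unhandledRejection", (reason) => {
  console.error("unhandledRejection:", reason);
});

// Simulate an AbortError that no local catch handles: an aborted
// signal's reason is an AbortError, and we reject without a handler.
const controller = new AbortController();
controller.abort();
Promise.reject(controller.signal.reason);
```

Note that these handlers keep the process alive but do not recover the failed request; they are a mitigation, not a fix for the underlying bug.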

mjp0 commented 1 week ago

> @randompixel try adding an error callback in Bun.serve.
>
> @mjp0 please open a new issue with a reproduction and we would be happy to help you. The error message there is not quite enough for us to help. Also, you should be able to use process.on("unhandledRejection", cb) or process.on("uncaughtException", cb)

That worked; the process stays alive now! I'm just shaking my head, wondering why I didn't think of listening on the process directly 🤦🏼‍♂️

Sorry about not providing an example. I browsed a few of these issues and people had posted examples the same as mine, so here's one from https://github.com/oven-sh/bun/issues/9805. It's basically your standard abort scenario.

```typescript
const abortController = new AbortController();

async function test() {
  try {
    const response = await fetch('https://bun.sh/', {
      signal: abortController.signal
    });

    const stream = response.body;
    // @ts-ignore
    for await (const part of stream) {
      abortController.abort();
    }
  } catch (e: any) {
    if (e.name === 'AbortError') {
      console.log('Request was aborted');
    } else {
      console.error('An error occurred', e);
    }
  }

  console.log('Test completed');
}

test();
```

With the process monitoring, here's what comes out.

```typescript
process.on("unhandledRejection", (reason, promise) => {
  console.error("Unhandled Rejection at:", promise, "reason:", reason);
});
```

```
> Unhandled Rejection at: Promise { <rejected> } reason: 236 |   const abortSignal = abortController.signal
```

I don't know how to provide more helpful information, but please do ask. I'd like to get this fixed in a more robust way as well.

Jarred-Sumner commented 1 week ago

@mjp0 that particular example does not reproduce the issue.

```
Request was aborted
Test completed
```

This behaves consistently with Node.

mjp0 commented 1 week ago

> @mjp0 that particular example does not reproduce the issue.
>
> ```
> Request was aborted
> Test completed
> ```
>
> This behaves consistently with Node.

Looks like Vercel's AI lib might have something extra special then.

```typescript
import { createOpenAI } from "@ai-sdk/openai"
import { streamText } from "ai"

const abortController = new AbortController()
const abortSignal = abortController.signal

const opts = {
  apiKey: "key",
}

const m = createOpenAI(opts)

const call = await streamText({
  model: m.chat("gpt-4o"),
  messages: [
    {
      role: "user",
      content: "tell me a short story",
    },
  ],
  abortSignal: abortSignal,
})
for await (const textPart of call.textStream) {
  abortController.abort()
  console.log(textPart)
}
```

Gives me this:

```
AbortError: The operation was aborted.
 code: "ABORT_ERR"

      at /Users/marko/Dev/Temp/bun-abort/test.ts:25:21
20 |       }
21 |     ],
22 |     abortSignal: abortSignal,
23 |   })
24 |   for await (const textPart of call.textStream) {
25 |     abortController.abort()
```
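Independent of the runtime bug, the consumption pattern in these repros can be hardened so an abort is treated as an expected terminal condition. A minimal sketch, in which the async generator is a hypothetical stand-in for an SDK text stream (not the real `textStream` API):

```typescript
// Defensive pattern: treat the AbortError raised by an aborted stream
// as an expected terminal condition instead of letting it escape.
// `makeStream` is a hypothetical stand-in for an SDK text stream.
async function* makeStream(signal: AbortSignal): AsyncGenerator<string> {
  for (const part of ["once", "upon", "a", "time"]) {
    if (signal.aborted) {
      // Mimic the AbortError shape seen in the traces above.
      throw Object.assign(new Error("The operation was aborted."), {
        name: "AbortError",
      });
    }
    yield part;
  }
}

const received: string[] = [];
const controller = new AbortController();
try {
  for await (const part of makeStream(controller.signal)) {
    received.push(part);
    controller.abort(); // cancel after the first chunk
  }
} catch (e: any) {
  if (e?.name !== "AbortError") throw e; // only swallow the abort
  console.log("stream aborted cleanly after", received.length, "chunk(s)");
}
```

In a correctly behaving runtime this pattern exits the loop cleanly; the bug reported here is that Bun raised the error outside any reachable catch.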
randompixel commented 1 week ago

@Jarred-Sumner I added the error callback exactly as in the documentation and am still encountering the same issue: the first request loads, then there's a silent crash on subsequent reloads.

Initially I was doing this in Elysia with a global error handler setup there before I made this minimal test case.

````
➜  app git:(bugfix/<branch>) ✗ bun run src/minimal.ts
21 |  * }
22 |  * ```
23 |  */
24 | class AbortError extends Error {
25 |     constructor(message) {
26 |         super(message);
             ^
AbortError: The operation was aborted.
      at new AbortError (<path>/cognitive-services-api/app/node_modules/@azure/identity/node_modules/@azure/core-rest-pipeline/node_modules/@azure/abort-controller/dist/commonjs/AbortError.js:26:9)
      at <path>/cognitive-services-api/app/node_modules/@azure/identity/node_modules/@azure/core-rest-pipeline/dist/commonjs/nodeHttpClient.js:282:25
      at abort (native:1:1)
      at <path>/cognitive-services-api/app/node_modules/@azure/identity/node_modules/@azure/core-rest-pipeline/dist/commonjs/nodeHttpClient.js:134:26

Bun v1.1.26 (macOS arm64)
➜  app git:(bugfix/<branch>) ✗
````

The commonality between my issue and @mjp0's seems to be a dependency on the openai package; I wonder if something in the shims it uses is causing the issue.