vercel / ai

Build AI-powered applications with React, Svelte, Vue, and Solid
https://sdk.vercel.ai/docs

StreamObject doesn't work with Anthropic #3422

Open Kitenite opened 5 days ago

Kitenite commented 5 days ago

Description

The streamObject function does not actually stream for Anthropic; it resolves everything at the end instead. streamText works fine for Anthropic unless a tool is required, and it works fine for OpenAI as well. Tested with Haiku and Sonnet.

Related issue: https://github.com/vercel/ai/discussions/1980

Code example

import { AnthropicProvider, createAnthropic } from '@ai-sdk/anthropic';
import { createOpenAI, OpenAIProvider } from '@ai-sdk/openai';
import { CoreMessage, DeepPartial, streamObject } from 'ai';

//  Other code...

const model = this.anthropic(CLAUDE_MODELS.HAIKU);

// This streams fine
// const model = this.openai(OPEN_AI_MODELS.GPT_4_TURBO);

            const result = await streamObject({
                model,
                system: 'You are a seasoned React and Tailwind expert.',
                schema: StreamReponseObject,
                messages,
            });

// This actually waits the whole time, resolving everything at the end

            for await (const partialObject of result.partialObjectStream) {
                console.log(partialObject)
            }

Additional context

Using main node.js process in an electron app

Kitenite commented 5 days ago

Seems like a related issue to this: https://github.com/vercel/ai/issues/3395

And related to this on Anthropic's side: https://github.com/anthropics/anthropic-sdk-typescript/issues/529

Seems like this is a limitation on Anthropic's side. I suppose this can be closed but I'll wait for a confirmation before doing so.

lgrammel commented 4 days ago

This is a long-standing issue. I've explored it several times. We use tool calls and tool call streaming, because Anthropic does not support JSON outputs via options. The Anthropic API does "fake" tool call streaming, i.e. it streams all chunks at once after a significant delay, leading to this effect.

Kitenite commented 4 days ago

Thanks @lgrammel, is there a good workaround for this using the AI SDK? I was able to get streamText to adhere to a format and use that instead, but I don't really trust that.

Seems like the best course of action for me here is to use the Anthropic SDK directly?

lgrammel commented 4 days ago

How would this work directly with the Anthropic SDK?

Kitenite commented 4 days ago

I just tested... Hubris on my side, but it doesn't work, just like you mentioned. They stream the text delta until the tool call, and then there's a big delay until the entire call is resolved.

Please feel free to close and thanks for the quick reply :)

lgrammel commented 4 days ago

Want to leave this open since it comes up every week or so.

Kitenite commented 4 days ago

FYI for folks who absolutely have to use Anthropic for streaming: this is my hacky solution, which passes the Zod schema as part of the system prompt to streamText, then partially resolves the streamed object.

I'm surprised this works consistently with the latest Claude Sonnet. It will try to wrap the object in a code block, so I just strip that out.
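The stripping step is elided from the snippet below, so here is a hedged sketch of what such a helper might look like (stripCodeBlock is my name for it, not part of the original code or the AI SDK):

```typescript
// Hypothetical helper: strips the markdown code fence that Claude
// sometimes wraps around JSON output before the text is handed to
// the partial-json parser.
function stripCodeBlock(text: string): string {
    return text
        .replace(/^\s*`{3}(?:json)?\s*\n?/, '') // leading ```json or ``` fence
        .replace(/\n?`{3}\s*$/, '') // trailing ``` fence
        .trim();
}
```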


import { createAnthropic } from '@ai-sdk/anthropic';
import { StreamReponseObject } from '@onlook/models/chat'; // Zod object
import { CoreMessage, DeepPartial, LanguageModelV1, streamText } from 'ai';
import { Allow, parse } from 'partial-json';
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';

// ...

    public async stream(
        messages: CoreMessage[],
    ): Promise<z.infer<typeof StreamReponseObject> | null> {
        try {
            const result = await streamText({
                model: this.model,
                system: 'You are a seasoned React and Tailwind expert.' + this.getFormatString(),
                messages,
            });

            let fullText = '';
            for await (const partialText of result.textStream) {
                fullText += partialText;
                const partialObject: DeepPartial<z.infer<typeof StreamReponseObject>> = parse(fullText, Allow.ALL);
                // Yay, partial object!
            }

            const fullObject: z.infer<typeof StreamReponseObject> = parse(fullText, Allow.ALL);
            return fullObject;
        } catch (error) {
            console.error('Error receiving stream', error);
            const errorMessage = this.getErrorMessage(error);
            this.emitErrorMessage('requestId', errorMessage);
            return null;
        }
    }

    getFormatString() {
        const jsonFormat = JSON.stringify(zodToJsonSchema(StreamReponseObject));
        return `\nReturn your response only in this JSON format: <format>${jsonFormat}</format>`;
    }

// ...
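For reference, getFormatString() interpolates the JSON Schema produced by zodToJsonSchema into the prompt. A sketch of the resulting prompt suffix, with a hand-written schema standing in for the zodToJsonSchema(StreamReponseObject) output (the field names here are invented for illustration):

```typescript
// Hand-written JSON Schema standing in for zodToJsonSchema output;
// the fields are made up for this sketch.
const jsonFormat = JSON.stringify({
    type: 'object',
    properties: {
        code: { type: 'string' },
        description: { type: 'string' },
    },
    required: ['code', 'description'],
});

// Same interpolation as getFormatString() above.
const formatString = `\nReturn your response only in this JSON format: <format>${jsonFormat}</format>`;
```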

sahanatvessel commented 1 day ago

@lgrammel maybe add this to the docs in the AI SDK Anthropic section? If I had seen this in the docs I wouldn't have raised a bug last week 🙈