langchain-ai / langsmith-sdk

LangSmith Client SDK Implementations
https://smith.langchain.com/
MIT License

Issue: Cost calculated for trace when using GPT-4 Vision is wrong #460

Open · Gr33nLight opened this issue 6 months ago

Gr33nLight commented 6 months ago

Issue you'd like to raise.

Hello, I'm running the following configuration:

    import { ChatOpenAI } from '@langchain/openai';
    import { HumanMessage } from '@langchain/core/messages';

    const chat = new ChatOpenAI({
      modelName: 'gpt-4-vision-preview',
      streaming: true,
      maxTokens: 1024,
    }).withConfig({ runName: 'VisionChain' });

    const message = new HumanMessage({
      content: [
        {
          type: 'text',
          text: "...",
        },
        {
          type: 'image_url',
          image_url: {
            // The image is passed inline as a base64 data URL.
            url: `data:image/jpeg;base64,${base64Data}`,
            // 'low' detail is billed at a flat per-image token rate.
            detail: 'low',
          },
        },
      ],
    });

I suspect LangSmith is counting the base64-encoded image as if it were a normal string instead of treating it as an image for the vision API. A call that actually cost me $0.03 is being reported as $2.20. The same thing shows up in the token count, which includes the full base64-encoded string (not wrong per se, but it shouldn't drive pricing). Display of the submitted image in the prompt works as expected, so thumbs up for that :)

Suggestion:

The cost should be calculated using the pricing model of gpt-4-vision-preview, which bills images by detail level and tile count, not the standard per-token text pricing (see the sketch below).
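
For context, OpenAI's published vision pricing bills a `detail: 'low'` image at a flat 85 tokens, while `detail: 'high'` images are scaled and billed per 512px tile. Here is a minimal sketch of that accounting, assuming the rule as documented; the function name and the no-upscaling guards are my own, not part of any SDK:

    import math

    # Image token accounting per OpenAI's published vision pricing:
    # 'low' detail is a flat 85 tokens; 'high' detail scales the image to fit
    # 2048x2048, shrinks the shortest side to 768px, and charges 170 tokens
    # per 512px tile plus an 85-token base.
    LOW_DETAIL_TOKENS = 85
    BASE_TOKENS = 85
    TOKENS_PER_TILE = 170

    def image_tokens(width: int, height: int, detail: str = "low") -> int:
        if detail == "low":
            return LOW_DETAIL_TOKENS
        # Fit within a 2048x2048 square (downscale only).
        scale = min(1.0, 2048 / max(width, height))
        width, height = width * scale, height * scale
        # Shrink the shortest side to 768px (downscale only, assumed).
        scale = min(1.0, 768 / min(width, height))
        width, height = width * scale, height * scale
        tiles = math.ceil(width / 512) * math.ceil(height / 512)
        return BASE_TOKENS + tiles * TOKENS_PER_TILE

    print(image_tokens(1024, 1024, detail="low"))   # 85
    print(image_tokens(1024, 1024, detail="high"))  # 85 + 4 * 170 = 765

Under this scheme a low-detail image costs 85 tokens no matter how large the base64 payload is, which is why tokenizing the data URL as text inflates the reported cost so dramatically.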

hinthornw commented 6 months ago

Ah yes - we have a fix in the pipeline - thank you for flagging!

jonsoini commented 6 months ago

I'm seeing a similar issue with Gemini Pro Vision: no cost data is displayed, but the token count for a request with an image is in the millions. Hoping the fix mentioned above covers this as well?

    from langchain_core.messages import HumanMessage
    from langchain_google_vertexai import ChatVertexAI

    model = ChatVertexAI(model_name="gemini-pro-vision", max_output_tokens=2048, temperature=0.01)

    msg = model.invoke(
        [
            HumanMessage(
                content=[
                    {"type": "text", "text": prompt},
                    {
                        # Image passed inline as a base64 data URL.
                        "type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{img_base64}"},
                    },
                ]
            )
        ]
    )
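
For what it's worth, Google's documentation charges each image sent to gemini-pro-vision a fixed 258 tokens, so counts in the millions are consistent with the base64 string being tokenized as plain text. A rough sanity check, assuming a loose ~4-characters-per-token rule of thumb (both numbers here are illustrative):

    # Fixed per-image token cost documented for gemini-pro-vision.
    GEMINI_IMAGE_TOKENS = 258

    def base64_as_text_tokens(img_base64: str) -> int:
        # Very rough estimate: ~4 characters per token for English-like text.
        return len(img_base64) // 4

    # A ~3 MB JPEG grows to ~4 MB of base64, i.e. on the order of a million
    # "text" tokens -- roughly the inflated counts shown in the UI.
    print(base64_as_text_tokens("A" * 4_000_000))  # 1000000
    print(GEMINI_IMAGE_TOKENS)                     # 258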

plutasnyy commented 1 month ago

Do you have any update on this? In the OpenAI output I see `response['usage']['total_tokens']` values around 1k, while the LangSmith UI reports 50-200k tokens. This is the case for images encoded to base64 and predicted by gpt-4o-mini:

    from openai import OpenAI

    client = OpenAI()

    img_url = f"data:{example.media_type};base64,{example.base64_image}"
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": system_prompt},
            {
                "role": "user",
                "content": [
                    {
                        "type": "image_url",
                        "image_url": {"url": img_url},
                    },
                    {"type": "text", "text": user_prompt},
                ],
            },
        ],
    )
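
To quantify the discrepancy, one could compare the usage OpenAI reports against what the data URL costs when tokenized as plain text. A minimal sketch, assuming a recent tiktoken (which maps gpt-4o-mini to the o200k_base encoding) and the variables defined above:

    import tiktoken

    # OpenAI's own usage accounting vs. the data URL tokenized as raw text.
    enc = tiktoken.encoding_for_model("gpt-4o-mini")
    print("reported by OpenAI:", response.usage.total_tokens)     # ~1k
    print("base64 tokenized as text:", len(enc.encode(img_url)))  # 50-200k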