langchain-ai / langsmith-sdk

LangSmith Client SDK Implementations
https://smith.langchain.com/
MIT License

Issue: Langsmith fails to batch ingest runs #808

Open omrish-Glossai opened 3 months ago

omrish-Glossai commented 3 months ago

Issue you'd like to raise.

Hello, I am using LangSmith and LangChain to trace my LLM usage. Across several runs I receive the following warning:

```
langsmith.client:Failed to batch ingest runs: LangSmithConnectionError("Connection error caused failure to POST https://api.smith.langchain.com/runs/batch in LangSmith API. Please confirm your internet connection.. ConnectionError(ProtocolError('Connection aborted.', timeout('The write operation timed out')))")
```

When this happens, the affected trace is not logged to the project.


hinthornw commented 3 months ago

Hi @omrish-Glossai, could you confirm which langsmith version you are using?

We increased the timeouts/persistence of requests in more recent versions, which should reduce the occurrence of this type of connection error.
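For reference, a quick way to check which version is installed (assuming the package exposes `__version__`, which recent releases do):

```python
import langsmith

print(langsmith.__version__)  # e.g. "0.1.77"
```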

omrish-Glossai commented 3 months ago

I am using langsmith 0.1.77.

dalmad2 commented 2 months ago

Has anyone figured this out?

Veghit commented 1 month ago

This happens using 0.1.199 as well. Any workaround? Maybe changing the client configuration somehow?

ShubhamMaddhashiya-bidgely commented 1 month ago

I am also facing the same issue. I'm using langsmith==0.1.99.

MichalKrakovskyBSS commented 3 weeks ago

Same here. Is there any way to change the config?

hinthornw commented 3 weeks ago

Hi all, this usually happens for one of two reasons:

  1. Network issues
  2. Payload is too large (> ~24 MB)

We've updated the error messaging recently to make it (hopefully) clearer which of the two is causing this, and we'll also try to make the load balancer errors a bit clearer.
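For (1), a minimal sketch of loosening the client's network settings, assuming a langsmith version where `Client` exposes `timeout_ms` and `retry_config` (parameter names and types may differ between SDK releases):

```python
from urllib3.util.retry import Retry
from langsmith import Client

# Assumption: recent langsmith versions accept these parameters.
client = Client(
    timeout_ms=30_000,  # give the POST to /runs/batch more time to complete
    retry_config=Retry(total=5, backoff_factor=0.5),  # retry transient network failures
)
```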

hinthornw commented 3 weeks ago

To handle (2), we typically recommend excluding large, unhelpful content from traces, e.g. via rule-based masking of inputs and outputs: https://docs.smith.langchain.com/how_to_guides/tracing/mask_inputs_outputs#rule-based-masking-of-inputs-and-outputs
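A minimal sketch of that approach, assuming a version where `Client` accepts `hide_inputs`/`hide_outputs` callables (the helper `drop_large_fields` is hypothetical):

```python
from langsmith import Client

def drop_large_fields(data: dict) -> dict:
    # Hypothetical rule: replace oversized values (e.g. base64 images)
    # with a placeholder before they are sent to LangSmith.
    return {
        k: (v if len(str(v)) < 10_000 else "<omitted: too large>")
        for k, v in data.items()
    }

# The callables are applied to every run's inputs/outputs before upload
client = Client(hide_inputs=drop_large_fields, hide_outputs=drop_large_fields)
```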

Alternatively, in @traceable you can use process_inputs/process_outputs, or you can take full control with the trace context manager; see the sketch below.
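A sketch of the @traceable route, assuming the decorator accepts `process_inputs`/`process_outputs` hooks (`trim_values` is a hypothetical helper):

```python
from langsmith import traceable

def trim_values(data: dict) -> dict:
    # Truncate long values so only a preview is recorded in the trace
    return {k: str(v)[:1_000] for k, v in data.items()}

@traceable(process_inputs=trim_values, process_outputs=trim_values)
def my_chain(prompt: str) -> str:
    ...  # actual LLM call elided
```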

We're also working to automate some of this and to increase the maximum permitted payload size.

jeevanions commented 2 weeks ago

Not sure this is a solution, but I kind of worked around it by changing this param:

```python
from langsmith import Client
from langsmith.evaluation import evaluate

client = Client(auto_batch_tracing=False)  # post runs synchronously, no batching
evaluate(..., client=client)  # other evaluate arguments elided
```

It takes more time, but it finishes without any issues.