langchain-ai / langchain


[Evaluation template] Runs evaluation against LangSmith (public) dataset #12742

Closed: rlancemartin closed this issue 7 months ago

rlancemartin commented 10 months ago

Feature request

Given a runnable (e.g., a RAG pipeline), automatically run evaluation against a given (public) dataset.
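For concreteness, here is a minimal sketch of what such a template might wrap: `Client.clone_public_dataset` from the LangSmith SDK to pull the public dataset into a workspace, then `run_on_dataset` from `langchain.smith` to benchmark a runnable against it. The share URL, dataset name, evaluator choice, and stand-in chain below are illustrative assumptions, not part of the proposal:

```python
# A minimal sketch, assuming the langsmith and langchain packages are
# installed and LANGCHAIN_API_KEY / OPENAI_API_KEY are set. The dataset
# share URL, dataset name, and stand-in chain are all placeholders.
from langsmith import Client
from langchain_core.runnables import RunnableLambda
from langchain.smith import RunEvalConfig, run_on_dataset

client = Client()

# Clone a public LangSmith dataset into your workspace so it can be
# evaluated against (the share URL here is hypothetical).
client.clone_public_dataset(
    "https://smith.langchain.com/public/<share-token>/d",
    dataset_name="my-eval-dataset",
)

# Stand-in for the user's runnable; a real RAG pipeline would go here.
def chain_factory():
    return RunnableLambda(lambda inputs: "stub answer")

# Grade each prediction against the dataset's reference answer.
eval_config = RunEvalConfig(evaluators=["qa"])

run_on_dataset(
    client=client,
    dataset_name="my-eval-dataset",
    llm_or_chain_factory=chain_factory,
    evaluation=eval_config,
)
```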

Motivation

Make evaluation simple by abstracting it behind a template, so that only a runnable needs to be passed in.

(The template can be called as a RemoteRunnable; see the sketch below.)

Benchmark your runnable on an existing public LangSmith dataset.

The evaluation setup will differ depending on the use case / dataset of choice.
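On the RemoteRunnable point above: a served template could be invoked like any local runnable. A minimal sketch, assuming a template is being served with LangServe at a hypothetical local URL:

```python
# Sketch: assumes langserve is installed and a template is served at
# this (hypothetical) URL, e.g. via `langchain serve`.
from langserve import RemoteRunnable

rag = RemoteRunnable("http://localhost:8000/rag")

# Invoke it like any other runnable; the input schema is hypothetical.
print(rag.invoke({"question": "What is LangSmith?"}))
```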

Your contribution

Add a template.

dosubot[bot] commented 7 months ago

Hi, @rlancemartin,

I'm helping the LangChain team manage their backlog and am marking this issue as stale. The issue requests a feature to run evaluation against a specific public dataset using a runnable, such as a RAG pipeline, to simplify the evaluation process. There has been no activity on the issue, and no comments have been added yet.

Could you please confirm whether this issue is still relevant to the latest version of the LangChain repository? If it is, let the LangChain team know by commenting on the issue. Otherwise, feel free to close it yourself; it will be closed automatically in 7 days. Thank you!