Closed rlancemartin closed 7 months ago
Hi, @rlancemartin,
I'm helping the LangChain team manage their backlog and am marking this issue as stale. The issue requests a feature to run evaluation against a specific public dataset using a runnable, such as a RAG pipeline, to simplify the evaluation process. There has been no activity or comments on the issue yet.
Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, please let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days. Thank you!
Feature request
Given a runnable (e.g., a RAG pipeline), automatically run evaluation against a given (public) dataset.
Motivation
Make evaluation simple by abstracting it with a template and passing in only a runnable.
(The template can be called as a RemoteRunnable.)
Benchmark your runnable on an existing public LangSmith dataset.
The evaluators used will differ depending on the use case / dataset of choice.
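A minimal sketch of what this could look like. Assumptions are flagged in the comments: the dataset name `my-public-dataset`, the served template URL, and the toy `exact_match` evaluator are all hypothetical placeholders, and the commented-out call uses the LangSmith SDK's `evaluate` entry point; adapt to your own runnable and dataset.

```python
def exact_match(run, example):
    """Toy evaluator (an assumption, not part of the request):
    score 1 if the runnable's output exactly matches the reference."""
    predicted = (run.outputs or {}).get("output", "")
    expected = (example.outputs or {}).get("output", "")
    return {"key": "exact_match", "score": int(predicted == expected)}


# Sketch of wiring (requires `langsmith` installed and LANGCHAIN_API_KEY set):
#
# from langserve import RemoteRunnable
# from langsmith.evaluation import evaluate
#
# my_runnable = RemoteRunnable("http://localhost:8000/rag-template")  # hypothetical served template
# results = evaluate(
#     my_runnable.invoke,            # target callable: the runnable under test
#     data="my-public-dataset",      # hypothetical public LangSmith dataset name
#     evaluators=[exact_match],      # swap in evaluators suited to the use case
#     experiment_prefix="rag-template-eval",
# )
```

The key design point is that the user supplies only the runnable; the template fixes the dataset and evaluators, so benchmarking reduces to one call.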
Your contribution
Add template