Here are the options you can configure for TS evaluation:
```ts
export interface EvaluateOptions {
  /**
   * The dataset to evaluate on. Can be a dataset name, a list of
   * examples, or a generator of examples.
   */
  data: DataT;
  /**
   * A list of evaluators to run on each example.
   * @default undefined
   */
  evaluators?: Array<EvaluatorT>;
  /**
   * A list of summary evaluators to run on the entire dataset.
   * @default undefined
   */
  summaryEvaluators?: Array<SummaryEvaluatorT>;
  /**
   * Metadata to attach to the experiment.
   * @default undefined
   */
  metadata?: KVMap;
  /**
   * A prefix to provide for your experiment name.
   * @default undefined
   */
  experimentPrefix?: string;
  /**
   * A free-form description of the experiment.
   */
  description?: string;
  /**
   * The maximum number of concurrent evaluations to run.
   * @default undefined
   */
  maxConcurrency?: number;
  /**
   * The LangSmith client to use.
   * @default undefined
   */
  client?: Client;
  /**
   * The number of repetitions to perform. Each example
   * will be run this many times.
   * @default 1
   */
  numRepetitions?: number;
}
```
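
For reference, here is a minimal sketch of how these options are passed to `evaluate` (the target function, evaluator, and dataset name below are hypothetical, and the exact evaluator signature may vary by SDK version):

```ts
import { evaluate } from "langsmith/evaluation";

// Hypothetical target: whatever function or chain is being evaluated.
const target = async (input: Record<string, unknown>) => {
  return { answer: `echo: ${input.question}` };
};

await evaluate(target, {
  data: "my-dataset", // assumes a dataset with this name exists in LangSmith
  evaluators: [
    // Sketch of a simple evaluator: returns a key and a score per example.
    (run, example) => ({
      key: "exact_match",
      score: run.outputs?.answer === example?.outputs?.answer ? 1 : 0,
    }),
  ],
  experimentPrefix: "my-experiment",
  maxConcurrency: 10, // per this thread, omitting it means examples run sequentially
});
```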
Looks like the default is to not run them concurrently in TS
Thanks, I had assumed it was the same as `runOnDataset`, as `maxConcurrency` was default undefined for that as well.
Thanks for raising though. I'm not sure why the default behavior was chosen to be different...
I have a dataset with 71 examples on which I ran the same evaluations using `evaluate` vs `runOnDataset`. The time each took was:

- `runOnDataset`: 18 seconds
- `evaluate`: 3 minutes 40 seconds

From what I could tell from the documentation, `runOnDataset` is the older version and we should be using `evaluate`, as it allows for an experiment prefix and shows model, provider, revision ID, etc. Is this the reason why awaiting the response of `evaluate` takes 10x longer? Is there some option I'm missing to run more evals concurrently?
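
A sequential default would account for the gap: 71 examples at roughly 3 seconds each is about 3.5 minutes of wall time, which lines up with the 3:40 observed. Assuming `maxConcurrency` behaves as documented, passing it explicitly should close most of the difference (the figures below are back-of-the-envelope, not measured):

```ts
// Sketch: same hypothetical target/dataset as above, with explicit concurrency.
// At ~3 s per example, maxConcurrency: 10 gives roughly
// ceil(71 / 10) * 3 s ≈ 24 s instead of 71 * 3 s ≈ 3.5 min.
await evaluate(target, { data: "my-dataset", maxConcurrency: 10 });
```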