UKGovernmentBEIS / inspect_ai

Inspect: A framework for large language model evaluations
https://inspect.ai-safety-institute.org.uk/
MIT License

[Feature Request] Add standardised way to index samples within task #702

Open skinnerjc opened 5 days ago

skinnerjc commented 5 days ago

Often I want to run a single Sample from a Task. e.g.:

Right now a Task developer can enable indexing or filtering of particular samples, but if this isn't built in at development time, there is no way to select particular Samples without a later change to each Task.

Would it be possible to add something equivalent to Sample indexing? For example, evaluating task[0:10] would evaluate only the first 10 samples.

jjallaire commented 4 days ago

You can currently do this using --limit. For example:

inspect eval ctf.py --limit 5      # first 5 samples
inspect eval ctf.py --limit 6-10   # samples 6-10

Does this do what you are looking for? (The eval() function also has a limit argument.)
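
To relate the --limit values above to the task[0:10] style of indexing asked for in the issue, here is a minimal, standalone sketch in plain Python. It is not Inspect's implementation: limit_to_slice and select_samples are hypothetical helpers, and the sketch assumes a range such as "6-10" is 1-based and inclusive, as the "samples 6-10" comment suggests.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def limit_to_slice(limit: Optional[str]) -> slice:
    """Turn a limit string such as "5" or "6-10" into a Python slice.

    Assumption: ranges are 1-based and inclusive, so "6-10" selects
    five samples (the 6th through the 10th).
    """
    if limit is None:
        # No limit: select every sample.
        return slice(None)
    if "-" in limit:
        start, end = (int(part) for part in limit.split("-", 1))
        if start < 1 or end < start:
            raise ValueError(f"invalid limit range: {limit!r}")
        # Shift the 1-based inclusive start to a 0-based slice start.
        return slice(start - 1, end)
    count = int(limit)
    if count < 0:
        raise ValueError(f"invalid limit: {limit!r}")
    # A bare number means "the first N samples".
    return slice(0, count)


def select_samples(samples: Sequence[T], limit: Optional[str]) -> Sequence[T]:
    """Apply a limit string to a sequence of samples (hypothetical helper)."""
    return samples[limit_to_slice(limit)]
```

Under these assumptions, a limit of "10" or "1-10" corresponds to task[0:10], and "6-10" corresponds to task[5:10].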

jjallaire commented 1 day ago

@skinnerjc can I close this? (does --limit do what you want?)