UKGovernmentBEIS / inspect_ai

Inspect: A framework for large language model evaluations
https://inspect.ai-safety-institute.org.uk/
MIT License

inspect eval with task name - IndexError: list index out of range #36

Closed tekumara closed 3 months ago

tekumara commented 3 months ago
❯ inspect eval security_guide --model bedrock/meta.llama2-70b-chat-v1
Traceback (most recent call last):
  File "/Users/tekumara/code3/inspect_ai/.venv/bin/inspect", line 8, in <module>
    sys.exit(main())
  File "/Users/tekumara/code3/inspect_ai/src/inspect_ai/_cli/main.py", line 45, in main
    inspect(auto_envvar_prefix="INSPECT")
  File "/Users/tekumara/code3/inspect_ai/.venv/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/Users/tekumara/code3/inspect_ai/.venv/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/Users/tekumara/code3/inspect_ai/.venv/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/tekumara/code3/inspect_ai/.venv/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/tekumara/code3/inspect_ai/.venv/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/Users/tekumara/code3/inspect_ai/src/inspect_ai/_cli/common.py", line 46, in wrapper
    return cast(click.Context, func(*args, **kwargs))
  File "/Users/tekumara/code3/inspect_ai/src/inspect_ai/_cli/eval.py", line 238, in eval_command
    eval(
  File "/Users/tekumara/code3/inspect_ai/src/inspect_ai/_eval/eval.py", line 102, in eval
    return asyncio.run(
  File "/opt/homebrew/Cellar/python@3.10/3.10.14/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/opt/homebrew/Cellar/python@3.10/3.10.14/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/Users/tekumara/code3/inspect_ai/src/inspect_ai/_eval/eval.py", line 219, in eval_async
    sample_source = eval_log_sample_source(eval_log, eval_tasks[0].dataset)
IndexError: list index out of range

These docs suggest this should work.

Works fine with:

inspect eval examples/security_guide.py --model bedrock/meta.llama2-70b-chat-v1

jjallaire commented 3 months ago

Those docs don't assume that you are in the inspect_ai repo; they are just generically about selecting models with a given (fictional) eval target. It would probably be better if we used something fully fictional, though, rather than one of our own examples!

tekumara commented 3 months ago

Hi @jjallaire, I can reproduce this outside the inspect_ai repo when trying to run an eval using its task name rather than its Python file name.
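
For illustration only: the traceback shows `eval_tasks[0]` being indexed when the list is empty, which suggests that no task was resolved from the bare name. A minimal sketch of a guard that would turn the `IndexError` into a readable message might look like the following. The helper name `first_task_or_error` and its `task_spec` parameter are hypothetical and are not part of inspect_ai; this is not the project's actual fix.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def first_task_or_error(eval_tasks: Sequence[T], task_spec: str) -> T:
    """Return the first resolved task, or raise a descriptive error.

    Hypothetical helper: `eval_tasks` stands in for the list of tasks
    resolved from the command-line argument, and `task_spec` is the
    argument the user typed (a task name or a file path).
    """
    if not eval_tasks:
        # Fail with an actionable message instead of a bare IndexError.
        raise ValueError(
            f"No tasks found matching '{task_spec}'. "
            "Check the task name, or pass the path to the Python file."
        )
    return eval_tasks[0]
```

With a guard like this, a run by task name that resolves nothing would report which name failed to match rather than ending in `IndexError: list index out of range`.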