openai / evals

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

Setting completion function args via CLI does not work #1504

Open LoryPack opened 3 months ago

Describe the bug

The fix for issue #512 added a way to set API parameters (such as temperature) dynamically from the CLI; judging from the code, the argument is now named --completion_args. However, arguments passed this way are not forwarded correctly to the completion function.
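For context, here is a minimal sketch of how a `key=value` string such as the one given to --completion_args could be turned into keyword arguments. The helper name `parse_completion_args` and the literal conversion are assumptions made for illustration; this is not the parsing code in oaieval.py.

```python
import ast


def parse_completion_args(raw: str) -> dict:
    """Parse 'k1=v1,k2=v2' into a dict (hypothetical helper, not from evals)."""
    parsed = {}
    for pair in raw.split(","):
        if not pair.strip():
            continue  # skip empty segments, e.g. from an empty string
        key, _, value = pair.partition("=")
        try:
            # Convert numeric/boolean literals so 'temperature=0.5' yields a float.
            parsed[key.strip()] = ast.literal_eval(value.strip())
        except (ValueError, SyntaxError):
            # Anything that is not a Python literal stays a plain string.
            parsed[key.strip()] = value.strip()
    return parsed


print(parse_completion_args("temperature=0.5"))  # {'temperature': 0.5}
```

Without a conversion step like this, every value arrives as a string, which matters for numeric parameters such as temperature.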

To Reproduce

  1. Run any eval with a completion function backed by an OpenAI model, setting an API parameter such as the temperature. For instance: oaieval gpt-3.5-turbo-0125 <eval_name> --completion_args 'temperature=0.5'
  2. The following error is raised:
    Traceback (most recent call last):
      File "/home/lorenzo/venv/recog-LLM_capabilities/bin/oaieval", line 8, in <module>
        sys.exit(main())
      File "/home/lorenzo/venv/recog-LLM_capabilities/lib/python3.9/site-packages/evals/cli/oaieval.py", line 277, in main
        run(args)
      File "/home/lorenzo/venv/recog-LLM_capabilities/lib/python3.9/site-packages/evals/cli/oaieval.py", line 146, in run
        completion_fn_instances = [
      File "/home/lorenzo/venv/recog-LLM_capabilities/lib/python3.9/site-packages/evals/cli/oaieval.py", line 147, in <listcomp>
        registry.make_completion_fn(url, **additonal_completion_args) for url in completion_fns
      File "/home/lorenzo/venv/recog-LLM_capabilities/lib/python3.9/site-packages/evals/registry.py", line 124, in make_completion_fn
        return OpenAIChatCompletionFn(model=name, n_ctx=n_ctx, **kwargs)
    TypeError: __init__() got an unexpected keyword argument 'temperature'

    The cause is that make_completion_fn forwards the CLI arguments as keyword arguments to OpenAIChatCompletionFn, whose __init__ does not accept them:

    def __init__(
        self,
        model: Optional[str] = None,
        api_base: Optional[str] = None,
        api_key: Optional[str] = None,
        n_ctx: Optional[int] = None,
        extra_options: Optional[dict] = {},
    ):
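The failure can be reproduced without the library. The sketch below uses a stand-in class with the same `__init__` parameters as the signature quoted above (`ChatCompletionFnStandIn` is hypothetical, not the real OpenAIChatCompletionFn), and shows one possible direction for a fix: passing the CLI arguments through `extra_options` instead of unpacking them as keyword arguments. Whether `extra_options` is the intended destination for these parameters is an assumption, and the `n_ctx` value is illustrative only.

```python
from typing import Optional


class ChatCompletionFnStandIn:
    """Stand-in mirroring the quoted __init__ parameters; not the real class."""

    def __init__(
        self,
        model: Optional[str] = None,
        api_base: Optional[str] = None,
        api_key: Optional[str] = None,
        n_ctx: Optional[int] = None,
        extra_options: Optional[dict] = None,
    ):
        self.model = model
        self.n_ctx = n_ctx
        self.extra_options = extra_options or {}


def make_fn_failing(name: str, n_ctx: int, **kwargs):
    # Same call shape as in the traceback: CLI args are unpacked into __init__.
    return ChatCompletionFnStandIn(model=name, n_ctx=n_ctx, **kwargs)


def make_fn_forwarding(name: str, n_ctx: int, **kwargs):
    # Assumed fix: collect the CLI args into extra_options instead.
    return ChatCompletionFnStandIn(model=name, n_ctx=n_ctx, extra_options=kwargs)


try:
    make_fn_failing("gpt-3.5-turbo-0125", 4096, temperature=0.5)
except TypeError as exc:
    print(f"TypeError: {exc}")  # unexpected keyword argument 'temperature'

fn = make_fn_forwarding("gpt-3.5-turbo-0125", 4096, temperature=0.5)
print(fn.extra_options)  # {'temperature': 0.5}
```

An alternative with the same effect would be to have `__init__` accept `**kwargs` and merge them into `extra_options`; which is preferable depends on how the maintainers want completion functions to be constructed.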

Code snippets

No response

OS

Ubuntu 20.04

Python version

python 3.9

Library version

git+https://github.com/openai/evals.git@dd96814dd96bd64f3098afca8dc873aa8d8ce4c8