stanford-crfm / helm

Holistic Evaluation of Language Models (HELM) is a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in HEIM (https://arxiv.org/abs/2311.04287) and vision-language models in VHELM (https://arxiv.org/abs/2410.07112).
https://crfm.stanford.edu/helm
Apache License 2.0

Error when running OpenAI o1 series models #2995

Closed: bryanzhou008 closed this 1 month ago

bryanzhou008 commented 1 month ago

I'm getting the following error when running o1 series models. It seems that some of the utility calls cannot accept the new `max_completion_tokens` argument, though I could be mistaken; I'm not exactly sure why this is happening.


```
  File "/home/bryanzhou008/miniconda3/envs/helm/lib/python3.8/site-packages/retrying.py", line 251, in call
    attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
  File "/local1/bryanzhou008/eval/helm/src/helm/clients/auto_client.py", line 116, in make_request_with_retry
    return client.make_request(request)
  File "/local1/bryanzhou008/eval/helm/src/helm/clients/openai_client.py", line 320, in make_request
    return self._make_chat_request(request)
  File "/local1/bryanzhou008/eval/helm/src/helm/clients/openai_client.py", line 192, in _make_chat_request
    response, cached = self.cache.get(cache_key, wrap_request_time(do_it))
  File "/local1/bryanzhou008/eval/helm/src/helm/common/cache.py", line 216, in get
    response = compute()
  File "/local1/bryanzhou008/eval/helm/src/helm/common/request.py", line 249, in wrapped_compute
    response = compute()
  File "/local1/bryanzhou008/eval/helm/src/helm/clients/openai_client.py", line 188, in do_it
    return self.client.chat.completions.create(**raw_request).model_dump(mode="json")
  File "/home/bryanzhou008/miniconda3/envs/helm/lib/python3.8/site-packages/openai/_utils/_utils.py", line 277, in wrapper
    return func(*args, **kwargs)
TypeError: create() got an unexpected keyword argument 'max_completion_tokens'

        Request failed. Retrying (attempt #2) in 10 seconds... (See above for error details)
```
yifanmai commented 1 month ago

This was fixed by #2989. It has been merged into the main branch of HELM, but it is not yet in the latest PyPI package.

The new o1 models use `max_completion_tokens` instead of `max_tokens` in the request API, for reasons explained in this OpenAI doc.
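For illustration, the parameter swap can be sketched as below; `build_raw_request` is a hypothetical helper for this comment, not HELM's actual code:

```python
def build_raw_request(model: str, messages: list, max_tokens: int) -> dict:
    """Build a chat completion request dict, choosing the token-limit
    parameter name based on the model family (illustrative sketch)."""
    raw_request = {"model": model, "messages": messages}
    if model.startswith("o1"):
        # o1 models reject max_tokens; they expect max_completion_tokens,
        # which also accounts for hidden reasoning tokens.
        raw_request["max_completion_tokens"] = max_tokens
    else:
        raw_request["max_tokens"] = max_tokens
    return raw_request
```

The resulting dict would then be passed as `self.client.chat.completions.create(**raw_request)`, which is why an old openai library that does not recognize `max_completion_tokens` raises the `TypeError` above.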

bryanzhou008 commented 1 month ago

Yes, I actually ran into this error when using the latest GitHub version of HELM. After the openai_client adds the `max_completion_tokens` argument to raw_request and calls `self.client.chat.completions.create(**raw_request).model_dump(mode="json")`, the call fails with the error described above. Is there anything else I need to change in adapter_spec.py or elsewhere to run o1 series models?

yifanmai commented 1 month ago

Could you provide instructions for reproducing this bug, ideally either a run entry or a Python script?

bryanzhou008 commented 1 month ago

Sure, here is a simple way to reproduce this bug:


```
# Clone the repository
git clone --recursive https://github.com/stanford-crfm/helm.git

# Create and activate the virtual environment.
conda create -n crfm-helm python=3.9 pip
conda activate crfm-helm

# Install helm dependencies
./install-dev.sh

# add openai api key to prod_env/credentials.conf

# Create a run specs configuration
echo 'entries: [{description: "mmlu:subject=philosophy,model=openai/o1-preview-2024-09-12", priority: 1}]' > run_entries.conf

# Run benchmark
helm-run --conf-paths run_entries.conf --suite v1 --max-eval-instances 10
```
yifanmai commented 1 month ago

Could you update your openai library and try again:

```
pip install --upgrade 'openai>=1.45.0'
```

The o1 models require at least version 1.45.0 according to the changelog.
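As a quick sanity check before re-running, one could compare the installed version against that minimum. A rough sketch (the helper `version_at_least` is illustrative, and it assumes plain X.Y.Z version strings):

```python
from importlib.metadata import version  # stdlib in Python 3.8+

def version_at_least(installed: str, minimum: str = "1.45.0") -> bool:
    """Naive dotted-version comparison; assumes plain "X.Y.Z" strings
    with no pre-release suffixes. Use packaging.version for real code."""
    to_tuple = lambda v: tuple(int(p) for p in v.split("."))
    return to_tuple(installed) >= to_tuple(minimum)

# Example: version_at_least(version("openai")) reports whether the
# installed openai client meets the assumed o1 minimum of 1.45.0.
```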

bryanzhou008 commented 1 month ago

It worked after updating the library, thanks much!

yifanmai commented 1 month ago

Welcome, glad that worked!