stanford-crfm / helm

Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in Holistic Evaluation of Text-to-Image Models (HEIM) (https://arxiv.org/abs/2311.04287).
https://crfm.stanford.edu/helm
Apache License 2.0
1.8k stars 239 forks

Support text prompts for GPT-4o #2656

Closed: yifanmai closed 2 months ago

yifanmai commented 2 months ago

Currently, any OpenAIClient model tagged with VISION_LANGUAGE_MODEL_TAG cannot accept text-only prompts. OpenAIClient previously assumed that every model was either a text model or a VLM, but never both. This is no longer true: gpt-4-turbo-2024-04-09 and gpt-4o-2024-05-13 support both image and text inputs. As a result, sending a text-only request to one of these models fails an assertion while the cache key is being built.

Example failure:

  File "/.../helm/src/helm/clients/openai_client.py", line 299, in make_request
    return self._make_chat_request(request)
  File "/.../helm/src/helm/clients/openai_client.py", line 170, in _make_chat_request
    cache_key = self._get_cache_key(raw_request, request)
  File "/.../helm/src/helm/clients/openai_client.py", line 64, in _get_cache_key
    assert request.multimodal_prompt is not None
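
One possible way to avoid this failure is to branch on whether the request actually carries a multimodal prompt, rather than on the model's tag. The sketch below is illustrative only and is not HELM's actual code: `SketchRequest` and `build_cache_key` are hypothetical stand-ins, and the shape of `multimodal_prompt` (a list of dicts) is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class SketchRequest:
    """Hypothetical stand-in for a HELM request; not the real class."""

    model: str
    prompt: str = ""
    # Assumed shape: a list of media parts, or None for a text-only request.
    multimodal_prompt: Optional[List[Dict[str, str]]] = None


def build_cache_key(raw_request: Dict[str, Any], request: SketchRequest) -> Dict[str, Any]:
    """Build a cache key without assuming that a VLM always gets images."""
    # Copy so the caller's raw request is never mutated.
    cache_key = dict(raw_request)
    # Branch on the request contents, not on the model's tag, so a model
    # that supports both modalities can still receive a text-only prompt.
    if request.multimodal_prompt is not None:
        cache_key["multimodal_prompt"] = list(request.multimodal_prompt)
    return cache_key


# Text-only request to a model that also supports images: no assertion error.
text_request = SketchRequest(model="gpt-4o-2024-05-13", prompt="Hello")
text_key = build_cache_key({"model": text_request.model}, text_request)

# Multimodal request to the same model: the media parts join the cache key.
image_request = SketchRequest(
    model="gpt-4o-2024-05-13",
    multimodal_prompt=[{"content_type": "text", "text": "Describe this image"}],
)
image_key = build_cache_key({"model": image_request.model}, image_request)
```

With this branching, the text-only and multimodal requests to the same model both get a cache key, and only the multimodal one includes the extra `multimodal_prompt` entry.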