stanford-crfm / helm

Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in Holistic Evaluation of Text-to-Image Models (HEIM) (https://arxiv.org/abs/2311.04287).
https://crfm.stanford.edu/helm
Apache License 2.0

Add output_format_instructions run expander #2670

Closed yifanmai closed 1 month ago

yifanmai commented 2 months ago

The output_format_instructions run expander adds extra instructions about output formatting to HELM Lite scenarios.

Many instruction-following models and chat models are tuned to expect conversational prompts and respond in a conversational way. These models occasionally produce outputs that are not in the expected format. This run expander instructs these models to provide the output in the format expected by the scenario.
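To make the idea concrete, here is a minimal, self-contained sketch of what such an expander might look like. It is not the code from this PR: the `RunSpec` and `AdapterSpec` classes below are simplified stand-ins, and the `FORMAT_INSTRUCTIONS` table, the instruction wording, and the function name `expand_with_output_format_instructions` are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


# Simplified stand-ins for illustration only; not HELM's real classes.
@dataclass(frozen=True)
class AdapterSpec:
    instructions: str = ""


@dataclass(frozen=True)
class RunSpec:
    name: str
    adapter_spec: AdapterSpec


# Hypothetical mapping from an output format to the extra instruction text.
FORMAT_INSTRUCTIONS: Dict[str, str] = {
    "multiple_choice": "Answer with only a single letter.",
    "short_answer": "Answer with only the final answer, without explanation.",
}


def expand_with_output_format_instructions(
    run_spec: RunSpec, output_format: str
) -> List[RunSpec]:
    """Return a copy of run_spec whose instructions end with a format hint."""
    extra = FORMAT_INSTRUCTIONS[output_format]
    existing = run_spec.adapter_spec.instructions
    # Append to any existing instructions rather than overwriting them.
    combined = f"{existing} {extra}" if existing else extra
    new_adapter_spec = replace(run_spec.adapter_spec, instructions=combined)
    # A list is returned so the function composes like other expanders,
    # which may produce several run specs from one.
    return [replace(run_spec, adapter_spec=new_adapter_spec)]


if __name__ == "__main__":
    spec = RunSpec(
        name="example_scenario",
        adapter_spec=AdapterSpec(instructions="Answer the question."),
    )
    for expanded in expand_with_output_format_instructions(spec, "multiple_choice"):
        print(expanded.adapter_spec.instructions)
```

The sketch leaves the original run spec untouched and returns a modified copy, so the same scenario can be run with and without the extra instructions for comparison.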