stanford-crfm / helm

Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in Holistic Evaluation of Text-to-Image Models (HEIM) (https://arxiv.org/abs/2311.04287).
https://crfm.stanford.edu/helm
Apache License 2.0

HELM for instruction-following models #1504

Open yifanmai opened 1 year ago

yifanmai commented 1 year ago

HELM was built during the era of few-shot in-context learning. The field is moving towards instruction-tuned models that are intended to be used in a zero-shot manner. We should update HELM to support this.

yifanmai commented 1 year ago

Proposal Google doc