Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in HEIM (https://arxiv.org/abs/2311.04287) and vision-language models in VHELM (https://arxiv.org/abs/2410.07112).
We don't have "safety" as its own category in HELM right now, but some aspects of safety (fairness, bias, toxicity) are already included in HELM. Suggestions for new safety evals are welcome!
Evaluate safety issue of LLMs from various aspects
(To be continued, open by JingFeng, Hongye, ...)