stanford-crfm / helm

Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in HEIM (https://arxiv.org/abs/2311.04287) and vision-language models in VHELM (https://arxiv.org/abs/2410.07112).
https://crfm.stanford.edu/helm
Apache License 2.0

Safety Evaluation #1595

Open · Mooler0410 opened this issue 1 year ago

Mooler0410 commented 1 year ago

Evaluate safety issues of LLMs from various aspects.

(To be continued; opened by JingFeng, Hongye, ...)

yifanmai commented 1 year ago

We don't have "safety" as its own category in HELM right now, but some aspects of safety (fairness, bias, toxicity) are already covered by existing scenarios. Suggestions for new safety evals are welcome!
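
For reference, the toxicity side can already be exercised through existing scenarios. Below is a minimal sketch of running one, assuming the `real_toxicity_prompts` scenario name and the `helm-run` / `helm-summarize` entry points from the HELM quickstart; exact flags vary by version (older releases take `--run-specs` instead of `--run-entries`), and the model and suite names here are placeholders:

```bash
# Run the RealToxicityPrompts scenario for one model on a small sample of instances.
helm-run \
  --run-entries real_toxicity_prompts:model=openai/gpt2 \
  --suite toxicity-check \
  --max-eval-instances 10

# Aggregate the raw run outputs into the summary tables used by the HELM frontend.
helm-summarize --suite toxicity-check
```

In recent versions the outputs land under `benchmark_output/runs/toxicity-check/` and can be browsed locally with `helm-server`.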