stanford-crfm / helm

Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in HEIM (https://arxiv.org/abs/2311.04287) and vision-language models in VHELM (https://arxiv.org/abs/2410.07112).
https://crfm.stanford.edu/helm
Apache License 2.0

Faithfulness metrics - GPUs #687

Closed: teetone closed this issue 2 years ago

teetone commented 2 years ago

List the RunSpecs that need GPUs in src/benchmark/presentation/run_specs_gpu.conf. I will try to run those separately.

rishibommasani commented 2 years ago

@fladhak I also noted this in #747, but remember to add a conf like the one Tony mentions, containing just the summarization scenarios with device=gpu set as the argument. That way we can keep the existing run_specs.conf unchanged.
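
For illustration only, here is a rough Python sketch of how the GPU-only list could be derived from existing entries. It assumes each run spec is described by a string of the form `name:key=value,key=value`. That format, the helper names `with_device` and `gpu_entries`, and the example entries (including `example_model`) are all hypothetical and not taken from the actual conf files.

```python
def with_device(description: str, device: str = "gpu") -> str:
    """Return the description with its device argument set to `device`.

    Assumes the hypothetical format "name:key=value,key=value".
    """
    name, _, arg_str = description.partition(":")
    # Parse the comma-separated key=value pairs, preserving their order.
    args = dict(pair.split("=", 1) for pair in arg_str.split(",") if pair)
    # Overwrite an existing device argument, or append one if it is missing.
    args["device"] = device
    return name + ":" + ",".join(f"{key}={value}" for key, value in args.items())


def gpu_entries(descriptions, prefix="summarization"):
    """Keep only entries whose name starts with `prefix`, with device=gpu."""
    return [with_device(d) for d in descriptions if d.startswith(prefix)]


# Hypothetical entries, standing in for the contents of run_specs.conf.
example = [
    "summarization_xsum:model=example_model,device=cpu",
    "mmlu:model=example_model,subject=anatomy",
]
gpu_only = gpu_entries(example)
```

The input list is left untouched, which mirrors the idea of keeping the existing run_specs.conf as it is and writing the GPU variants to a separate file.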