Holistic Evaluation of Language Models (HELM) is a framework for increasing the transparency of language models (https://arxiv.org/abs/2211.09110). The same framework is also used to evaluate text-to-image models in Holistic Evaluation of Text-to-Image Models (HEIM) (https://arxiv.org/abs/2311.04287).
Traceback (most recent call last):
  File "/u/scr/maiyifan/miniconda3/envs/helm-2023-04-29/bin/helm-summarize", line 8, in <module>
    sys.exit(main())
  File "/u/scr/maiyifan/miniconda3/envs/helm-2023-04-29/lib/python3.8/site-packages/helm/common/hierarchical_logger.py", line 104, in wrapper
    return fn(*args, **kwargs)
  File "/u/scr/maiyifan/miniconda3/envs/helm-2023-04-29/lib/python3.8/site-packages/helm/benchmark/presentation/summarize.py", line 997, in main
    summarizer.write_groups()
  File "/u/scr/maiyifan/miniconda3/envs/helm-2023-04-29/lib/python3.8/site-packages/helm/benchmark/presentation/summarize.py", line 919, in write_groups
    tables: List[Table] = self.create_group_tables_by_metric_group(group)
  File "/u/scr/maiyifan/miniconda3/envs/helm-2023-04-29/lib/python3.8/site-packages/helm/benchmark/presentation/summarize.py", line 812, in create_group_tables_by_metric_group
    table = self.create_group_table(
  File "/u/scr/maiyifan/miniconda3/envs/helm-2023-04-29/lib/python3.8/site-packages/helm/benchmark/presentation/summarize.py", line 719, in create_group_table
    self.create_cell(
  File "/u/scr/maiyifan/miniconda3/envs/helm-2023-04-29/lib/python3.8/site-packages/helm/benchmark/presentation/summarize.py", line 540, in create_cell
    description = aggregate_stat.bare_str()
  File "/u/scr/maiyifan/miniconda3/envs/helm-2023-04-29/lib/python3.8/site-packages/helm/benchmark/metrics/statistic.py", line 59, in bare_str
    f"min={process(self.min)}, "
  File "/u/scr/maiyifan/miniconda3/envs/helm-2023-04-29/lib/python3.8/site-packages/helm/benchmark/metrics/statistic.py", line 53, in process
    if int(x) == x:
ValueError: cannot convert float NaN to integer
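
The last frame of the traceback fails on `int(x) == x` because `int()` cannot convert a NaN float. The snippet below is a minimal sketch of one way to guard against this, not HELM's actual code: `format_stat_value` is a hypothetical stand-in for the `process` helper, and the `None` handling and three-digit rounding are assumptions made for illustration only.

```python
import math
from typing import Optional


def format_stat_value(x: Optional[float]) -> str:
    """Hypothetical stand-in for the `process` helper seen in the traceback."""
    if x is None:
        return "None"
    # int(nan) raises ValueError and int(inf) raises OverflowError, so
    # non-finite values are handled before the integer comparison.
    if not math.isfinite(x):
        return str(x)
    # Whole numbers are printed without a trailing ".0".
    if int(x) == x:
        return str(int(x))
    # Assumed formatting: round everything else to three decimal places.
    return str(round(x, 3))


if __name__ == "__main__":
    # Reproduce the error from the traceback in isolation.
    try:
        int(float("nan"))
    except ValueError as err:
        print(err)

    # The guarded version returns a string instead of raising.
    print(format_stat_value(float("nan")))
    print(format_stat_value(3.0))
```

A guard like this only keeps the summary step from crashing. It does not explain why a statistic's `min` is NaN in the first place, which would still need to be traced back to the metric that produced it.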