Holistic Evaluation of Language Models (HELM) is a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). The framework is also used to evaluate text-to-image models in Holistic Evaluation of Text-to-Image Models (HEIM) (https://arxiv.org/abs/2311.04287).
[ ] RunSpec.annotators should be named RunSpec.annotator_specs
[ ] Annotators should support concurrency. AnnotatorFactory.get_annotator() should not enforce singletons; the image2structure annotators should use locks instead. This is a correctness issue: currently, if multiple annotators with different arguments are instantiated, a run could get an annotator with the wrong arguments.
[ ] Annotator cache directory should be based on the name field, rather than the class name. For instance, the folder for LatexCompilerAnnotator should be latex_compiler, not latexcompiler.
[ ] Annotator frontend display should support most "simple" JSON objects, but it currently supports only lists of dicts (and single dicts after #2700).
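
The singleton problem in the concurrency item can be reproduced in a few lines. The sketch below is illustrative only and does not use HELM's actual classes: `EchoAnnotator`, `SingletonFactory`, and `PerRequestFactory` are hypothetical names, and caching by class name is an assumption about how the wrong-arguments bug could arise. It shows a cache keyed on the class name handing back an instance built with stale arguments, and an alternative that builds a fresh instance per request while a class-level lock serializes access to the shared resource.

```python
import threading


class EchoAnnotator:
    """Hypothetical annotator; stands in for e.g. an image2structure annotator."""

    # Class-level lock: shared by every instance, so a non-thread-safe
    # resource is protected even when many instances exist.
    _resource_lock = threading.Lock()

    def __init__(self, **kwargs):
        self.kwargs = kwargs

    def annotate(self, text):
        with self._resource_lock:
            return {"text": text, **self.kwargs}


class SingletonFactory:
    """Assumed shape of the bug: one cached instance per class name."""

    def __init__(self):
        self._cache = {}

    def get_annotator(self, class_name, **kwargs):
        # The arguments are ignored once an instance has been cached.
        if class_name not in self._cache:
            self._cache[class_name] = EchoAnnotator(**kwargs)
        return self._cache[class_name]


class PerRequestFactory:
    """Possible fix: no singleton, each caller gets its own arguments."""

    def get_annotator(self, class_name, **kwargs):
        return EchoAnnotator(**kwargs)


buggy = SingletonFactory()
first = buggy.get_annotator("EchoAnnotator", timeout=1)
second = buggy.get_annotator("EchoAnnotator", timeout=5)  # gets timeout=1

fixed = PerRequestFactory()
third = fixed.get_annotator("EchoAnnotator", timeout=1)
fourth = fixed.get_annotator("EchoAnnotator", timeout=5)  # gets timeout=5
```

With the singleton cache, `second` is the same object as `first` and silently carries `timeout=1`. Without it, thread safety has to come from the annotator itself, which is why the lock lives on the class rather than in the factory.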
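
The cache directory item can be sketched the same way. This is a minimal illustration, not HELM's code: the two helper functions and the `cache` base path are hypothetical, and deriving the current folder by lowercasing the class name and dropping the `Annotator` suffix is an assumption made only to reproduce the `latexcompiler` result described above.

```python
import os


class LatexCompilerAnnotator:
    """Hypothetical stand-in exposing the `name` field from the checklist."""

    name = "latex_compiler"


def cache_dir_from_class_name(annotator_cls, base):
    # Assumed current behaviour: folder derived from the class name.
    folder = annotator_cls.__name__.lower().replace("annotator", "")
    return os.path.join(base, folder)


def cache_dir_from_name(annotator_cls, base):
    # Proposed behaviour: folder taken directly from the `name` field.
    return os.path.join(base, annotator_cls.name)


old_dir = cache_dir_from_class_name(LatexCompilerAnnotator, "cache")
new_dir = cache_dir_from_name(LatexCompilerAnnotator, "cache")
```

Here `old_dir` ends in `latexcompiler` and `new_dir` ends in `latex_compiler`, so the folder matches the name used elsewhere to refer to the annotator.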