allenai / fm-cheatsheet

Website for hosting the Open Foundation Models Cheat Sheet.
https://fmcheatsheet.org

Intro Text for Eval Capabilities Page #28

Closed by danmcduff 3 days ago

danmcduff commented 1 month ago

Replace:

Many modern foundation models are released with general conversational abilities, such that their use cases are poorly specified and open-ended. This poses significant challenges to evaluation benchmarks which are unable to critically evaluate so many tasks, applications, and risks systematically or fairly. As a result, it is important to carefully scope the original intentions for the model, and the evaluations to those intentions.

With:

Many modern foundation models are released with general abilities, such that their use cases are poorly specified and open-ended, posing significant challenges to evaluation benchmarks which are unable to critically evaluate so many tasks, applications, and risks systematically or fairly. It is important to carefully scope the original intentions for the model, and the evaluations to those intentions.

neural-loop commented 1 month ago

https://onm-demo.aimodels.org/foundation-model-resources/model-evaluation-capabilities/