mlcommons / modelbench

Run safety benchmarks against AI models and view detailed reports showing how well they performed.
https://mlcommons.org/ai-safety/
Apache License 2.0

Add switch for practice vs holdback prompts #620

Open wpietri opened 1 month ago

wpietri commented 3 weeks ago

Make sure not to change UIDs in ways that break modellab. Alternatively, change modellab to match; either works.