openai / evals

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

Eval making guide? #652

Open Luka5-A8ram opened 1 year ago

Luka5-A8ram commented 1 year ago

I am not proficient in coding or using GitHub, but I would love to help make evals. Sadly, there is no coherent guide on how to do so.

Can someone please make a guide on how to make and post an eval?

lucianosb commented 1 year ago

I understand the feeling. I was a bit confused by the instructions at first, but I ended up looking into the available examples and following the logic described in the notebooks.

Hercules942 commented 1 year ago

Yeah, I've found it quite challenging to navigate myself. I believe OpenAI would benefit from a clearer, more beginner-friendly introduction, which would allow a large influx of useful evals that are currently bottlenecked by the programming and GitHub knowledge required.