Add GPT-4o-mini benchmark

svilupp / Julia-LLM-Leaderboard

Provides a platform for the Julia community to compare AI models' abilities in generating syntactically correct Julia code, featuring structured tests and automated evaluations for easy and collaborative benchmarking.

http://svilupp.github.io/Julia-LLM-Leaderboard/dev

MIT License

65 stars, 5 forks

Add GPT-4o-mini benchmark #26

Status: Closed (svilupp closed this 3 months ago)