the-crypt-keeper / can-ai-code

Self-evaluating interview for AI coders
https://huggingface.co/spaces/mike-ravkine/can-ai-code-results
MIT License

'senior' coder test suite #141

Closed: the-crypt-keeper closed this issue 8 months ago

the-crypt-keeper commented 8 months ago

The junior-v2 interview is showing its age. I created it back when LLaMA was all we had, and at the time every single open source model failed the test.

The clustering we now see at the top of the leaderboard is a result of the massive improvements in open source coding models over the past 6 months. Anything above .95 is a binary pass, so junior-v2 can no longer tell the models in that range apart.
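To make the saturation point concrete, here is a minimal sketch. The model names and scores are made up for illustration and are not real leaderboard data; only the .95 cutoff comes from the comment above. It shows how several models can all clear the cutoff while the gap between them stays too small to rank them.

```python
# Hypothetical junior-v2 pass rates; NOT real leaderboard numbers.
scores = {
    "model-a": 0.99,
    "model-b": 0.98,
    "model-c": 0.97,
    "model-d": 0.96,
    "model-e": 0.72,
}

PASS_THRESHOLD = 0.95  # cutoff taken from the comment above


def saturated(scores, threshold=PASS_THRESHOLD):
    """Return the names of models scoring strictly above the threshold, sorted."""
    return sorted(name for name, s in scores.items() if s > threshold)


def top_spread(scores, threshold=PASS_THRESHOLD):
    """Return the gap between the best and worst score above the threshold."""
    top = [s for s in scores.values() if s > threshold]
    return max(top) - min(top) if top else 0.0


print(saturated(scores))             # models the suite treats as passing
print(round(top_spread(scores), 2))  # how little room is left to rank them
```

With these made-up numbers, four of the five models land above the cutoff and sit within a few hundredths of each other, which is the kind of clustering a harder suite is meant to break up.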

A more difficult test suite is needed.

the-crypt-keeper commented 8 months ago

An MVP of the senior interview suite is now available. GPT-4 can just barely pass it.

If you have any good ideas for interview questions, please open PRs!