the-crypt-keeper / can-ai-code

Self-evaluating interview for AI coders
https://huggingface.co/spaces/mike-ravkine/can-ai-code-results
MIT License

Different prompt for Code Millenials #149

Closed krzysiekpodk closed 8 months ago

krzysiekpodk commented 9 months ago

See this prompt for reference: https://github.com/BudEcosystem/code-millenials/blob/main/utils/prompt.py

I also double-checked by asking in their HF repo.

However, this model always produces a leading space in all my tests, so I'm not sure whether it will work better with \n at the end of the prompt or not (their eval prompt has \n).
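To make the comparison concrete, here is a rough sketch of how the two prompt variants could be tested side by side. The `TEMPLATE` string and both helper names are hypothetical placeholders, not the actual Code Millenials prompt (see the linked `prompt.py` for that); the only point is the trailing `\n` and the leading-space cleanup.

```python
# Hypothetical template: a stand-in, NOT the real Code Millenials prompt.
TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:"


def build_prompt_variants(instruction: str) -> dict[str, str]:
    """Return the same prompt with and without a trailing newline."""
    base = TEMPLATE.format(instruction=instruction)
    return {"no_newline": base, "trailing_newline": base + "\n"}


def normalize_completion(text: str) -> str:
    """Drop one leading space, if present, so both variants compare fairly."""
    return text[1:] if text.startswith(" ") else text
```

Running the same interview with each variant and passing the outputs through `normalize_completion` should show whether the trailing `\n` matters, independent of the leading space.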

PS. They report a very high HumanEval score, but it didn't perform better than Wizard 33B 1.1, Deepseek coder 33B, or Phind in my tests.

the-crypt-keeper commented 9 months ago

Thanks for the issues! I'm doing some travelling this month, will give these a go in early February when I'm back.

the-crypt-keeper commented 8 months ago

@krzysiekpodk This prompt didn't seem to have much effect on the 34b, but it really helped the 13b, to the point where it posted a higher score on senior than its bigger brother. I also added the 1b and 3b variants that have popped up since the last eval.

krzysiekpodk commented 8 months ago

This seems to be consistent with my tests: either something is wrong with this model or we don't know how to use it :(