Closed Ein-Tim closed 6 days ago
Converted to a discussion: https://github.com/openai/evals/discussions/391
As written by @andrew-openai in https://github.com/openai/evals/issues/632, I'm reopening this issue and have changed the title accordingly. Please also apply the https://github.com/openai/evals/labels/Idea%20for%20Eval label to it.
Eval description
An eval checking whether GPT-3.5 & GPT-4 can accurately match given lyrics to the song name.
Problem and motivation
After some testing with GPT-3.5 & GPT-4 (via ChatGPT+) I was honestly disappointed at how bad the LLM performed on these tasks, especially because a simple Google search of the lyrics nearly always brought up the correct song.
Examples
Example one: ❌❌
Example two: ❌❌
Example three: ❌✅
Example eval prompt
Is this something you're interested in working on
I'd really like to provide this eval however I currently neither have the time to do so nor sufficient technical skills. Thus, I'm sharing my idea here and hope that someone will provide a PR based on this idea!