irthomasthomas / undecidability

Useful benchmarks that have human scores beyond AI SOTA. - Google Docs #954

Open ShellLM opened 2 hours ago

ShellLM commented 2 hours ago

Useful benchmarks that have human scores beyond AI SOTA

Snippet

Useful benchmarks that have human scores beyond AI SOTA.

Full Content

On a number of important real-world benchmarks, human performance still surpasses the current state of the art (SOTA) in AI.

These benchmarks suggest that significant gaps remain between current AI capabilities and human-level performance on many real-world tasks. Closing these gaps is an important area for future research.

Suggested labels

None

ShellLM commented 2 hours ago

Related content

#812 similarity score: 0.89

#940 similarity score: 0.87

#953 similarity score: 0.87

#810 similarity score: 0.86

#951 similarity score: 0.86

#706 similarity score: 0.86