Open pranavguru opened 3 months ago
Identify the tasks on which models should be evaluated for each of the datasets. For example, Fineweb-edu is a large text corpus: which text-based task should a model be evaluated on after being fine-tuned on Fineweb-edu?
Done (https://docs.google.com/document/d/1oJkmtev7s6K-PqV5Ium9XVGC2wPcVxBUhvGj41BWlW8/edit); pending feedback from benchmark experts and authors.