Motivation: This dataset spans a wide range of difficulty levels, problem types, and domains. It is a useful evaluation resource because we do not yet have a clear understanding of the abilities and skills of extremely large LMs.
Adding a Dataset
Note: this is a growing dataset (contributions are welcome), so it will need careful versioning.
Instructions to add a new dataset can be found here.