fchollet / ARC-AGI

The Abstraction and Reasoning Corpus
Apache License 2.0
3.6k stars 595 forks source link
artificial-intelligence intelligence-testing program-synthesis psychometrics

Abstraction and Reasoning Corpus for Artificial General Intelligence (ARC-AGI)

This repository contains the ARC-AGI task data, as well as a browser-based interface for humans to try their hand at solving the tasks manually.

"ARC can be seen as a general artificial intelligence benchmark, as a program synthesis benchmark, or as a psychometric intelligence test. It is targeted at both humans and artificially intelligent systems that aim at emulating a human-like form of general fluid intelligence."

A complete description of the dataset, its goals, and its underlying logic, can be found in: On the Measure of Intelligence.

As a reminder, a test-taker is said to solve a task when, upon seeing the task for the first time, they are able to produce the correct output grid for all test inputs in the task (this includes picking the dimensions of the output grid). For each test input, the test-taker is allowed 3 trials (this holds for all test-takers, either humans or AI).

Task file format

The data directory contains two subdirectories:

The tasks are stored in JSON format. Each task JSON file contains a dictionary with two fields:

A "pair" is a dictionary with two fields:

A "grid" is a rectangular matrix (list of lists) of integers between 0 and 9 (inclusive). The smallest possible grid size is 1x1 and the largest is 30x30.

When looking at a task, a test-taker has access to inputs & outputs of the demonstration pairs, plus the input(s) of the test pair(s). The goal is to construct the output grid(s) corresponding to the test input grid(s), using 3 trials for each test input. "Constructing the output grid" involves picking the height and width of the output grid, then filling each cell in the grid with a symbol (integer between 0 and 9, which are visualized as colors). Only exact solutions (all cells match the expected answer) can be said to be correct.

Usage of the testing interface

The testing interface is located at apps/testing_interface.html. Open it in a web browser (Chrome recommended). It will prompt you to select a task JSON file.

After loading a task, you will enter the test space, which looks like this:

test space

On the left, you will see the input/output pairs demonstrating the nature of the task. In the middle, you will see the current test input grid. On the right, you will see the controls you can use to construct the corresponding output grid.

You have access to the following tools:

Grid controls

Symbol controls

Answer validation

When your output grid is ready, click the green "Submit!" button to check your answer. We do not enforce the 3-trials rule.

After you've obtained the correct answer for the current test input grid, you can switch to the next test input grid for the task using the "Next test input" button (if there is any available; most tasks only have one test input).

When you're done with a task, use the "load task" button to open a new task.