First, thanks for this awesome work and thanks for sharing!

I did a very simple test: applying each verifier to its task and then checking that the returned output grid equals the ground truth. I found a few errors for the following tasks (task ID -> index of the failing train example):

6cf79266 -> 2
7e0986d6 -> 0
97a05b5b -> 0
a64e4611 -> 2
a8d7556c -> 2
e5062a87 -> 0 and 1

Code:
```python
import json

import verifiers  # verifiers.py from this repo
# np_to_tuple and plot_task are small local helpers:
# grid conversion (sketched below) and task visualization.


def check_verifiers():
    with open("./data/arc-agi_training_challenges.json") as f:
        data = json.load(f)
    for key in data.keys():
        # each task has a corresponding verify_<key> function
        in_fn = getattr(verifiers, f"verify_{key}")
        cnt = 0
        for example in data[key]['train']:
            ig = np_to_tuple(example['input'])
            og = np_to_tuple(example['output'])
            if in_fn(ig) != og:
                print(key)
                print(cnt)
                plot_task(data[key]['train'], f'task_{key}', True)
                plot_task([{'input': ig, 'output': in_fn(ig)}], f'error_{key}_{cnt}', True)
            cnt += 1


check_verifiers()
```
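For reference, a minimal sketch of the `np_to_tuple` helper assumed above: the verifiers appear to operate on grids represented as tuples of tuples, while the JSON file stores them as lists of lists. `plot_task` is just a plotting helper and is omitted here.

```python
# Hypothetical helper: convert a JSON grid (list of lists of ints)
# into the tuple-of-tuples representation the verifiers expect.
def np_to_tuple(grid):
    return tuple(tuple(row) for row in grid)
```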