mxbi / arckit

Tools for working with the Abstraction & Reasoning Corpus
Apache License 2.0
126 stars 17 forks

Data mismatch between AGI-ARC data on Kaggle and data provided in `arckit` #3

Closed jvallikivi closed 1 month ago

jvallikivi commented 2 months ago

While manually comparing the two data sources, I found a mismatch in one datapoint; there are potentially more.

Please see the output of the second training example of task 10fcaaa3. The two versions differ only in the last value of the fifth row (0 vs. 8).

Value in arc1.json:
[[0, 0, 6, 0, 0, 0, 6, 0], [8, 8, 8, 8, 8, 8, 8, 8], [0, 6, 0, 8, 0, 6, 0, 8], [8, 0, 6, 0, 8, 0, 6, 0], [8, 8, 8, 8, 8, 8, 8, 0], [0, 6, 0, 0, 0, 6, 0, 0]]

Value in arc-agi_training_challenges.json (from Kaggle):
[[0, 0, 6, 0, 0, 0, 6, 0], [8, 8, 8, 8, 8, 8, 8, 8], [0, 6, 0, 8, 0, 6, 0, 8], [8, 0, 6, 0, 8, 0, 6, 0], [8, 8, 8, 8, 8, 8, 8, 8], [0, 6, 0, 0, 0, 6, 0, 0]]
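To pin down exactly where two grids disagree, a cell-by-cell comparison is enough. The following is a minimal sketch: `diff_cells` is a hypothetical helper (not part of `arckit`), and the two grids are copied from the values quoted in this comment.

```python
from typing import List, Tuple

Grid = List[List[int]]


def diff_cells(a: Grid, b: Grid) -> List[Tuple[int, int, int, int]]:
    """Return (row, col, value_in_a, value_in_b) for every cell that differs."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("grids have different shapes")
    return [
        (r, c, va, vb)
        for r, (ra, rb) in enumerate(zip(a, b))
        for c, (va, vb) in enumerate(zip(ra, rb))
        if va != vb
    ]


# Output grid of the second training example of 10fcaaa3, as quoted above.
arc1_output = [
    [0, 0, 6, 0, 0, 0, 6, 0],
    [8, 8, 8, 8, 8, 8, 8, 8],
    [0, 6, 0, 8, 0, 6, 0, 8],
    [8, 0, 6, 0, 8, 0, 6, 0],
    [8, 8, 8, 8, 8, 8, 8, 0],
    [0, 6, 0, 0, 0, 6, 0, 0],
]
kaggle_output = [
    [0, 0, 6, 0, 0, 0, 6, 0],
    [8, 8, 8, 8, 8, 8, 8, 8],
    [0, 6, 0, 8, 0, 6, 0, 8],
    [8, 0, 6, 0, 8, 0, 6, 0],
    [8, 8, 8, 8, 8, 8, 8, 8],
    [0, 6, 0, 0, 0, 6, 0, 0],
]

# Indices are zero-based: row 4, column 7 holds 0 in arc1.json and 8 on Kaggle.
print(diff_cells(arc1_output, kaggle_output))  # [(4, 7, 0, 8)]
```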

To avoid issues downstream, it might be good to update arc1.json with the data from Kaggle, to add a function that easily imports data downloaded from Kaggle, or both.
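An import function along these lines could look roughly like the sketch below. It is not an existing `arckit` API: `load_kaggle_tasks` is a hypothetical name, and the code assumes the Kaggle challenges file maps each task ID to `{"train": [...], "test": [...]}` and that an optional solutions file maps each task ID to a list of output grids, one per test input.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional, Union

PathLike = Union[str, Path]


def load_kaggle_tasks(
    challenges_path: PathLike,
    solutions_path: Optional[PathLike] = None,
) -> Dict[str, Dict[str, Any]]:
    """Load a Kaggle-style challenges file, optionally merging test solutions.

    Assumed layout (not confirmed in this thread):
      challenges: {task_id: {"train": [{"input", "output"}], "test": [{"input"}]}}
      solutions:  {task_id: [output_grid, ...]}  # one grid per test input, in order
    """
    tasks = json.loads(Path(challenges_path).read_text())
    if solutions_path is not None:
        solutions = json.loads(Path(solutions_path).read_text())
        for task_id, outputs in solutions.items():
            test_pairs = tasks.get(task_id, {}).get("test", [])
            # Attach each solution grid to the matching test input.
            for pair, output in zip(test_pairs, outputs):
                pair["output"] = output
    return tasks
```

Usage would be something like `load_kaggle_tasks("arc-agi_training_challenges.json")`, with the solutions path passed as a second argument if that file is available.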

astariul commented 2 months ago

Full list of mismatched data:

025d127b (training)
10fcaaa3 (training)
11852cab (training)
150deff5 (training)
1b60fb0c (training)
42a50994 (training)
469497ad (training)
6d0160f0 (training)
6d58a25d (training)
82819916 (training)
868de0fa (training)
9aec4887 (training)
a9f96cdd (training)
cbded52d (training)
d511f180 (training)
d687bc17 (training)
dc433765 (training)
e48d4e1a (training)
ef135b50 (training)
12422b43 (evaluation)
17cae0c1 (evaluation)
310f3251 (evaluation)
423a55dc (evaluation)
48131b3c (evaluation)
54db823b (evaluation)
58e15b12 (evaluation)
7039b2d7 (evaluation)
8fbca751 (evaluation)
a8610ef7 (evaluation)
ad7e01d0 (evaluation)
b0f4d537 (evaluation)
b4a43f3b (evaluation)
bd14c3bf (evaluation)
c92b942c (evaluation)
e7b06bea (evaluation)
f8be4b64 (evaluation)

Note that I also checked the Kaggle data against the official ARC-AGI website, and those two match. So it's just this repository that is outdated relative to the latest updates of the official data repository. @mxbi
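A list like this can be produced by comparing the two sources task by task. The sketch below is one possible way to do it, not necessarily how the list above was generated: `mismatched_task_ids` is a hypothetical helper, it assumes both sources have already been parsed into dicts keyed by task ID with the same inner layout, and the task IDs in the toy data are made up.

```python
from typing import Any, Dict, List

Tasks = Dict[str, Dict[str, Any]]


def mismatched_task_ids(local: Tasks, reference: Tasks) -> List[str]:
    """Return the sorted IDs of tasks present in both mappings whose contents differ."""
    shared = local.keys() & reference.keys()
    return sorted(task_id for task_id in shared if local[task_id] != reference[task_id])


# Toy data with made-up IDs; real inputs would be the parsed JSON files.
local_tasks = {
    "task_a": {"train": [{"input": [[0]], "output": [[8, 0]]}]},
    "task_b": {"train": [{"input": [[1]], "output": [[1]]}]},
}
reference_tasks = {
    "task_a": {"train": [{"input": [[0]], "output": [[8, 8]]}]},
    "task_b": {"train": [{"input": [[1]], "output": [[1]]}]},
}

print(mismatched_task_ids(local_tasks, reference_tasks))  # ['task_a']
```

Tasks that exist in only one source are ignored here; reporting those separately would be a natural extension.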

mxbi commented 2 months ago

Hey both, thanks a lot for pointing this out. The data was updated for arc-agi. I will publish a fix in the next few days.

mxbi commented 1 month ago

Hey, this has now been fixed in v0.1.0, which you can obtain with `pip install -U arckit`. Thanks!
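After upgrading, one way to confirm that the installed release is at least v0.1.0 is to read the package metadata. This is a rough sketch: `parse_release` and `has_data_fix` are hypothetical helpers, and the version parsing is deliberately naive (it looks only at the leading numeric parts and ignores pre-release suffixes).

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Tuple


def parse_release(v: str) -> Tuple[int, ...]:
    """Parse the leading numeric parts of a version string, padded to three fields."""
    parts = []
    for piece in v.lstrip("vV").split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def has_data_fix(installed: str, fixed_in: str = "0.1.0") -> bool:
    """True if `installed` is at or above the release named in this thread."""
    return parse_release(installed) >= parse_release(fixed_in)


try:
    print(has_data_fix(version("arckit")))
except PackageNotFoundError:
    print("arckit is not installed")
```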