fchollet / ARC-AGI

The Abstraction and Reasoning Corpus
Apache License 2.0

In 3a301edc, the first example's solution does not match the pattern observed in the other examples #124

Open petkish opened 2 months ago

petkish commented 2 months ago

In the first example (gray core 3x4, yellow border), the added gray border is 2 pixels thick. This does not match the relation between core size and added border thickness in the other examples, where the two are equal. Possible corrections:

  1. Either correct the core to be 2x2, 3x3, or 4x4, with the added border thickness set accordingly to 2, 3, or 4.
  2. Or leave the core at 3x4 and give the added border a varying thickness: 3 in the vertical parts and 4 in the horizontal parts.
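
As a rough illustration of the mismatch and the two corrections above, here is a minimal Python sketch. The function names `equality_rule` and `per_axis_rule` are hypothetical, and the numbers are only those quoted in this issue (3x4 core, observed border of 2), not values read from the task file.

```python
from typing import Optional, Tuple


def equality_rule(core_a: int, core_b: int) -> Optional[int]:
    """Border thickness under the 'thickness equals core size' reading.

    Returns None for a non-square core, where this reading gives no
    single answer.
    """
    return core_a if core_a == core_b else None


def per_axis_rule(core_a: int, core_b: int) -> Tuple[int, int]:
    """Correction 2: one thickness per direction.

    For a core written as 'AxB', returns (thickness of the vertical parts,
    thickness of the horizontal parts) = (A, B), as proposed above.
    """
    return (core_a, core_b)


# Correction 1: square cores of side 2, 3 or 4 get borders 2, 3 or 4 thick.
square_cases = {n: equality_rule(n, n) for n in (2, 3, 4)}

# Example 1 as described in this issue: 3x4 core, observed added border of 2.
observed = 2
predicted = equality_rule(3, 4)    # None: the equality reading is ambiguous here
matches = predicted == observed    # False under this reading
per_axis = per_axis_rule(3, 4)     # (3, 4), neither of which equals 2
```

Under these assumptions, neither reading reproduces the observed thickness of 2 for example 1.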

petkish commented 2 months ago

[Image: Screenshot_2024-07-03-15-01-27-072_com android chrome~2]

yuedajiong commented 2 months ago

please consider boundary. :-)

petkish commented 2 months ago

yuedajiong: please consider boundary. :-)

Should the boundary of the input field influence the thickness of the generated border in the output? If so, I believe there are too few examples to establish this pattern.

yuedajiong commented 2 months ago

Even if the author has a god-level generation 'rule', that doesn't necessarily mean it is 'unique'. In fact, once given the correct answer, we can also reverse-construct several equally correct god-level generation 'rules', such as ones that include or exclude the 'boundary' factor.

If I have time, I will post an issue analyzing this dataset in depth, focusing on AGI task design.
That is, this is NOT a good (enough) AGI task design; it focuses on just a small block (just part of abstraction and reasoning) of all the necessary abilities. Without priors, there is STILL no algorithm (answer); please notice the word STILL. E.g.: strong-prior-needed, no-random-factor, no-representation-requirement, iteration-in-algorithm-is-not-necessary, ...

Of course, it is useful and good, just not good enough. I really appreciate this direction of AI task design, and personally, I think it is far more meaningful than training large language models.

petkish commented 3 weeks ago

Look. Try a simple exercise: we know the correct solution, right? Now, use it as one of the examples. Then forget that example 1 has a solution and try to solve it as the task. Do you have any problems? I did. So there is definitely something off with example 1.

lenyabloko commented 2 weeks ago

I think ARC is not about "good enough" but about the best fit.

MischaMegens2 commented 1 week ago

I think the first example is odd: it complicates matters for no reason, and without it the puzzle seems perfectly solvable. Of course, I can come up with additional rules to 'fix' the first example and force it to fit ('...but if the added border touches the edge then that limits its width', à la yuedajiong), but to me this is artificial. Now, if there had been an additional example showing what happens with a non-square central part when there is a sufficiently large border (A different thickness on the top/bottom vs. left/right? Or a uniform thickness somehow related to the dimensions of the center?), then I would be more willing to accept example 1. But currently it leaves me wondering what an asymmetric central part would normally do, yet brings no benefit towards solving the test. So I'm with petkish: this one is not so pretty.
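
For concreteness, the 'fix' rule quoted above could be sketched roughly as follows. This is only an illustration: `clamped_thickness` is a hypothetical name, and the margin values are made up for the sketch (2 is chosen so that the clamp reproduces the border reported for example 1), not measured from the task file.

```python
def clamped_thickness(core_side: int, margin_to_edge: int) -> int:
    """Equality rule, limited by the free space between the shape and the grid edge.

    core_side:      side length of the core (assumed to set the nominal thickness)
    margin_to_edge: free cells between the shape and the grid edge (hypothetical input)
    """
    return min(core_side, margin_to_edge)


# With a hypothetical margin of 2, a core side of 3 is clamped to a border of 2.
near_edge = clamped_thickness(3, 2)

# With plenty of room, the plain equality rule is recovered.
roomy = clamped_thickness(3, 10)
```

Note that the sketch still has to be told which side of a 3x4 core sets the nominal thickness, which is exactly the question the existing examples leave open.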