google-deepmind / alphageometry

Some mistake in AR system coding #114

Open AnabelicSun opened 6 months ago

AnabelicSun commented 6 months ago

In ar.py, lines 354-361:

new_column = np.zeros([len(self.v2i), 2])  # N, 2
for v, c in vc:
  new_column[self.v2i[v], 0] += float(c) # [1]
  new_column[self.v2i[v], 1] -= float(c) # [2]

self.A = np.concatenate([self.A, new_column], 1) # add two columns
self.c += [1.0, -1.0] # coefficients of linprog minimization objective
self.deps += [dep]

It seems that self.c += [1.0, -1.0] should be replaced by self.c += [1.0, 1.0].

Explanation:

The quoted code above is part of the "Traceback for algebraic deduction" described in Nature volume 625, pages 476-482 (2024), i.e., Google's AlphaGeometry paper. More precisely, this traceback is done by solving a linear program: minimize $\sum_{i}(x_{i}+y_{i})$ (1) subject to $A\circ(x-y)=b$, where the $x_{i}$, $y_{i}$ are non-negative real numbers (stored as floats), $A$ is an $m \times n$ matrix, $x$ and $y$ are $n$-dimensional vectors, and $b$ is an $m$-dimensional vector. Here $m$ is the cardinality of the set of geometric variables used in the linear deduction, and $n$ is the cardinality of the set of rules (i.e., linear combinations that sum to zero). Objective (1) is essentially the $\ell_1$ penalty familiar from Lasso regression. We may write the whole problem in a more transparent way: minimize $\sum_{i}|z_{i}|$ (2) subject to $A\circ z=b$.
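As an illustration, here is a minimal, self-contained sketch of this split-variable trick with scipy.optimize.linprog (made-up toy data, not the ar.py implementation): write $z = x - y$ with $x, y \ge 0$ and minimize $\sum_i (x_i + y_i)$, which equals $\sum_i |z_i|$ at the optimum.

```python
import numpy as np
from scipy.optimize import linprog

# Toy relation matrix and target vector (made-up numbers, purely illustrative).
A = np.array([[1.0, -1.0, 0.0],
              [0.0,  1.0, -1.0]])
b = np.array([2.0, 3.0])

m, n = A.shape
A_eq = np.concatenate([A, -A], axis=1)           # enforces A @ (x - y) == b
c = np.ones(2 * n)                               # +1 for every x_i and every y_i
res = linprog(c, A_eq=A_eq, b_eq=b, bounds=[(0, None)] * (2 * n))

x, y = res.x[:n], res.x[n:]
z = x - y
print("z =", z, "sum |z_i| =", res.fun)          # a sparse z satisfying A @ z == b
```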

In the code, [1] and [2] add the positive (i.e., vector $x$) and negative (i.e., vector $y$) copies of a new rule; self.A is the matrix $A$, and self.c holds the objective coefficients. The constraint is thus written as $A\circ x + (-A)\circ y = b$, where the "$A$" block comes from part [1] and the "$-A$" block from part [2]. Therefore the objective coefficient should be $+1$ for both [1] and [2]; neither entry needs to be multiplied by $-1$.
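Concretely, under this reading the tail of the quoted block would become something like the following (only a sketch of the proposed one-line change, not a tested patch against the repository):

```python
self.A = np.concatenate([self.A, new_column], 1)  # add the +c / -c column pair
self.c += [1.0, 1.0]  # proposed fix: coefficient +1 for both split variables
self.deps += [dep]
```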

The current code instead minimizes $\sum_{i}(x_{i}-y_{i})$ (3), which does not make sense: if one increases both $x_{i}$ and $y_{i}$ by the same amount, the constraint is still satisfied and the objective is unchanged. Compared with (2), this amounts to minimizing $\sum_{i}z_{i}$ (4), which is absurd.
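A quick numerical check of this point (made-up numbers; the $x$ entries are grouped before the $y$ entries here, unlike the interleaved columns in ar.py, but only the sign pattern matters):

```python
import numpy as np

A = np.array([[1.0, 2.0]])            # one variable row, two rule columns
b = np.array([3.0])

x = np.array([3.0, 0.0])              # feasible split: A @ (x - y) == b
y = np.array([0.0, 0.0])
t = 5.0
x2, y2 = x + t, y + t                 # inflate both halves; A @ (x2 - y2) == b still holds

c_bug = np.array([1.0, 1.0, -1.0, -1.0])   # current signs: +1 for x, -1 for y
c_fix = np.array([1.0, 1.0,  1.0,  1.0])   # proposed signs: +1 for both

for name, c in [("current", c_bug), ("proposed", c_fix)]:
    before = c @ np.concatenate([x, y])
    after = c @ np.concatenate([x2, y2])
    print(name, before, "->", after)
# current:  3.0 -> 3.0   (the objective is blind to the inflation)
# proposed: 3.0 -> 23.0  (the inflation is penalised, so the LP prefers sparse z)
```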

Remark: $\circ$ denotes matrix multiplication.

bhwangfy commented 6 months ago

Hi, thanks for opening this issue. I am not familiar with the AR part, but the following two things seem to contradict each other: the code works well on my machine, yet what you describe sounds like a serious issue. Why is that?