google-deepmind / alphageometry

Apache License 2.0
4.19k stars 469 forks source link

Implementation of traceback for algebraic deduction #85

Open enlacroix opened 9 months ago

enlacroix commented 9 months ago

Intro

Greetings! My question is about the implementation of traceback for algebraic deduction. At the paper, p. 9, the authors use the approach of mixed-integer linear programming and apply this formula:

$$x, y = \min_{x, y} \sum_i (x_i + y_i) \ s.t. \ A^T(x-y) = b $$ Some explanations from the article:

Given the coefficient matrix of input equations A constructed as described in the previous sections and a target equation with coefficients vector $b ∈ R^N$, we determine the minimal set of premises for b by defining non-negative integer decision vectors $x, y ∈ Z^M$ and solve the following mixed-integer linear programming problem: The minimum set of immediate parent nodes for the equality represented by b will be the $i$ th equations ($i$ th rows in $A$) whose corresponding decision value $(x_i − y_i)$ is non-zero.

(Where $M$ - number of equations, $N$ - variables.)

This is an interesting approach! If possible, I would be very grateful if you would tell me from which sources you derived this formula and where I could read its justification.

Main question

To solve the linear programming problem, you are referring to Scipy (optimize.linprog). $$min(c^Tx) \ s.t. A{eq} \ x = b{eq}, \ x >= 0$$ In code:

# ar.py class Table why() method
x = optimize.linprog(c=self.c, A_eq=self.A, b_eq=b_eq, method='highs')['x']

You are faced with the task of how to reformulate the task in terms of scipy. So, you assume that $x_i = x_i - y_i$, but I will note that the non-negativity of x and y does not imply the non-negativity of their difference. Second task is to minimize the sum of $x_i$ and $y_i$, for this you need to choose the coefficients c. As I understand it, this is the only place where $c$ changes:

# ar.py class Table register() method
self.c += [1.0, -1.0]
# When 1 equation is added, 1 and -1 are written in c?

How the problem with the correlation of dimensions is solved? But if we write down the scalar product of c and x: $$c^T \cdot x = x_1 - y_1 - x_2 + y_2 + \dots $$ This is not sum, which we wanted to minimize. Please tell me where I made a mistake.