google / or-tools

Google's Operations Research tools:
https://developers.google.com/optimization/
Apache License 2.0

pyarrow python dependency breaks ortools #4285

Closed arnabanimesh closed 5 months ago

arnabanimesh commented 5 months ago

What version of OR-Tools and what language are you using? Version: v9.10.4067 Language: Python

Which solver are you using (e.g. CP-SAT, Routing Solver, GLOP, BOP, Gurobi)? CP-SAT

What operating system (Linux, Windows, ...) and version? Windows

What did you do? Run this code once without installing pyarrow and another time after installing pyarrow using pip install:

"""This model implements a sudoku solver."""

from ortools.sat.python import cp_model
import pandas as pd

def solve_sudoku():
    """Solves the sudoku problem with the CP-SAT solver."""
    # Create the model.
    model = cp_model.CpModel()

    cell_size = 3
    line_size = cell_size**2
    line = list(range(0, line_size))
    cell = list(range(0, cell_size))

    initial_grid = [
        [0, 6, 0, 0, 5, 0, 0, 2, 0],
        [0, 0, 0, 3, 0, 0, 0, 9, 0],
        [7, 0, 0, 6, 0, 0, 0, 1, 0],
        [0, 0, 6, 0, 3, 0, 4, 0, 0],
        [0, 0, 4, 0, 7, 0, 1, 0, 0],
        [0, 0, 5, 0, 9, 0, 8, 0, 0],
        [0, 4, 0, 0, 0, 1, 0, 0, 6],
        [0, 3, 0, 0, 0, 8, 0, 0, 0],
        [0, 2, 0, 0, 4, 0, 0, 5, 0],
    ]

    grid = {}
    for i in line:
        for j in line:
            grid[(i, j)] = model.new_int_var(1, line_size, "grid %i %i" % (i, j))

    # AllDifferent on rows.
    for i in line:
        model.add_all_different(grid[(i, j)] for j in line)

    # AllDifferent on columns.
    for j in line:
        model.add_all_different(grid[(i, j)] for i in line)

    # AllDifferent on cells.
    for i in cell:
        for j in cell:
            one_cell = []
            for di in cell:
                for dj in cell:
                    one_cell.append(grid[(i * cell_size + di, j * cell_size + dj)])

            model.add_all_different(pd.Series(one_cell))

    # Initial values.
    for i in line:
        for j in line:
            if initial_grid[i][j]:
                model.add(grid[(i, j)] == initial_grid[i][j])

    # Solves and prints out the solution.
    solver = cp_model.CpSolver()
    status = solver.solve(model)
    if status == cp_model.OPTIMAL:
        for i in line:
            print([int(solver.value(grid[(i, j)])) for j in line])

solve_sudoku()

What did you expect to see? The code should run properly both times.

What did you see instead? Runs properly when pyarrow is not installed, but doesn't run when pyarrow is installed.

Make sure you include information that can help us debug (full error message, model Proto). NA

Anything else we should know about your project / environment NA

lperron commented 5 months ago

There is nothing I can do. This is the classic diamond dependency problem. The only viable option is to use the latest version (here, of protobuf).

arnabanimesh commented 5 months ago

Which protobuf version should I use? I am already using v5.27.2.

lperron commented 5 months ago

At the time of the build, it was 26.1.

You can try rebuilding locally. Can you check which version is pulled in by pyarrow?
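A minimal sketch for answering that question from inside the affected environment, using only the standard library. The helper name `installed_versions` is illustrative and not part of OR-Tools; the distribution names are the PyPI names. As far as I can tell, the pip package numbering (5.x) differs from the protobuf release numbering (26.x, 27.x), so a reported 5.27.2 would correspond to release 27.2, but treat that mapping as an assumption to verify.

```python
from importlib import metadata

# PyPI distribution names relevant to this report; adjust as needed.
PACKAGES = ("ortools", "protobuf", "pyarrow", "pandas")


def installed_versions(names=PACKAGES):
    """Return {name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this once before and once after `pip install pyarrow` shows whether pip changed any of the other versions as a side effect.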

arnabanimesh commented 5 months ago

I just checked pyarrow's pyproject.toml. It does not mention protobuf, which means pyarrow is not using the protobuf installed by pip. If the Arrow C++ library uses a statically linked protobuf internally, that should not affect external libraries, right? The final byte-wise representation of a pd.Series should not depend on the protobuf version used by pyarrow (unless, of course, the ABI of the Arrow library changes).
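The same check can be made against the installed package metadata rather than the source tree. This is a rough sketch with hypothetical helper names (`normalize`, `requirement_name`, `declares_dependency`); it only sees dependencies declared at the Python packaging level, so a protobuf copy compiled into a native extension would not show up here.

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a distribution name (lowercase, runs of -_. become '-')."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement):
    """Extract the normalized distribution name from a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return normalize(match.group(1)) if match else None


def declares_dependency(dist_name, dep_name):
    """True/False if dist_name declares dep_name; None if not installed."""
    try:
        requirements = metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return None
    wanted = normalize(dep_name)
    return any(requirement_name(r) == wanted for r in requirements)


if __name__ == "__main__":
    for dist in ("pyarrow", "ortools"):
        print(dist, "declares protobuf:", declares_dependency(dist, "protobuf"))
```

A result of `False` for pyarrow would be consistent with the observation above, while `None` just means the package is not installed in the current environment.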

My guess is that a Series produced by pandas with the pyarrow engine is not supported by OR-Tools, and that no error message is raised to say so.
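One way to find out whether the failure really is silent is to run the script in a fresh interpreter with `faulthandler` enabled and capture the exit code and stderr. The sketch below uses only the standard library; `run_isolated` is a hypothetical helper, and the probe string at the bottom is an assumption about what might be worth testing, not a known reproduction.

```python
import subprocess
import sys


def run_isolated(code, timeout=120):
    """Run `code` in a new interpreter with faulthandler enabled.

    Returns (returncode, stdout, stderr) so that a native crash, which
    may print no Python traceback, still leaves a visible trace.
    """
    completed = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return completed.returncode, completed.stdout, completed.stderr


if __name__ == "__main__":
    # Assumed probe: import both libraries and report success.
    probe = (
        "import pandas as pd\n"
        "from ortools.sat.python import cp_model\n"
        "print('imports ok')\n"
    )
    returncode, out, err = run_isolated(probe)
    print("exit code:", returncode)
    print(out or err)
```

A nonzero exit code with a `faulthandler` dump in stderr would point at a crash in native code rather than an unsupported input type.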

arnabanimesh commented 4 months ago

This issue doesn't seem to appear on Linux (Python 3.10 on Ubuntu 22 LTS under WSL2), only on Windows (tried with the same version). So I don't think the issue is with protobuf.
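Since the behavior differs by platform, it may help to attach a short environment summary from each machine so the two setups can be compared side by side. This is a sketch using the standard library only; `environment_summary` is an illustrative name.

```python
import platform
import sys


def environment_summary():
    """Collect basic OS and interpreter details for a bug report."""
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "executable": sys.executable,
    }


if __name__ == "__main__":
    for key, value in environment_summary().items():
        print(f"{key}: {value}")
```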