Automatic solver [discussion]

@dclamage @degustaf Thanks for your time on this. Of course, others are welcome to chip in if they find this discussion and have some ideas of their own.

Introduction

Originally, I was going to lay down my vision of how constraints will integrate with the rest of the system for latgrid and all of the features I desire from it, but I think the best approach will be to brainstorm ideas separately and then eventually adapt those ideas into its final form.

However, it cannot be developed or discussed in a vacuum. Constraints in latgrid need three primitives for two different tasks that can have a similar purpose.

Simple error highlighting (henceforth, the constraint checker):
- Error checking of invalid situations, e.g. in sudoku, two identical numbers in the same house (row/column/box) are invalid.
- Preventing incomplete solutions from counting as a solution. A sudoku is not complete until every cell is filled with a digit.
Automatic solving (auto-solver):
- The setter will likely need to provide hints to the solver to do better than simple bifurcation. While giving a minimum ruleset can constrain the puzzle enough to have one solution, other ways of expressing those constraints might help the auto-solver.

Since most interesting puzzles are NP-Complete, validating the solution can be much faster than finding the solution in the general case. Therefore, in my opinion, it is more important to put markInvalid() and markIncomplete() primitives front and center before any solverHint() primitives that are exclusively for the auto-solver.

Ideally though, the same DSL would be able to program both of these categories of constraints since they would share some of the same boilerplate setup.

Problem description

Here is the path a puzzle setter would take to make a puzzle with some constraints. They would pick a set of drawing tools, called "layers" in latgrid, to make the visual components of the puzzle. As part of this, they would set the controls for the different layers restricting what objects are possible. For example, can lines be drawn diagonally or only orthogonally or only using knight's moves? Alongside this, they would choose the grid type (square, hex, etc.) and the grid size. All of these are implicit constraints on the solution to the puzzle and must be understood to some extent by the constraint checker and the auto-solver.

Here's some psuedo-code describing the implicit and explicit constraints for a sudoku puzzle. There are two strategies for enforcing the rule "all rows, columns, and boxes have the digits 1 to 9". The first will only disallow bad solutions to the puzzle while the second strategy uses markInvalid() to give immediate feedback for when the solver makes a mistake and puts two digits in the same house.

# Implicit constraints enforced by controls and givens
grid.height = grid.width = 9
grid.layers = [number_layer]
number_layer.possible_values = range(1, 9 + 1)
number_layer.givens = ...

# Explicit constraints put by the setter
houses = concat(grid.rows, grid.columns, regions)

for house in houses:
    numbers = number_layer.filter(number.cell in house.cells)

    # Strategy 1: each house must have 9 unique values, no feedback as to what makes a bad solution wrong
    if len(set(numbers)) != 9: markIncomplete(house.cells)

    # Strategy 2: each house must have 9 values and they must not repeat, gives basic feeback
    if len(numbers) != 9: markIncomplete(house.cells)
    number2cells = numbers.groupby(numbers.values)
    for number_value, cells in number2cells.items():
        if len(cells) > 1:
            markInvalid(cells)

    # Hints for the solver can be written alongside the other types of
    # constraints, but will inevitably have a different 'feel' to the API.
    solver.distinct(numbers)

The exact code doesn't matter since the constraints will most likely be inputted using a drag and drop interface or a form of some sort. Additionally, the API available must be a high-level representation of the capabilities of the underlying system, which requires solidifying how that system works.

Auto-solver requirements (a wish list, really)

The DSL used to program the auto-solver must be expressive enough to solve many problems, while intuitive enough (or at least with an intuitive facade) for setters to use it.
It should work in step mode during setting and work in completion mode when testing solutions.
It should run logic in batches so the program could be polled to make sure it's not stuck in an infinite loop.
Ideally, it should allow new information/constraints during solving to allow checking for multiple solutions, or even to allow multiple solver strategies to work in cooperation.

Yes, some of these ideas are blatantly borrowed from SudokuSolver and other similar tools. Some requirements might be incompatible with others. The off-the-shelf solvers that I know of don't seem to have all of these features, but I should list them anyways.

Ideas

@dclamage, you started to mention strong and weak links? I will go through sudokuwiki to refamiliarize myself with that nomenclature.

I have an idea using network building that could work as a basis for the auto-solver, but I need to go through my old notes and solidify its parts for how it will work before I share it here.

With all that said, I'm open to discuss ideas for how the auto-solver would work. I was just going to build constraints on top of a SAT or SMT solver, at least for the first version (I probably still will), but I'm intrigued in any other options you know about.

Many thanks for your time.

nmay231 / lattice-grid