msoos / cryptominisat

An advanced SAT solver
https://www.msoos.org
Other
814 stars 178 forks source link

Python API Suggestion - high/low level bindings #547

Closed cipherboy closed 5 years ago

cipherboy commented 5 years ago

Per discussion on #543... I thought I'd make the following proposal.

The pycryptosat module should have two classes: Solver, the existing high-level interface, and a new RawSolver, CSolver, CMSat, SATSolver (or similar name, whatever you think appropriate), which has low-level bindings.

The new "low-level" bindings should aim for direct compatibility with the C++ API, that is, duplicate all relevant methods (so without experimental methods and without add_in_partial_solving_stats...). Then "experts" can play with the low-level bindings but most users will gravitate towards the existing Solver API.

The existing behavior of Solver should be preserved and perhaps rewritten in Python, else the current implementation could simply preserved along side the RawSolver.


Another thought about structure could be keeping Solver as a class, but having raw be a submodule (pycryptosat.raw) with methods which operate on Solver and call into the C API. This way backwards compatibility is preserved (high level functionality is provided by Solver) and we have a way of applying low-level / directly bound methods, at the expense of making it more procedural than object oriented (pycryptosat.raw.get_sum_conflicts(MySolver)), but with the benefit of not modifying existing Solver code and not introducing another CMSat::SATSolver wrapper.

Happy to hear your (and the community's) thoughts about this.

msoos commented 5 years ago

Hi,

I'm a bit confused. What is the business case? What does this technical change intends to solve? Technology is only useful for achieving some kind of aim, it is not an aim in itself. I don't understand what actual user value this will bring.

cipherboy commented 5 years ago

Well, the issue with the existing Python API is that it doesn't directly correspond (nor seems meant to?) the C++ API. The relevant functionality I'm wanting (and happy to submit PRs for!) are:

  1. Conflicts, fixed by #544.
  2. max_vars, discussed in #543, whose status is pending.
  3. Dumping CNF (dump_irred_clauses / open_file_and_dump_irred_clauses).
  4. Solver stats (get_sum_conflicts / get_last_conflicts / ... would be interesting to have metrics for initial / incremental solves). ~5. Interrupting the solver (interrupt_asap is something I'd like to experiment with when solving parallel instances on systems).~

Now, a few of these I'm guessing you're going to make a similar argument for as #543... The business/research case is, for the few experts out there, (or, those of us who like to play around a bit too much... :), we (or just me..?) want a way of accessing the C++ API from higher level languages. Sure, there's a language translation cost, but the solving proper is likely the slow part (+/- creating the CNFs) so there's bound to be some of us wanting to use the Python API and access a few more methods from the C++ API. (Point 3 above is especially relevant for research and reproduciblity / responsible data preservation, 1 ~and 5~ are me wanting to experiment with incremental solving, 4 is a metric that is useless in isolation, but could be interesting in bulk).

From your end, you probably don't want to maintain 1:1 Python to C++ API binding directly in the high level API for a couple of reasons:

  1. As a high-level interface, you probably want to obscure some of the details of the low level API and handle it for the user. Things like passing sets/lists/array.array etc. into add_clauses.
  2. You want to prevent some easily abused features (like max_clauses, get_{sum,last}_*, etc.) from being abused.
  3. Some details of the C++ API shouldn't be exposed (e.g., std::vector<Lit>) and others are of limited utility to the bulk majority of users (e.g., sqlite export).

That's why I'm proposing separating the easily discoverable high-level Solver API from a raw / warning-heavy raw API.


Perhaps I'm the first person to want to code in Python (likely due to my existing code base) and access more of the C++ API, hence why I'm adding/wanting these methods. And my business/research case is invalid.

The value to the average user is none (backwards compatibility should be preserved). The value to a researcher / etc. is that (with a proposed pycryptosat.raw submodule) we can call into the parts of the C++ API from Python we want to. max_vars, conflicts, etc. I'm wanting a subset of the missing C++ API calls, but willing to do more work for others if they're interested in having it (sqlite, *dump_red_clauses, configuration parameters, etc.).


(Yeah, I think I'm leaning towards a pycryptosat.raw submodule than a separate class now that I've posed it... :).

msoos commented 5 years ago

OK, I see. You can go ahead, however:

1) There seems to be an overhead in me deciding to maintain this. Please make the code nice (not full of comments, just nice, easy to understand, clear, etc). I will have to be able to fix it if any bugs are found

2) Please add lots of test cases for all the options, as without a test case it's hard to decide if it's working or not when things change. It also helps in maintenance :)

3) The main interface must not be any slower due to this -- there are high-performance systems (conda/anaconda) using that interface. Loosing even 5% of speed there would be a major no-no for them. So I want to make sure this leads to absolutely no performance degradation whatsoever for those using the main interface.

Thanks a lot in advance for your work!

mate

cipherboy commented 5 years ago

Understood :) Not sure the timeframe on this, but you'll know when I have something concrete.