conjure-cp / conjure-oxide

Mozilla Public License 2.0
6 stars 8 forks source link

Discussion: Minion Bindings #1

Closed niklasdewally closed 7 months ago

niklasdewally commented 11 months ago

The purpose of this issue is to plan, document, and track the development of the minion bindings.

Any comments and feedback would be appreciated.

Design Plan

Introduction

As part of this project, we aim to provide native Rust interfaces to solvers. Currently, plain text is used to pass information between Conjure, Savile Row, and the underlying solvers. This is bad for performance, especially when many passes of a solver are being ran.

Minion is ran primarily through the passing of command line options. It takes input in its own, low level, model language.

My aim for the Minion bindings are to implement the functionality of the command line interface and model language as idiomatic Rust APIs.

I plan to take a layered approach to these bindings - first I will create thin wrappers around Minion C++ types, then I will use these to implement APIs to help create Minion models and change settings.

Options

Minion uses the SearchOptions and SearchMethod classes to control its behaviour.

I will combine these to create a MinionOptions struct, creatable using the Builder pattern. This will allow the multi-stage creation of MinionOptions, as well as handle the large number of optional settings nicely.

Models

Minion uses the CSPSpec class to hold Minion models.

I am not yet sure how to reimplement the model language as an API, but a binding of CSPSpec will be sufficient for testing purposes.

Running Minion through Rust

Minion uses a global variables to store SearchOptions and other state. Therefore, a new entrypoint to Minion will have to be written in C++, as, to my knowledge, global variables cannot be accessed through FFI.

This entrypoint will take in SearchOptions, SearchMethod, and CSPSpec, and roughly follow the same structure as Minions existing main function.


Tasks

[ ] Options: low-level binding of SearchOptions and SearchMethod [ ] Models: low-level binding of CSPSpec [ ] Options: MinionOptions and builder.

Models: design and implement model creation API.

ChrisJefferson commented 11 months ago

Hi!

I'm the original author of Minion, and in particular the bits you are trying to bind (although many other people have contributed, in particular to the large range of constraints in Minion). I'm currently in China, but I am very happy to meet online to chat, and also to answer any questions you might have about Minion.

You have correctly identified how Minion works -- the parser fills a CSPSpec, and then this is used to build the actual problem. The main question is if you should build a CSPSpec object, or instead directly call the lower level methods which are called when a CSPSpec object is built.

The advantage of skipping CSPSpec is that there are things that Minion can do, which you can't do from a CSPSpec object, like add new constraints or variables during search -- these limitations used to exist (and are partially why the CSPSpec object exists at all, because we used to want to build the whole CSP, and only then go and do the search).

One issue I do want to raise about Minion -- at the moment there are lots of global variables (as you noticed). This means you can only have one "Minion" at a time. At the moment there isn't even a way to reset a running Minion -- adding this wouldn't be too difficult, but adding the ability to have multiple Minions running at once would be quite a bit harder. If needing to run multiple copies of Minion at the same time was a requirement, then planning to run Minion as a separate process which you then communicate with might be a better idea (that would slow down communication, but would mean you could run lots of copies of Minion at the same time, doing different things).

In the distant past Minion did allow running multiple copies of itself, the code was long-since removed as it wasn't actually used, so it slowly broke to the point of not being fixable, but it is hypothetically re-creatable if required.

niklasdewally commented 11 months ago

Thank you for your feedback Chris!

As far as I understand it, with minion as a linked library, minions global variables would exist for the lifetime of my program - i.e. across invocations of minion? Therefore, I think writing functionality to reset global state would be highly helpful. @ChrisJefferson Is this something that would be useful upstream in the main minion codebase? (do other people use minion programmatically like I am planning to?)

If I am unable to add code to reset global state in minion, a backup option is to always an invocation of minion in a subprocess, isolating the global variable changes.

I think, for multiple invocations of minion, it is a good idea to fork into a new process to run the minion entrypoint, and then communicate the result over a channel / FIFO file. If the syntax in Rust is good for this, the output of many solvers could even be written to the same channel, and the main process can wait on this for results to come in.

These above cases, for simplicity, only consider the case where you fully define the model, then run the solver. I will focus on this for now, and potentially add the ability to update models during search later on. The concurrency for that would likely be a substantial project in its own right. (especially as I am not to familiar with concurrent programming).


niklasdewally commented 11 months ago

@ozgurakgun I presume I should focus on the simpler case of fully defining a model then running it first, or would building in this updating of the model while the search is running from the start be useful?

I presume the former is closer to how the other bindings would work?

ozgurakgun commented 11 months ago

I agree: the simple (one-shot) case first, and then we think about incremental use later.

We will want to make the SAT solver bindings incremental as well, and Ipasir somewhat supports this already. (cc Felix, though I don't know his Github username...)

niklasdewally commented 11 months ago

I agree: the simple (one-shot) case first, and then we think about incremental use later.

We will want to make the SAT solver bindings incremental as well, and Ipasir somewhat supports this already. (cc Felix, though I don't know his Github username...)

@ozgurakgun Felix's handle is @lixitrixi

lixitrixi commented 11 months ago

I agree: the simple (one-shot) case first, and then we think about incremental use later.

We will want to make the SAT solver bindings incremental as well, and Ipasir somewhat supports this already. (cc Felix, though I don't know his Github username...)

@ozgurakgun Sounds good! I'm curious as to what degree Ipasir supports changing the model during the search, is there a specific function you're talking about from the list? I know that some of the SAT solvers don't fully implement Ipasir; how could we approach adding this functionality for solvers that don't?

ChrisJefferson commented 11 months ago

The minion repository, master branch, now has --library flag, so doing:

mkdir bin; cd bin; ../configure.py --library; make will produce a libminion.a (along with still making the normal minion executable), which (should) be a useful static library for minion (although it is untested if the library actually does anything useful!).

niklasdewally commented 11 months ago

For next PR (as pointed out by felix):

niklasdewally commented 11 months ago

@ChrisJefferson I am trying to create functionality to reset all minion global state, and have a question about minion.

Is it correct that VARDEF() exists to define global variables? These are then defined once within globals.cpp. (The use of macro magic being avoiding multiple definitions of stuff, and changing the declaration to an extern when variables are already defined)

niklasdewally commented 11 months ago

For additional context, I am using clang-tidy's cppcoreguidelines-avoid-non-const-global-variables to find mutable globals, and most the things found seem to be defined with VARDEF.

It doesn't seem to find the stuff in StateObj.hpp due to clang not understanding it, but I presume those that are VARDEFed are globals too.

ChrisJefferson commented 11 months ago

Yes, the VARDEF just gives a way to define global variables.

(Writing this on my phone, so going from memory!)

Long ago there were huge numbers of global variables all over minion, they were slowly concentrated in one place.

It's possible since then new globals have been introduced. I would be happy to have PRs which merge them into the structs in stateObj.h, or just a list and I'll push them there myself.

Hopefully we can reach a point where all (or almost all) globals are accessed via a well defined list of functions in stateObj.h, at which point it will be much easier to reset them all.

To actually reset them, the easiest option would probably to be to introduce a global pointer (or a few global pointers), which can be initialized with 'new', and reset with 'delete' and 'new' -- this will, I think, be much easier than trying to set all the data structures back to their initial values, and should not effect any code outside stateObj.h, assuming everything uses nice functions to access the variables.

This can't be how minion normally works, as the cost of accessing via a pointer will create a small but measurable slowdown in minion (about 10% last time I checked). I'd consider this a reasonable overhead to make minion useful as a library, and we can use #ifdef to either define stateObj.h as normal, or as the special library version. Feel free to hack however you like, and we can worry about #ifdef ing later.

niklasdewally commented 11 months ago

It's possible since then new globals have been introduced. I would be happy to have PRs which merge them into the structs in stateObj.h, or just a list and I'll push them there myself.

Heres my list (combination of clang-tidy and git grep VARDEF):

globals_list.txt

``` ../minion/get_info.cpp:18:8: | EventNames ../minion/get_info.cpp:25:8: | EventCategory ../minion/get_info.cpp:34:8: | VarNames ../minion/get_info.cpp:36:8: | ConEventNames ../minion/get_info.cpp:41:8: | PropEventNames ../minion/get_info.cpp:65:9: | var_info ../minion/globals.cpp:18:8: | numOfConstraints ../minion/parallel.cpp:36:13: | checkIsAChildProcess ../minion/parallel.cpp:37:13: | forkEverCalled ../minion/parallel.cpp:41:12: | childTrackingPipe ../minion/system/trigger_timer.cpp:12:20: | trig ../minion/system/trigger_timer.cpp:13:20: | ctrlCPress ../minion/system/trigger_timer.cpp:15:6: | checkDoubleCtrlc /Users/niklas/src/minion/build/../minion/constraints/nonlinear_arithmetic/../../get_info/PropEvents.h:43:12: | CheckAssign /Users/niklas/src/minion/build/../minion/constraints/../queue/../triggering/../inputfile_parse/CSPSpec.h:102:22: | constraint_list /Users/niklas/src/minion/build/../minion/constraints/../queue/../triggering/../inputfile_parse/CSPSpec.h:590:15: | numOfConstraints /Users/niklas/src/minion/build/../minion/minion.h:25:17: | solsoutFile /Users/niklas/src/minion/build/../minion/search/../system/debug.h:57:6: | DOM_NORETURN /Users/niklas/src/minion/build/../minion/search/../system/minlib/json/picojson/picojson.h:781:22: | s /Users/niklas/src/minion/build/../minion/search/../variables/mappings/../../triggering/../inputfile_parse/../system/system.h:65:21: | global_random_gen /Users/niklas/src/minion/build/../minion/system/minlib/tries.hpp:35:6: | buildEarlyTrie /Users/niklas/src/minion/build/../minion/system/minlib/trigger_handling.hpp:23:16: | trigger_event_X StateObj.hpp | searchMem_m StateObj.hpp | options_m StateObj.hpp | state_m StateObj.hpp | queues_m StateObj.hpp | varContainer_m StateObj.hpp | bools_m StateObj.hpp | pardata_m ```

apart from those in StateObj.hpp, most of these seem either to be library code that I don't need to worry about, or simple ints and bools that can just be set to 0/ false.

Would the following work?

#ifdef IS_LIB
#define GET_GLOBAL_VAR(x) (global.x)
#else 
#define GET_GLOBAL_VAR(x) x
#endif

where global is a new created struct that has all the state structs as fields?

Would also need to redefine VARDEF and initialise global withifdef IS_LIB but this shouldn't be too hard.

niklasdewally commented 11 months ago

(I have implemented something similar to the above and it seems to compile. Will stick it in a PR once I add the reinitialisation in wrapper)

ChrisJefferson commented 11 months ago

Some thoughts (not sure if these all want to go here), also some of these might be worth thinking about for other solvers (particularly CP ones). Of course, interfaces can grow/shrink over time.

niklasdewally commented 11 months ago

Here are my initial thoughts, but I will discuss this with the Ozgur and the rest of the Rust stream for consistency between solvers, and to more generally decide how we will organise logging project wide:

In general, probably want a way of either (a) turning off everything Minion prints, and/or (b) redirecting the printing somewhere else?

I think it would be good to redirect stout and stderr within minion to a given filepath. This can be /dev/null by default, but set via API to any arbitrary path for debugging purposes. This would allow all the minion output to be reused.

Setting this to non null could also set the verbosity printing level to high. The main issue with this is that this verbosity option would have to persist beyond runs (currently it is in globals, so is reset between runs).

Alternatively, verbose could always be on, and just print to /dev/null. Then it can be pointed to a log-file when debugging is required. I suppose this would have some performance impact though.

At the moment, if Minion is given an invalid input file, it prints a nice message, then calls "exit(1);". Do we want to be able to (a) get the message, and (b) stop the whole program exiting?

This message could be captured via the mechanism above, but in addition to this, some kind of error code should be added to the entrypoint function to capture this programmatically.

niklasdewally commented 11 months ago

There isn't currently an easy way of getting the solutions out of Minion without writing them to a file then reading them back in. They could be stored in a big list somewhere, but lots of CP problems have VERY big sets of solutions, so probably want some way of registering a callback which lets us get solutions as they are generated?

A potential edge case (that may or may not be an issue), is when control returns back to user Rust code and Minion never finishes its search. Would running resetMinion as it currently exists allow minion to reset mid search?

niklasdewally commented 11 months ago

The above is all considering a more general online case in which minion is ran, reran, stopped and tweaked mid run. I think callbacks lends nicely to this.

ChrisJefferson commented 11 months ago

I think (without some very serious rewriting), the control flow will look like (none of this will be valid code or sensible interfaces!)

Write a function which looks something like: fn get_solution(x: Vec<i64>) -> bool { ... }

When you call minion, you can pass in get_solution, which will then be called each time a solution is generated. It is given the solution (or at least, some way of getting the solution), and returns a bool which means if search should continue.

If you return false (I'm not sure I love a bool here), then Minion exits search. Minion already has this basic architecture, as it calls a function each time it finds a solution to check if search should continue.

What might be nicer, but would be very hard (without some kind of clever trick) would be calling a function which returns a solution, and then lets you call it again for another solution, as that would require saving + restoring minion's call stack (you could in principle do this to minion, and all solvers I imagine, by pushing them into their own thread, but that's for another day).

niklasdewally commented 11 months ago

What might be nicer, but would be very hard (without some kind of clever trick) would be calling a function which returns a solution, and then lets you call it again for another solution, as that would require saving + restoring minion's call stack (you could in principle do this to minion, and all solvers I imagine, by pushing them into their own thread, but that's for another day).

Another way to do concurrency would be to have Minion return an iterator. Then, you can iterate over the results, with next() blocking as the results come.

In this case minion would be in a separate thread, and would have a callback that appends the result to the iterator. This however would mean that minion couldnt be stopped early by the rust program.

As I understand it what you proposed is same but making the iterator lazy not eagar - i.e. generate results when they are requested

niklasdewally commented 11 months ago

Regardless of if it ends up eager or lazy, an iterator would be a very nice interface to manipulate results

niklasdewally commented 11 months ago

When I say eager iterator above, I mean a stream or channel.

ozgurakgun commented 11 months ago

I think a stream/iterator can be built on top of a callback style interface anyway. Callbacks may be the most suitable for solvers in general (at the lowest level), but we will have to see as we create a few bindings.

niklasdewally commented 11 months ago

I think a stream/iterator can be built on top of a callback style interface anyway. Callbacks may be the most suitable for solvers in general (at the lowest level), but we will have to see as we create a few bindings.

Agreed. With callbacks, and solvers in seperate threads the logic for lazy iteration could be something as simple as:

fn my_callback (solution) {

   // put value into iterator somehow 

   // unlocked if the value has been read
   while lock == 1 { }

  return CONTINUE;

  }

The questions then are what to do with idle solvers that never finished, how to make this run without threading, and how to create the iterator in the first place, but I'm getting way ahead of myself.

niklasdewally commented 10 months ago

This is the roadmap for this area of the project up until the end of this semester.

Current Goal

The current (project wide) milestone is to be able to encode and solve the following constraints problem using Conjure Oxide (which we call the XYZ Problem):

In Essence:

find x, y, z : int(1..3)
such that x + y = z

In Minion:

MINION 3
**VARIABLES**
DISCRETE x #
{1..3}
DISCRETE y #
{1..3}
DISCRETE z #
{1..3}
**SEARCH**
PRINT[[x],[y],[z]]
VARORDER STATIC [x, y, z]
**CONSTRAINTS**
sumleq([x,y],z)
sumgeq([x,y],z)
**EOF**

For the Minion bindings, this can be decomposed into three sub-milestones:

  1. Being able to represent and solve the XYZ problem using Minion as a library, in C++.

  2. Being able to represent and solve the XYZ problem using the Minion bindings in Rust.

  3. Being able to represent and solve the XYZ problem using a solver-independent representation, which can be simplified to something that Minion can understand (replacing equals with <= and >=) and ran. (#15)

Tasks

Milestone 1: XYZ problem in C++.

Milestone 2: XYZ problem in Rust.

Milestone 3: A solver independent representation, compiled down to Minion.

See #15

niklasdewally commented 10 months ago

The problem has been altered slightly, an up to date copy can be found here: https://github.com/niklasdewally/conjure-bits/tree/main/xyz-notebook

niklasdewally commented 10 months ago

https://github.com/minion/minion/pull/14 creates the core library flow - write a model, run it, then get results out via callback (all in C++).

niklasdewally commented 10 months ago

Also got the example problem in C++ here: https://github.com/niklasdewally/conjure-bits/tree/main/xyz-minion-cpp

niklasdewally commented 10 months ago

Making and running a model in Rust: https://github.com/conjure-cp/conjure-oxide/pull/28

niklasdewally commented 9 months ago

Now got solution returning working: #55

niklasdewally commented 7 months ago

Keeping for historical purposes - see things tagged with areas::solvers/minion.