Closed ghost closed 6 years ago
Wow, this sounds really interesting actually. So, please make a pull request so I can see the full diff! Beware, bit of a rant below about memory usage, that needs fixed :D
The line at:
Line 278: self->clauses.push_back(lits);
Uses too much memory. It would use at the very least, with 100M binary clauses (completely normal thing for large problems):
(64b2 (start+end ptr for the vector)+232b(for 2 Lits)) * 100M
Note that the same is stored as:
32b2 100M
in CMS. The difference is 3x memory usage. That's ignoring the malloc overhead and the minimum chunk size that malloc will allocate, so maybe you can add 20% to that without being too off. That's the difference of it fitting into memory in 4GB or not fitting into 12GB. Some users will be hugely upset :(
You see, SAT solving is hard not only because it's difficult to solve some CNF problems. It's difficult because you have to cater to CNFs that contain 500 variables, and also to ones that contain 500M variables. If you think I'm making this up, just look at this active issue open right now: https://github.com/msoos/cryptominisat/issues/445 It's a pretty funny issue but it's very typical :) Your change would not give us hope of fixing that issue when the solver is used from the python interface.
Can you please make a pull request so I can see the change? I have a feeling there are other, minor, issues but I can't see them right now. I can't see the DIFF. Please make a pull request :) We can discuss there, then you can push to your repo, thereby changing the pull request and then we can slowly move towards a solution :)
I am not trying to get your hopes down, I think this is a good effort -- but you need to think outside of your specific use-case and cater to people who use this system in many-many weird and unusual ways to solve their problems :) Remember, for them, the unusual is the norm :D
Here is how to add a pull request:
I am closing as the pull request failed because I was not convinced this was a clean enough solution with enough benefits. The DRAT concept still needs some thought.
Hey, I didn't realise this was still open after you closed the PR. We were very busy wrapping up this project so that's why I didn't follow up after the closing of the PR. Anyway we are very thankful that you took the time to explain why you didn't accept it and totally understand your concerns regarding memory consumption, code inefficiency... A lot of the changes we've done could (and probably should, for the most part) be left to the devs using the interface. As for the DRAT generation, I'm sure you'll come up with a better implementation than ours and find ways to test it in the test suite (which we didn't do).
Hey! Yeah, the issue is DRAT really -- the SAT community should take DRAT more seriously and commit more time to it. So that's what's holding us back right now. I actually appreciated what you did as you tried to pushed the limits of what's possible with DRAT and in effect found failings in it. Note that e.g. DRAT does not support library-based usage. That is not good. That's the use-case in many scenarios. The python interface sort of stretched that -- in a good way. I suggest you push your thought/idea as an article out there, or pitch it to Marijn Heule. It is important for the SAT community for the librarly-based version to work and I fully support that work.
As said in #454 we've been working on the python interface for the past two months as part of an assignment. The goal was to add DRAT capabilities to the interface.
We're almost there and would like to know if this is something people would be interested in. Here is a summary of what we have done :
drat
argument to thesolve
function to query the creation DRAT file. This argument is a string which represents the name of the file to be generated (in the same way the command line application works);setup_solver
function to generate the entire python object and not only the SATSolver. Before that, this object only contained one field. Now it contains vectors that represents the list of [xor_]clauses. It looked more natural to generate the python object this way than generating each field in their own setup function;Solver
object in a DIMACS file. This is needed if you want to verify your proof with DRAT-trim and have only generated the clauses through the interface;Of course some other functions have been added as a side effect in the process.
This ended up being a bit more than just adding the DRAT generation to the interface as we realized we needed some other features.
What still needs to be done :
nb_clauses
function is tested, which is... not much;Their's probably some work to do on the code as we're no experts in C++ let alone the implementation of Python bindings. We understand that this part of the project is not your primary focus and we'll have to work for our finals soon so it will probably take some time to clean up our repo and merge.
Improvements that could or will be done to our code :
rhs
variables). They may take a lot of memory and we don't always want to rebuild the DIMACS file from the object;As you may have realized, this is not a fast going project and we still have a lot of work to do on this. You can find our work in my repo here : https://github.com/MPaulmier/cryptominisat
If this is interesting to you let us know and we'll do our best to finish our work (we roughly have 2-3 weeks until due date) and clean the repo before creating a PR (though it will take some time as we approach the end of our year). If not that is also fine to us and it will sit in my repo :)