msoos / cryptominisat

An advanced SAT solver
https://www.msoos.org
Other
814 stars 178 forks source link

OpenCL support [RFE] #96

Closed woncount closed 11 years ago

woncount commented 11 years ago

Referring to issue #90

It's not yet multi-threaded.

I was going add a +1 and suggest Boost or OpenMP for simplicity.

However, given that support for parallelization and distribution are declared project goals (according to the website), I feel that it would be more future-proof to base threading support on OpenCL, which would also mean that CMS2 could make use of GPUs and dedicated FPGA hardware, including multicore hardware.

I am not sufficiently familiar with the code base, but maybe someone could provide regarding the hot spots of the program, to identify areas that would benefit from being moved to OpenCL space ?

OpenCL is close enough to C/C++, so that it should be fairly straightforward to identify algorithmic hot spots that could benefit from parallelization. Also, professional users are likely to have access to GPU/FPGA hardware, so that this would be a great feature, while also being totally future-proof.

Support for distributed SAT-solving could then be implemented by registering multiple instances and distributing work to different instances running in a clustered fashion.

Any other way of implementing multi-threading support would not automatically support parallelization and distribution. Also, OpenCL is fully cross-platform, and also supported on platforms that don't have GPUs or FPGAs, where it simply uses all available cores.

Thanks & all the best

For prototyping-purposes, see: https://github.com/vegard/clsat

woncount commented 11 years ago

Also see: https://lirias.kuleuven.be/bitstream/123456789/356591/1/bnaic_satsolver.pdf

hashcat supports OpenCL-based clustering using VCL: http://hashcat.net/wiki/doku.php?id=vcl_cluster_howto http://hashcat.net/oclhashcat-plus/ http://www.mosix.org/txt_vcl.html

capiman commented 11 years ago

Just one small hint: Have a look at cryptominisat-2.9.7. It is an older version of cryptominisat, which supported multiple threads. With cryptominisat 3.x a lot was changed and thread support is not yet complete (as far as i know from Mate).

Also have a look here: http://www.msoos.org/2012/01/a-nifty-way-of-saving-time-with-opencl/ http://www.msoos.org/2012/01/amds-opencl-heaven-and-hell/

msoos commented 11 years ago

OpenCL is most probably not what you meant. OpenCL is for putting stuff on the GPU. You probably meant OpenMP. CryptoMiniSat used to use OpenMP. As for algorithmic hot-spots, it's not so easy. Many have tried to do in-SAT parallelisation including me, and failed (I tried for a a full year to do this, it's not like I tried for 3 days). The best way seems to be to run many examples of the SAT solver in parallel with different configurations, and exchange information between them.

woncount commented 11 years ago

OpenCL is most probably not what you meant.

Actually, I did mean OpenCL, I am familiar with it - please see the links that I posted.

OpenCL is for putting stuff on the GPU. You probably meant OpenMP.

No, OpenCL is not just about GPUs, OpenCL code can also be run on FPGAs. And by default, it will fall back to the host computers SMP cores (see the AMD/Intel ICDs), i.e. OpenCL is a scalable way to achieve multithreading support for different devices: 1) multicore CPUs, 2) GPUs and 3) dedicated FPGAs. As you may have noticed, I also referred to OpenMP in my initial posting, so I'm aware of the differences. In fact, OpenCL can also be clustered, so that a heterogenous network of OpenCL hosts can run their computations in a MapReduce fashion.

The clsat project that I linked to is such an OpenCL-based SAT solver. Also, the papers I linked to are specifically about implementing SAT solving on top of OpenCL kernels.

The best way seems to be to run many examples of the SAT solver in parallel with different configurations, and exchange information between them.

See the PDF file, they also came to the same conclusion, i.e. that the DPLL algorithm must be reworked with parallelization in mind.

Thanks

msoos commented 11 years ago

From the PDF: "We have tested the OpenCL-SAT solver on 50 random 3-SAT instances from the DIMACS SAT library". Random 3-SAT is very far from application/industrial SAT that MiniSat, lingeling, riss, zchaff, CryptoMniSat, etc. are made for. I personally program in OpenCL quite a bit, and I couldn't think of a way to make it work. Can you give a high-level architectural diagram of what you thought about might work?

BTW, thread support will come to CMS 3.x, maybe soon :)

woncount commented 11 years ago

Thanks for looking into the files I linked to.

Regarding a high-level architecture, I was looking into clsat as a proof-of-concept for an OpenCL-based SAT solver: https://github.com/vegard/clsat

I realize that cryptominisat is much more sophisticated, but maybe it would be an option to provide a separate OpenCL-based mode where clsat would serve as the foundation, so that more stuff could be identified and re-implemented in the form of OpenCL kernels ?

The appealing thing about OpenCL is that it is scalable AND portable, i.e. not just like OpenMP or MPI, but that GPUs and FPGAs can also be used for running your kernels.

You mentioned previously that you looked into parallel SAT solving and encountered some serious showstoppers, how exactly are you hoping to address those once CMS supports threading (asking to better understand the underlying design)?

msoos commented 11 years ago

Vegard'c clsat I knew about first-hand, we played around with it when it was in development. I think it's very nice and a very good idea for making use of OpenCL in SAT -- but not industrial SAT, and Vegard would second me on that, I think. As for hashcat, that's MD5 hashing, i.e. crypto-on-FPGA/GPU/CPU which is what I have been working on for some while now, so I am familiar with it.

I know that OpenCL is very nice, and that's why I use it. I just wish AMD's OpenCL compiler wasn't so buggy (I have segfaults, miscompilations, slowdowns with optimizer turned on, crashes, everything you can imagine, it's like Swiss cheese).

As for threading. Showstopper was propagation that has to be done fast enough, most of the time. The other algorithms can be parallelised pretty well, though. However, since there are multiple algorithms, it would be expensive to do so especially since the most important part would not be parallel. So the solution is that of ManySat -- multiple full-fledged SAT solvers running in parallel, with their own datastructs. The only way they interact with each other is sharing unitary and binary clauses.

As for having OpenCL as part of CryptoMiniSat for stochastic local search -- yes, possible and probably a good idea. Unfortunately, I don't have enough time for that, as I have some ideas I want to do, and even those don't get to be implemented :( However, if you plan to work on that, I would be happy to help, and if the implementation doesn't change much in the core of the solver, I would be happy to merge your changes in, give credit, etc (as I have always done, you can ask @capiman ). If it works out, you can even publish, I would be happy to help with that too.

woncount commented 11 years ago

that sounds all promising - would you be willing to work out the details of having another OpenCL-based mode next to the default mode, without touching any of the existing logic/data flows ?

For instance, just what's needed - like parsing the CNF and exposing the data structure to a kernel.

I am thinking in terms of adding OpenCL support to the code base so that people not yet familiar with the code base could start playing around with different OpenCL kernels for experimental purposes.

Sort of like a framework for OpenCL-based SAT solving, all based on cryptominisat. For starters, that should be sufficient. Given the interest in parallel SAT-solving, having such a framework in place that would encourage experiments based on a scalable backend (OpenCL) would actually be awesome, and it would allow us to identify APIs that could be generalized/unified and shared eventually.

PS: Using NVIDIA's rocksolid drivers here, so not affected by AMD/ATI quirks (despite being amazed by AMD/ATI performance...) NVIDIA's OpenCL compiler uses LLVM, which compiles down to PTX assembly (CUDA kernels).

msoos commented 11 years ago

I would be happy if you did that, sure! I encourage you to work on it, and once you have something you wish to show, just drop me a mail and I'll take a peek :)