coin-or / python-mip

Python-MIP: collection of Python tools for the modeling and solution of Mixed-Integer Linear programs

optimize n-queens: Performance of pre-compiled cbc vs source build #393

Open · lrbison opened 1 month ago

lrbison commented 1 month ago

Describe the bug
I find that the build-from-source instructions produce a binary that is nearly twice as slow as the checked-in cbc-c-linux-x86-64.so.

My motivation for compiling from source is to run on an aarch64 host (Graviton3: AWS c7g instance); the pre-compiled .so for Cbc doesn't include aarch64 binaries. Testing there showed poor performance, but when I take the same steps on x86 I find that none of the compiler and flag variations I tried can match the pre-compiled binary, so this isn't so much a "slow on arm" problem as a "slow when I build from source" problem.

The test my customer pointed me to is the n-queens test, although they modified your queens.py to include an optimize() call, which adds significant time. Does this make sense as a benchmark, given that solutions are either valid or not (but not "optimizable")?

To Reproduce
I have tried two approaches:

After adding optimize() to queens.py (sketched below), n=200 takes about 60 seconds with my from-source build on c6i (Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz) and about 20 seconds with the provided .so.
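For reference, a minimal sketch of what I believe the modified benchmark looks like: the n-queens model from the python-mip examples plus the added optimize() call. Variable names and diagonal indexing are my reconstruction, not the customer's exact script:

```python
# Reconstruction of the modified benchmark: the n-queens model from
# the python-mip examples, plus the optimize() call the customer added.
# There is no objective, so optimize() only searches for a feasible
# placement; names and indexing here are a sketch, not the exact script.
from mip import Model, xsum, BINARY

n = 200
m = Model()

# one binary variable per board square: x[i][j] == 1 iff a queen sits there
x = [[m.add_var(var_type=BINARY) for j in range(n)] for i in range(n)]

# exactly one queen per row and per column
for i in range(n):
    m += xsum(x[i][j] for j in range(n)) == 1
for j in range(n):
    m += xsum(x[i][j] for i in range(n)) == 1

# at most one queen per \-diagonal (i - j constant) ...
for k in range(2 - n, n - 1):
    m += xsum(x[i][i - k] for i in range(n) if 0 <= i - k < n) <= 1
# ... and per /-diagonal (i + j constant)
for k in range(1, 2 * n - 2):
    m += xsum(x[i][k - i] for i in range(n) if 0 <= k - i < n) <= 1

m.optimize()  # the added call whose runtime differs between Cbc builds
```

Because python-mip honors the PMIP_CBC_LIBRARY environment variable for loading a custom Cbc shared library, the same script can be timed against both the bundled .so and a from-source build without code changes.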

I also found https://github.com/coin-or/python-mip/issues/215#issuecomment-955792305, but I don't know which versions or flags to use to reproduce that binary. It was last updated about 3 years ago; is it possible that Cbc itself has regressed since then, or that I need particular versions? I also see a reference in the Benchmarks page to "we implemented an automatic buffering/flushing mechanism in the CBC C Interface". Is that included in Cbc master now?

lrbison commented 1 month ago

We found the source of the performance difference. The cause is the new default for CBC_PREPROCESS_EXPERIMENT set around the time of https://github.com/coin-or/CoinUtils/commit/293e6e981774ed047e8f00f7aff9252262f83a02 (Dec 2023). Resetting it to 0 with a -D define in the CFLAGS (i.e. -DCBC_PREPROCESS_EXPERIMENT=0) restores the performance of the n-queens optimize() call; a build sketch follows below.
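For anyone else hitting this, a hedged sketch of the rebuild, assuming coinbrew propagates ADD_CFLAGS/ADD_CXXFLAGS to Cbc and its COIN-OR dependencies and that the output paths below match your layout:

```sh
# Rebuild Cbc with the preprocessing experiment disabled.
# The flag propagation and install paths are assumptions, not
# verified commands; adjust to your environment.
git clone https://github.com/coin-or/coinbrew
cd coinbrew
./coinbrew build Cbc \
    ADD_CFLAGS="-DCBC_PREPROCESS_EXPERIMENT=0" \
    ADD_CXXFLAGS="-DCBC_PREPROCESS_EXPERIMENT=0"

# Point python-mip at the freshly built library instead of the
# bundled cbc-c-linux-x86-64.so. PMIP_CBC_LIBRARY is documented
# by python-mip; the library filename may differ on your system.
export PMIP_CBC_LIBRARY="$PWD/dist/lib/libCbcSolver.so"
python queens.py
```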

However, I also ran a few trials on a traveling-salesman example, and those tests were unaffected by the CBC_PREPROCESS_EXPERIMENT macro.

Last: the Benchmarks page does not include the optimize() call, but the Model Example n-Queens does; the latter was the original source of our benchmark. Would it be appropriate to remove the call from the Model Example documentation?