google / or-tools

Google's Operations Research tools:
https://developers.google.com/optimization/
Apache License 2.0

Performance regression after 9.9.4133 (3194fd5) #4189

Open magneticflux- opened 2 months ago

magneticflux- commented 2 months ago

What version of OR-Tools and what language are you using?
Version: main (9ed0074090aacba704afd420234a6083e2644b5b and 3194fd5bb2017957dd3460719ba08affaa803511)
Language: Java

Which solver are you using (e.g. CP-SAT, Routing Solver, GLOP, BOP, Gurobi)?
CP-SAT

What operating system (Linux, Windows, ...) and version?
Linux

What did you do? Steps to reproduce the behavior (a minimal Java sketch follows these steps):

  1. Solve the attached model using 9.9.4132 (9ed0074090aacba704afd420234a6083e2644b5b) (it takes ~20 seconds on an i9-10900K)
  2. Solve the attached model using 9.9.4133 (3194fd5bb2017957dd3460719ba08affaa803511) (it takes ~40 seconds on the same hardware)
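
For context, here is a minimal sketch of the kind of Java harness such a timing comparison can use. It is a sketch only: the stand-in model and the class name are hypothetical (the real benchmark solves the attached model.zip), and only standard OR-Tools Java API calls (Loader, CpModel, CpSolver) are assumed:

```java
import com.google.ortools.Loader;
import com.google.ortools.sat.CpModel;
import com.google.ortools.sat.CpSolver;
import com.google.ortools.sat.CpSolverStatus;

public final class RegressionRepro {
  public static void main(String[] args) {
    Loader.loadNativeLibraries();

    // Hypothetical stand-in model; the real benchmark solves the attached model.zip.
    CpModel model = new CpModel();
    model.newIntVar(0, 10, "x");

    CpSolver solver = new CpSolver();
    // Enable the search log, like the logs attached below.
    solver.getParameters().setLogSearchProgress(true);

    long start = System.nanoTime();
    CpSolverStatus status = solver.solve(model);
    double wallSeconds = (System.nanoTime() - start) / 1e9;
    System.out.printf("status=%s wall=%.2fs%n", status, wallSeconds);
  }
}
```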

What did you expect to see?
A solve time of ~20 seconds, as with 9.9.4132 (log attached):

log_9ed0074090aacba704afd420234a6083e2644b5b.txt

What did you see instead?
A solve time of ~40 seconds with 9.9.4133 (log attached):

log_3194fd5bb2017957dd3460719ba08affaa803511.txt

The model I used for comparison: model.zip

Some differences I noticed in the logs:

fdid-git commented 2 months ago

I think there is high variance on your model (as is often the case, especially in multithreaded runs). If you solve it many times with different random seeds, you get a range of solve times. I attached a histogram of 1k runs with the latest version.

Histogram of walltime
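
A sweep like that can be scripted from the Java side roughly as follows (a hedged sketch: it assumes a CpModel named `model` built or loaded as in the issue, and uses the random_seed field of SatParameters):

```java
// Re-solve the same model with many random seeds to sample the
// wall-time distribution (as in the 1k-run histogram above).
CpSolver solver = new CpSolver();
for (int seed = 0; seed < 1000; seed++) {
  solver.getParameters().setRandomSeed(seed);
  long start = System.nanoTime();
  solver.solve(model);
  System.out.printf("seed=%d wall=%.3fs%n", seed, (System.nanoTime() - start) / 1e9);
}
```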

Also, solving it with one thread, e.g. with --params subsolvers:"max_lp", seems to work better now, so I don't think there is much we can do here.
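
From Java, the equivalent of that flag would presumably go through the solver parameters as well (hedged: subsolvers is a repeated string field in SatParameters, so the generated builder exposes addSubsolvers):

```java
// Restrict CP-SAT to the single "max_lp" subsolver, i.e. effectively one search thread.
CpSolver solver = new CpSolver();
solver.getParameters().addSubsolvers("max_lp");
```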

magneticflux- commented 2 months ago

You're right, there's a lot of variance in my model. I've been using ministat to run t-tests comparing changes and updates, and it seems I got bizarrely unlucky with the first 6 samples. I then ran 3 more samples on another machine, and it again took ~2x the usual time, so I checked the commits and grabbed (what I thought were) representative logs.

I stepped away for a while and have now started a new run with 9.9.4133. The new linear propagation code seems to be slightly slower on average, and its standard deviation is much higher, which I failed to account for in my initial assessment:

+---------------------------------------------------------------------------------------------------------+
|       +                                                                                                 |
|      ++                                                                                                 |
|      ++                                                                                                 |
|      ++                                                                                                 |
|     +++ +                                                                                               |
|     +++ +                                                                                               |
|     +++ +                                                                                               |
|     +++ +                                                                                               |
|     +++ + x                                                                                             |
|     +++ +xx                                                                                             |
|     +++ +xxx                                                                                            |
|     +++ +xxx                                                                                            |
|     +++ +xxx                                                                                            |
|    ++++ +xxx                                                                                            |
|    ++++ +xxx                                                                                            |
|    +++++**xx                                                                                            |
|    +++++**xx                                                                                            |
|    +++++**xx                                                                                            |
|    +++++***x                                                                                            |
|    +++++***x                                                                                            |
|    +++++***x                                                                                            |
|    +++++***x                                                                                            |
|    +++++***xx                                                                                           |
|    +++++***xx                                                                                           |
|    +++++***xx x                                                                                         |
|    +++++***xx x                                                                                         |
|    +++++***xx x                                                                                         |
|    +++++***xx x                                                                                         |
|    +++++***xx x                                                                                         |
|    +++++***xxxx                                                                                         |
|    +++++***xxxx                                                                                         |
|   ++++++***xxxx     +                                                                                   |
|   ++++++***xxx*     +                                                                                   |
|   ++++++***xxx*+    +   +                                                                               |
|   ++++++***x****+   + x +                                                                               |
|   ++++++********+   + x +         +                                                                     |
|   ++++++********+   + x +         +                                                                     |
|   ++++++*********   + xx+    ++   +                                                                     |
|   ++++++********** ++ xx+ +  ++  ++ +  +                                                                |
|   ++++++********** ++ xx+ +  ++  ++ +  + +                                                              |
|   ++++++********** ++ *x+ ++ ++ +++ +  + +                                                              |
|   ++++++********** +++**+++++++ ++++++ + ++                                                             |
|   ++++++********** +++**++++++++++++++ + ++                                                             |
|   ++++++********** +++**++++++++++++++ ++++        +                                                    |
|  +++++++**********x+++**++++++++++++++ +++++   +   +                                                    |
|  ++++++************+++**++++++++++++++ +++++  ++ + +                                                    |
|  ++++++************+++**+++++++++++++++++++++ ++++ +  +                                                 |
|  ++++++*************+***+++++++++++++++++++++ ++++++ ++                                                 |
|  ++++++*************+****++++++++++++++++++++ +++++++++   ++                                            |
| +++++++******************++++++++++++++++++++ +++++++++  +++    +                                       |
| +++++++******************++++++++++++++++++++++++++++++ ++++   ++    + +                                |
| +++++********************++++++++++++++++++++++++++++++ ++++++ +++ +++ +   +    +                       |
|++++++********************++++*++*++*+*+++++**+++++++++++++++++++++ +++++++ +++  +         +        +   +|
|         |___MA_____|                                                                                    |
|     |___________M_____A_________________|                                                               |
+---------------------------------------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x 387       18.9284       31.4842        21.078     21.568593     1.7512407
+ 1000       16.9395       50.3212       22.5186     24.288804     5.7543101
Difference at 99.5% confidence
    2.72021 +/- 0.920084
    12.6119% +/- 4.26585%
    (Student's t, pooled s = 4.97378)
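
For readers unfamiliar with ministat's output, the reported difference is just the gap between the two sample means, and the pooled standard deviation follows from the two sample standard deviations shown above:

$$\bar{x}_{+} - \bar{x}_{x} = 24.288804 - 21.568593 \approx 2.72021 \approx 12.61\%\ \text{of}\ 21.568593,$$
$$s_p = \sqrt{\frac{(387-1)\,(1.7512407)^2 + (1000-1)\,(5.7543101)^2}{387 + 1000 - 2}} \approx 4.97378.$$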

I'll leave it running to get a better estimate, and I might try testing a patch that changes which vars are pushed first to see if that makes a difference. Thank you for taking the time to run an independent test, and sorry for the noise if this turns out to be benign!

fdid-git commented 2 months ago

Yes, I think the new code adds more "randomness", which explains the higher variance, especially if you test with different random seeds. On the other hand, the code should be more robust to future changes in other places.