pysal / spopt

Spatial Optimization
https://pysal.org/spopt/
BSD 3-Clause "New" or "Revised" License

Facility Location modeling solutions & CI #189

Closed jGaboardi closed 3 years ago

jGaboardi commented 3 years ago

Migrated from gitter

@jGaboardi

So here are two notebooks I have been working on for benchmarking across solvers & pulp versions, one for pulp 2.5.0 and one for pulp 2.4:

Each tests:

  • 1 synthetic example and 1 empirical example (@huanfachen’s dataset)
  • LSCP, MCLP, p-median, and p-center
  • COIN-CBC, CPLEX, GLPK, Gurobi, Mosek, and SCIP

The long and short of it is that the difference between pulp versions seems negligible, while the difference between solvers is where solution divergence actually occurs. Two big takeaways are:

  • Precision is extremely important in network distance (where errors can propagate over repeated truncation of distance measurements along segments) vs. euclidean distance. This appears to be a key difference in the p-center results.
  • Selection sets in location-allocation are randomly seeded within each solver and can lead to a variety of layouts. This is simply a "fact of life" but can have real consequences on some types of models, specifically those more focused on equity (LSCP, p-center).
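A toy illustration (not spopt code, and the 4-decimal truncation is an assumed precision) of why repeated truncation along network segments is more dangerous than a single euclidean measurement: each truncated segment can lose up to one unit in the last kept decimal, and those losses add up along the path.

```python
import math

# 1000 equal-length segments along a hypothetical network path
segments = [0.123456789] * 1000

exact = sum(segments)

# truncate each segment to 4 decimal places before summing,
# mimicking repeated rounding of per-segment distance measurements
truncated = sum(math.floor(d * 1e4) / 1e4 for d in segments)

error = exact - truncated
print(error)  # accumulated error grows with the number of segments
```

A single euclidean distance truncated once would lose at most 1e-4 here; summing 1000 truncated segments loses roughly 500x more, which is easily enough to flip the argmin in a tight p-center instance.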

So @germano, I believe we have done our due diligence here and I am satisfied if you update the CI testing in #186 for the results we are seeing from pulp 2.5.0. Does that sound good?

@gegen07

I feel OK about updating the CI testing to pulp 2.5.0. I actually did this update in the last commit of gsoc21-facloc, but it gave an error. The macOS test failed while the Ubuntu test (the same OS I use) passed. It's the same behavior we found on Windows days ago: the array has different values compared to the expected array.

I was thinking of skipping facility_client_array on macOS too and adding tests on the objective values for each model, since the mixins like Percentage and Mean Distance passed their tests.
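A hypothetical sketch of how that split could look in the test suite (the test names, placeholder values, and tolerance are all illustrative, not spopt's actual tests): skip the fragile array comparison on macOS but keep a cross-platform check on the objective value.

```python
import sys

import pytest

# Skip the strict facility/client array comparison on macOS, where the
# solver builds produce different (but equally optimal) selection sets.
skip_on_macos = pytest.mark.skipif(
    sys.platform == "darwin",
    reason="facility_client_array differs on macOS solver builds",
)


@skip_on_macos
def test_facility_client_array():
    observed = [0, 1, 4]  # placeholder for the model's selection arrays
    expected = [0, 1, 4]
    assert observed == expected


def test_objective_value():
    # objective values agree across OSs even when selection sets differ
    observed = 1234.5678  # placeholder for the solved objective value
    assert observed == pytest.approx(1234.5678, rel=1e-4)
```

This keeps a meaningful cross-platform assertion in place instead of skipping the model entirely on macOS.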

Reading the issue coin-or/Cbc#322, specifically this comment, gave me the idea of using the argument gapRel=0.0001 to stop the solver once the relative optimality gap falls below that tolerance. We can then compare the objective values between OSs without worries, but I don't think it would solve the problem with facility_client_array. What do you think?
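A minimal sketch of passing gapRel through to CBC via PuLP (the toy problem is illustrative; gapRel is the keyword in pulp >= 2.3, replacing the older fracGap):

```python
import pulp

# toy continuous problem just to exercise the solver call
prob = pulp.LpProblem("toy_location", pulp.LpMinimize)
x = pulp.LpVariable("x", lowBound=0)
y = pulp.LpVariable("y", lowBound=0)
prob += 2 * x + 3 * y  # objective
prob += x + y >= 4     # coverage-style constraint

# stop once the relative optimality gap is below 0.01%
solver = pulp.PULP_CBC_CMD(msg=False, gapRel=0.0001)
prob.solve(solver)

print(pulp.value(prob.objective))  # 8.0 for this toy problem
```

Note that a nonzero gapRel makes objective values agree only up to that relative tolerance, so cross-OS comparisons should use an approximate equality at the same tolerance rather than exact equality.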

jGaboardi commented 3 years ago

Regarding the facility_client_array, this is a tricky one... I think a good compromise would be for us to use the simplest possible synthetic dataset(s) to test, which would produce identical selection sets across OSs. Maybe something like 5 clients and 3 facilities on a small 2x2 tic-tac-toe style lattice.
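A hypothetical version of such a minimal dataset (the specific coordinates are an assumption, not a proposal from the thread): 5 clients and 3 candidate facilities on a 2x2 lattice, with a euclidean cost matrix small enough that solvers have little room to diverge.

```python
import numpy as np

# 5 clients on the corners and center of a 2x2 lattice
clients = np.array([(0, 0), (0, 2), (1, 1), (2, 0), (2, 2)], dtype=float)

# 3 candidate facility sites along the middle column
facilities = np.array([(0, 1), (1, 1), (2, 1)], dtype=float)

# pairwise euclidean distances: rows are clients, columns are facilities
cost_matrix = np.linalg.norm(clients[:, None, :] - facilities[None, :, :], axis=2)

print(cost_matrix.shape)  # (5, 3)
```

With distances this small and exactly representable geometry, truncation cannot accumulate, so the selection sets should be identical regardless of OS or solver seed.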