ERGO-Code / HiGHS

Linear optimization software
MIT License
848 stars 164 forks source link

Fix the O(N^2) bug of passColName and passRowName #1780

Closed metab0t closed 1 month ago

metab0t commented 1 month ago

This fixes a performance issue when Highs_addCol and Highs_passColName are called in sequence to add new variables.

After Highs_addCol is called, there is n columns and the cols_hash_ also has n elements because of https://github.com/ERGO-Code/HiGHS/blob/13363c9f1252b015cf6527132eb9cf8f4b5bf020/src/lp_data/HighsInterface.cpp#L201 and https://github.com/ERGO-Code/HiGHS/blob/13363c9f1252b015cf6527132eb9cf8f4b5bf020/src/lp_data/HighsLp.cpp#L315

cols_hash_ will be rebuilt if it is empty.

Then, if Highs_passColName is called to set the name of column n, cols_hash_ is cleared.

Now when Highs_addCol and Highs_passColName are called in sequence to add N columns, cols_hash_ has to be rebuilt for N times with length 1,2,...,N, which makes the complexity $O(N^2)$.

In fact, when we call Highs_passColName, cols_hash_ can update one element instead of clearing all contents.

The bug is discovered in https://github.com/coin-or/python-mip/issues/372 .

jajhall commented 1 month ago

I need to investigate why unit tests fail before merging into latest. Don't create PR to master, please

metab0t commented 1 month ago

Sorry, I will choose latest as the merging branch for PR the next time.

jajhall commented 1 month ago

This fix fails when changing existing names in a model. I'm not fluent in the use of std::unordered_map, and am interested that

this->model.lp.colhash.name2index[name] = col;

works at all. The problem is that this retains the old name, so Highs::getColByName returns a column if the old name is specified as an argument, but that name is not correct for the column. Hence, with CMAKE_BUILD_TYPE=debug,

assert(lp.colnames[col] == name);

is triggered in Highs::getColByName, and if CMAKE_BUILD_TYPE=release, the unit test

// column num_col/2 is no longer called iCol_name status = highs.getColByName(iCol_name, iCol); REQUIRE(status == HighsStatus::kError);

fails.

The old name needs to be removed from the map, rather than (conservatively) clearing the map when passColName is called

I appreciate that we've got an $O(N^2)$ issue to resolve, but it needs a better fix. I'll create a HiGHS issue and fix it myself in due course, unless you want to have another go.

Do compile with FAST_BUILD=off and run ctest if you're proposing a code change

metab0t commented 1 month ago

It is more complex than I thought.

It should be two steps:

  1. Delete the old name from the name2index
  2. Mark kHashDuplicate if new name already maps to a column
jajhall commented 1 month ago

I think this should do it.