atoptima / Coluna.jl

Branch-and-Price-and-Cut in Julia
https://www.atoptima.com
Other
193 stars 43 forks source link

Invalid variable index #489

Closed hhkramer closed 3 years ago

hhkramer commented 3 years ago

Describe the bug When using Coluna to solve my application I get the following error after some nodes of the branch-and-price tree:

ERROR: LoadError: The index MathOptInterface.VariableIndex(-1) is invalid. Note that an index becomes invalid after it has been deleted.

I am using the default TreeSearchAlgorithm params except that Primal Heuristics are disabled.

To Reproduce Coluna parameters are set as follows:

coluna = optimizer_with_attributes(
        Coluna.Optimizer,
        "params" => Coluna.Params(
            solver = Coluna.Algorithm.TreeSearchAlgorithm(
                conqueralg = Coluna.Algorithm.ColCutGenConquer(
                    primal_heuristics = Coluna.Algorithm.ParameterisedHeuristic[])
            )
        ),
        "default_optimizer" => Gurobi.Optimizer
    )

Expected behavior

ERROR: LoadError: The index MathOptInterface.VariableIndex(-1) is invalid. Note that an index becomes invalid after it has been deleted.
Stacktrace:
 [1] _info(::Gurobi.Optimizer, ::MathOptInterface.VariableIndex) at C:\Users\hugoh\.julia\packages\Gurobi\qk7lG\src\MOI_wrapper.jl:654
 [2] column at C:\Users\hugoh\.julia\packages\Gurobi\qk7lG\src\MOI_wrapper.jl:665 [inlined]
 [3] modify(::Gurobi.Optimizer, ::MathOptInterface.ObjectiveFunction{MathOptInterface.ScalarAffineFunction{Float64}}, ::MathOptInterface.ScalarCoefficientChan
ge{Float64}) at C:\Users\hugoh\.julia\packages\Gurobi\qk7lG\src\MOI_wrapper.jl:3087
 [4] update_cost_in_optimizer!(::Coluna.MathProg.Formulation{Coluna.MathProg.DwMaster}, ::Coluna.MathProg.Variable) at C:\Users\hugoh\.julia\packages\Coluna\u
QdsR\src\MathProg\MOIinterface.jl:49
 [5] sync_solver!(::Coluna.MathProg.MoiOptimizer, ::Coluna.MathProg.Formulation{Coluna.MathProg.DwMaster}) at C:\Users\hugoh\.julia\packages\Coluna\uQdsR\src\
MathProg\optimizerwrappers.jl:70
 [6] optimize_with_moi!(::Coluna.MathProg.MoiOptimizer, ::Coluna.MathProg.Formulation{Coluna.MathProg.DwMaster}, ::Coluna.Algorithm.OptimizationState{Coluna.M
athProg.Formulation{Coluna.MathProg.DwMaster},Coluna.MathProg.MinSense}) at C:\Users\hugoh\.julia\packages\Coluna\uQdsR\src\Algorithm\basic\solveipform.jl:137
 [7] optimize_lp_form!(::Coluna.Algorithm.SolveLpForm, ::Coluna.MathProg.MoiOptimizer, ::Coluna.MathProg.Formulation{Coluna.MathProg.DwMaster}, ::Coluna.Algor
ithm.OptimizationState{Coluna.MathProg.Formulation{Coluna.MathProg.DwMaster},Coluna.MathProg.MinSense}) at C:\Users\hugoh\.julia\packages\Coluna\uQdsR\src\Alg
orithm\basic\solvelpform.jl:50
 [8] macro expansion at C:\Users\hugoh\.julia\packages\Coluna\uQdsR\src\Algorithm\basic\solvelpform.jl:73 [inlined]
 [9] macro expansion at C:\Users\hugoh\.julia\packages\TimerOutputs\4QAIk\src\TimerOutput.jl:190 [inlined]
 [10] run!(::Coluna.Algorithm.SolveLpForm, ::Coluna.Env, ::Coluna.Algorithm.ModelData, ::Coluna.Algorithm.OptimizationInput{Coluna.MathProg.Formulation{Coluna
.MathProg.DwMaster},Coluna.MathProg.MinSense}) at C:\Users\hugoh\.julia\packages\Coluna\uQdsR\src\Algorithm\basic\solvelpform.jl:58
 [11] macro expansion at C:\Users\hugoh\.julia\packages\Coluna\uQdsR\src\Algorithm\colgen.jl:654 [inlined]
 [12] macro expansion at .\timing.jl:233 [inlined]
 [13] cg_main_loop!(::Coluna.Algorithm.ColumnGeneration, ::Coluna.Env, ::Int64, ::Coluna.Algorithm.OptimizationState{Coluna.MathProg.Formulation{Coluna.MathPr
og.DwMaster},Coluna.MathProg.MinSense}, ::Coluna.Algorithm.ReformData) at C:\Users\hugoh\.julia\packages\Coluna\uQdsR\src\Algorithm\colgen.jl:650
 [14] run!(::Coluna.Algorithm.ColumnGeneration, ::Coluna.Env, ::Coluna.Algorithm.ReformData, ::Coluna.Algorithm.OptimizationInput{Coluna.MathProg.Formulation{
Coluna.MathProg.DwMaster},Coluna.MathProg.MinSense}) at C:\Users\hugoh\.julia\packages\Coluna\uQdsR\src\Algorithm\colgen.jl:97
 [15] run!(::Coluna.Algorithm.ColCutGenConquer, ::Coluna.Env, ::Coluna.Algorithm.ReformData, ::Coluna.Algorithm.ConquerInput) at C:\Users\hugoh\.julia\package
s\Coluna\uQdsR\src\Algorithm\conquer.jl:170
 [16] apply_conquer_alg_to_node!(::Coluna.Algorithm.Node, ::Coluna.Algorithm.ColCutGenConquer, ::Coluna.Env, ::Coluna.Algorithm.ReformData, ::Dict{Tuple{Colun
a.ColunaBase.AbstractModel,Pair{DataType,DataType}},Coluna.Algorithm.UnitAccessMode}, ::Float64, ::Float64) at C:\Users\hugoh\.julia\packages\Coluna\uQdsR\src
\Algorithm\conquer.jl:61
 [17] run_conquer_algorithm!(::Coluna.Algorithm.TreeSearchAlgorithm, ::Coluna.Env, ::Coluna.Algorithm.TreeSearchRuntimeData{Coluna.MathProg.MinSense}, ::Colun
a.Algorithm.ReformData, ::Coluna.Algorithm.Node) at C:\Users\hugoh\.julia\packages\Coluna\uQdsR\src\Algorithm\treesearch.jl:252
 [18] run!(::Coluna.Algorithm.TreeSearchAlgorithm, ::Coluna.Env, ::Coluna.Algorithm.ReformData, ::Coluna.Algorithm.OptimizationInput{Coluna.MathProg.Formulati
on{Coluna.MathProg.DwMaster},Coluna.MathProg.MinSense}) at C:\Users\hugoh\.julia\packages\Coluna\uQdsR\src\Algorithm\treesearch.jl:330
 [19] optimize!(::Coluna.MathProg.Reformulation, ::Coluna.Env, ::Coluna.ColunaBase.Bound{Coluna.MathProg.Primal,Coluna.MathProg.MinSense}, ::Coluna.ColunaBase
.Bound{Coluna.MathProg.Dual,Coluna.MathProg.MinSense}) at C:\Users\hugoh\.julia\packages\Coluna\uQdsR\src\optimize.jl:98
 [20] macro expansion at C:\Users\hugoh\.julia\packages\Coluna\uQdsR\src\optimize.jl:65 [inlined]
 [21] macro expansion at C:\Users\hugoh\.julia\packages\TimerOutputs\4QAIk\src\TimerOutput.jl:190 [inlined]
 [22] optimize!(::Coluna.MathProg.Problem, ::Coluna.Annotations, ::Coluna.Params) at C:\Users\hugoh\.julia\packages\Coluna\uQdsR\src\optimize.jl:64
 [23] optimize! at C:\Users\hugoh\.julia\packages\Coluna\uQdsR\src\MOIwrapper.jl:101 [inlined]
 [24] optimize!(::MathOptInterface.Bridges.LazyBridgeOptimizer{Coluna.Optimizer}) at C:\Users\hugoh\.julia\packages\MathOptInterface\ZJFKw\src\Bridges\bridge_
optimizer.jl:264
 [25] optimize!(::MathOptInterface.Utilities.CachingOptimizer{MathOptInterface.AbstractOptimizer,MathOptInterface.Utilities.UniversalFallback{MathOptInterface
.Utilities.Model{Float64}}}) at C:\Users\hugoh\.julia\packages\MathOptInterface\ZJFKw\src\Utilities\cachingoptimizer.jl:215
 [26] optimize!(::JuMP.Model, ::Nothing; bridge_constraints::Bool, ignore_optimize_hook::Bool, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple
{(),Tuple{}}}) at C:\Users\hugoh\.julia\packages\JuMP\y5vgk\src\optimizer_interface.jl:139
 [27] optimize!(::JuMP.Model) at C:\Users\hugoh\.julia\packages\BlockDecomposition\OYIcC\src\BlockDecomposition.jl:51
 [28] optimize!(::JuMP.Model, ::Nothing; bridge_constraints::Bool, ignore_optimize_hook::Bool, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple
{(),Tuple{}}}) at C:\Users\hugoh\.julia\packages\JuMP\y5vgk\src\optimizer_interface.jl:129
 [29] optimize! at C:\Users\hugoh\.julia\packages\JuMP\y5vgk\src\optimizer_interface.jl:115 [inlined] (repeats 2 times)
 [30] macro expansion at C:\Users\hugoh\Documents\Codes\Atoptima\UFPB_Project\ColunaBenchmarks\src\CapacitatedLotSizing\run.jl:31 [inlined]
 [31] macro expansion at .\timing.jl:233 [inlined]
 [32] run_clsp(::String, ::Main.CapacitatedLotSizing.AppParams) at C:\Users\hugoh\Documents\Codes\Atoptima\UFPB_Project\ColunaBenchmarks\src\CapacitatedLotSiz
ing\run.jl:30
 [33] top-level scope at C:\Users\hugoh\Documents\Codes\Atoptima\UFPB_Project\ColunaBenchmarks\src\CapacitatedLotSizing\scripts\runsingleinstance.jl:13
 [34] include(::Function, ::Module, ::String) at .\Base.jl:380
 [35] include(::Module, ::String) at .\Base.jl:368
 [36] exec_options(::Base.JLOptions) at .\client.jl:296
 [37] _start() at .\client.jl:506
in expression starting at C:\Users\hugoh\Documents\Codes\Atoptima\UFPB_Project\ColunaBenchmarks\src\CapacitatedLotSizing\scripts\runsingleinstance.jl:13

Environment:

guimarqu commented 3 years ago

I need to reproduce the bug on my computer to fix it. Can you give me the name of the instance you try to optimize ?

hhkramer commented 3 years ago

Hi, Guillaume. You need to run with the Coluna parameters set as above in file run.jl of CapacitatedLotSizing application. To run, you can use:

julia src/CapacitatedLotSizing/scripts/runsingleinstance.jl X11119A

Please note it will take a few minutes to get to this error message. I am not sure, but I guess this happens when the algorithm reaches some stopping criterium, such as node limit, time limit etc.

Also, I get the same error if using the default depth first strategy.

guimarqu commented 3 years ago

I could reproduce the bug, thank you.

guimarqu commented 3 years ago

Ok I think I spotted the bug :

In node 240 (parent 239),

***************************************************************************************
**** BaB tree node N° 240, parent N° 239, depth 145, 132 open nodes
**** Local DB = 27234.3996, global bounds : [ 26871.7802 , Inf ], time = 112.73 sec.
**** Branching constraint: y[3,17]>=1.0
***************************************************************************************
 sync_solver Coluna.MathProg.Formulation{Coluna.MathProg.DwMaster} 
  <it=  1> <et=112.89> <mst= 0.03> <sp= 0.13> <cols=10> <al= 0.00> <DB=27234.3996> <mlp=27119.7846> <PB=Inf>
[ Info: Column generation algorithm has converged.
 sync_solver Coluna.MathProg.Formulation{Coluna.MathProg.DwMaster} 
 created MC_2000561 
# <it=  1> <et=113.03> <mst= 0.06> <sp= 0.08> <cols= 5> <al= 0.00> <DB=  511.4825> <mlp=   26.4772> <PB=Inf>
[ Info: Phase one determines infeasibility.

A column MC_2000561 is created (and added to the master's buffer) just before column generation proves infeasibility.

Next node explored is 241 (parent 239),

Node is already conquered. No children will be generated.
***************************************************************************************
**** BaB tree node N° 241, parent N° 239, depth 145, 131 open nodes
**** Local DB = 27234.3996, global bounds : [ 26871.7802 , Inf ], time = 113.03 sec.
**** Branching constraint: y[3,17]<=0.0
***************************************************************************************
 sync_solver Coluna.MathProg.Formulation{Coluna.MathProg.DwMaster} 
> MC_2000561.
ERROR: LoadError: The index MathOptInterface.VariableIndex(-1) is invalid. Note that an index becomes invalid after it has been deleted

The subsolver is synchronized. We note that the variable MC_2000561 is not added to the subsolver because MC_2000561 is not a variable of a parent node. However, we try to update its cost. Since the variable is not in the subsolver we get the error.

Fix : do not update cost

guimarqu commented 3 years ago

@rrsadykov should we flush all buffers at the end of the each node ?

rrsadykov commented 3 years ago

I do not understand what is "subsolver", and also what is the "buffer" and what does "flush all buffers" means

guimarqu commented 3 years ago

I do not understand what is "subsolver", and also what is the "buffer" and what does "flush all buffers" means

rrsadykov commented 3 years ago

Normally, when passing from node to a node, the state of the master columns storage unit should be restored, thus the column in question should be deleted. Is the bug in this part? For me, as the column is not present in the formulation, the storage unit is restored correctly.

Unfortunately, I do not control the buffer part, as I did not implement it. I do not understand why the cost of the column is updated if this variable is not present in the formulation. It is not marked as non-active?

guimarqu commented 3 years ago

Yes it's normal that you don't control the buffer part. The buffer should automatically store changes done to the formulation.

In the buffer, variables added to the formulation and changes made on the cost of the variable are stored separately (fields changed_cost and vars_buffer) : https://github.com/atoptima/Coluna.jl/blob/43c44a43bbb9bbb220ad3e86aa1fbfd2410560e1/src/MathProg/buffer.jl#L48-L59

When one deactivates a variable, the variable is removed from the var_buffer but not removed from changed_cost. This is the reason of the bug.

So I think it's time to start unit tests on the buffer.

To reproduce this bug :

hhkramer commented 3 years ago

Ok I think I spotted the bug :

In node 240 (parent 239),

***************************************************************************************
**** BaB tree node N° 240, parent N° 239, depth 145, 132 open nodes
**** Local DB = 27234.3996, global bounds : [ 26871.7802 , Inf ], time = 112.73 sec.
**** Branching constraint: y[3,17]>=1.0
***************************************************************************************
 sync_solver Coluna.MathProg.Formulation{Coluna.MathProg.DwMaster} 
  <it=  1> <et=112.89> <mst= 0.03> <sp= 0.13> <cols=10> <al= 0.00> <DB=27234.3996> <mlp=27119.7846> <PB=Inf>
[ Info: Column generation algorithm has converged.
 sync_solver Coluna.MathProg.Formulation{Coluna.MathProg.DwMaster} 
 created MC_2000561 
# <it=  1> <et=113.03> <mst= 0.06> <sp= 0.08> <cols= 5> <al= 0.00> <DB=  511.4825> <mlp=   26.4772> <PB=Inf>
[ Info: Phase one determines infeasibility.

A column MC_2000561 is created (and added to the master's buffer) just before column generation proves infeasibility.

Next node explored is 241 (parent 239),

Node is already conquered. No children will be generated.
***************************************************************************************
**** BaB tree node N° 241, parent N° 239, depth 145, 131 open nodes
**** Local DB = 27234.3996, global bounds : [ 26871.7802 , Inf ], time = 113.03 sec.
**** Branching constraint: y[3,17]<=0.0
***************************************************************************************
 sync_solver Coluna.MathProg.Formulation{Coluna.MathProg.DwMaster} 
> MC_2000561.
ERROR: LoadError: The index MathOptInterface.VariableIndex(-1) is invalid. Note that an index becomes invalid after it has been deleted

The subsolver is synchronized. We note that the variable MC_2000561 is not added to the subsolver because MC_2000561 is not a variable of a parent node. However, we try to update its cost. Since the variable is not in the subsolver we get the error.

Fix : do not update cost

Hi @guimarqu .

Did you reached the error using the default pricing solver by MIP in Coluna or the one in prod_plan_pricing_callback ?

Please consider to use the default from Coluna since I spotted a bug in my pricing callback that may lead to the addition of wrong columns. I will create a new merge request with the fixing soon.

This is the output using the default pricing MIP solver at the same point you reached the bug:

  <it=  2> <et=273.81> <mst= 0.05> <sp= 0.09> <cols= 0> <al= 0.00> <DB=28925.9000> <mlp=28925.9000> <PB=Inf>
[ Info: Column generation algorithm has converged.
[ Info: No branching candidates found. No children will be generated.
***************************************************************************************
**** BaB tree node N° 239, parent N° 233, depth 125, 121 open nodes
**** Local DB = 28096.5641, global bounds : [ 26871.7802 , Inf ], time = 273.82 sec.
**** Branching constraint: v<=0.0
***************************************************************************************
  <it=  1> <et=274.10> <mst= 0.12> <sp= 0.10> <cols= 5> <al= 0.00> <DB=28096.5641> <mlp=28110.7174> <PB=Inf>
  <it=  2> <et=274.27> <mst= 0.04> <sp= 0.12> <cols= 3> <al= 0.00> <DB=28096.5641> <mlp=28110.7174> <PB=Inf>
  <it=  3> <et=274.40> <mst= 0.04> <sp= 0.09> <cols= 1> <al= 0.00> <DB=28096.5641> <mlp=28110.7174> <PB=Inf>
  <it=  4> <et=274.52> <mst= 0.02> <sp= 0.09> <cols= 0> <al= 0.00> <DB=28110.7174> <mlp=28110.7174> <PB=Inf>
[ Info: Column generation algorithm has converged.
***************************************************************************************
**** BaB tree node N° 240, parent N° 239, depth 126, 122 open nodes
**** Local DB = 28110.7174, global bounds : [ 26871.7802 , Inf ], time = 274.54 sec.
**** Branching constraint: v>=1.0
***************************************************************************************
  <it=  1> <et=274.68> <mst= 0.03> <sp= 0.09> <cols= 0> <al= 0.00> <DB=28111.3429> <mlp=28111.3429> <PB=Inf>
[ Info: Column generation algorithm has converged.
***************************************************************************************
**** BaB tree node N° 241, parent N° 240, depth 127, 123 open nodes
**** Local DB = 28111.3429, global bounds : [ 26871.7802 , Inf ], time = 274.71 sec.
**** Branching constraint: v>=1.0
***************************************************************************************
  <it=  1> <et=274.87> <mst= 0.05> <sp= 0.09> <cols= 0> <al= 0.00> <DB=28117.9000> <mlp=28117.9000> <PB=Inf>
[ Info: Column generation algorithm has converged.
[ Info: No branching candidates found. No children will be generated.
rrsadykov commented 3 years ago

the variable is removed from the var_buffer but not removed from changed_cost

Ok, then one just needs to remove it from changed_cost

guimarqu commented 3 years ago

the variable is removed from the var_buffer but not removed from changed_cost

Ok, then one just needs to remove it from changed_cost

Yes and test because we might have other hidden bugs.

guimarqu commented 3 years ago

@hhkramer The bug is independant from the pricing solver so the fix will work for both cases.

hhkramer commented 3 years ago

OK, thanks @guimarqu !

Just as an update, after fixing the bug in my pricing callback I could not reach the error again. It is running for more than an hour and 1500+ nodes were solved.