Closed hhkramer closed 3 years ago
I need to reproduce the bug on my computer to fix it. Can you give me the name of the instance you try to optimize ?
Hi, Guillaume. You need to run with the Coluna parameters set as above in file run.jl of CapacitatedLotSizing application. To run, you can use:
julia src/CapacitatedLotSizing/scripts/runsingleinstance.jl X11119A
Please note it will take a few minutes to get to this error message. I am not sure, but I guess this happens when the algorithm reaches some stopping criterium, such as node limit, time limit etc.
Also, I get the same error if using the default depth first strategy.
I could reproduce the bug, thank you.
Ok I think I spotted the bug :
In node 240 (parent 239),
***************************************************************************************
**** BaB tree node N° 240, parent N° 239, depth 145, 132 open nodes
**** Local DB = 27234.3996, global bounds : [ 26871.7802 , Inf ], time = 112.73 sec.
**** Branching constraint: y[3,17]>=1.0
***************************************************************************************
sync_solver Coluna.MathProg.Formulation{Coluna.MathProg.DwMaster}
<it= 1> <et=112.89> <mst= 0.03> <sp= 0.13> <cols=10> <al= 0.00> <DB=27234.3996> <mlp=27119.7846> <PB=Inf>
[ Info: Column generation algorithm has converged.
sync_solver Coluna.MathProg.Formulation{Coluna.MathProg.DwMaster}
created MC_2000561
# <it= 1> <et=113.03> <mst= 0.06> <sp= 0.08> <cols= 5> <al= 0.00> <DB= 511.4825> <mlp= 26.4772> <PB=Inf>
[ Info: Phase one determines infeasibility.
A column MC_2000561
is created (and added to the master's buffer) just before column generation proves infeasibility.
Next node explored is 241 (parent 239),
Node is already conquered. No children will be generated.
***************************************************************************************
**** BaB tree node N° 241, parent N° 239, depth 145, 131 open nodes
**** Local DB = 27234.3996, global bounds : [ 26871.7802 , Inf ], time = 113.03 sec.
**** Branching constraint: y[3,17]<=0.0
***************************************************************************************
sync_solver Coluna.MathProg.Formulation{Coluna.MathProg.DwMaster}
> MC_2000561.
ERROR: LoadError: The index MathOptInterface.VariableIndex(-1) is invalid. Note that an index becomes invalid after it has been deleted
The subsolver is synchronized. We note that the variable MC_2000561
is not added to the subsolver because MC_2000561
is not a variable of a parent node.
However, we try to update its cost. Since the variable is not in the subsolver we get the error.
Fix : do not update cost
@rrsadykov should we flush all buffers at the end of the each node ?
I do not understand what is "subsolver", and also what is the "buffer" and what does "flush all buffers" means
I do not understand what is "subsolver", and also what is the "buffer" and what does "flush all buffers" means
Normally, when passing from node to a node, the state of the master columns storage unit should be restored, thus the column in question should be deleted. Is the bug in this part? For me, as the column is not present in the formulation, the storage unit is restored correctly.
Unfortunately, I do not control the buffer part, as I did not implement it. I do not understand why the cost of the column is updated if this variable is not present in the formulation. It is not marked as non-active?
Yes it's normal that you don't control the buffer part. The buffer should automatically store changes done to the formulation.
In the buffer, variables added to the formulation and changes made on the cost of the variable are stored separately (fields changed_cost
and vars_buffer
) :
https://github.com/atoptima/Coluna.jl/blob/43c44a43bbb9bbb220ad3e86aa1fbfd2410560e1/src/MathProg/buffer.jl#L48-L59
When one deactivates a variable, the variable is removed from the var_buffer
but not removed from changed_cost
. This is the reason of the bug.
So I think it's time to start unit tests on the buffer.
To reproduce this bug :
Ok I think I spotted the bug :
In node 240 (parent 239),
*************************************************************************************** **** BaB tree node N° 240, parent N° 239, depth 145, 132 open nodes **** Local DB = 27234.3996, global bounds : [ 26871.7802 , Inf ], time = 112.73 sec. **** Branching constraint: y[3,17]>=1.0 *************************************************************************************** sync_solver Coluna.MathProg.Formulation{Coluna.MathProg.DwMaster} <it= 1> <et=112.89> <mst= 0.03> <sp= 0.13> <cols=10> <al= 0.00> <DB=27234.3996> <mlp=27119.7846> <PB=Inf> [ Info: Column generation algorithm has converged. sync_solver Coluna.MathProg.Formulation{Coluna.MathProg.DwMaster} created MC_2000561 # <it= 1> <et=113.03> <mst= 0.06> <sp= 0.08> <cols= 5> <al= 0.00> <DB= 511.4825> <mlp= 26.4772> <PB=Inf> [ Info: Phase one determines infeasibility.
A column
MC_2000561
is created (and added to the master's buffer) just before column generation proves infeasibility.Next node explored is 241 (parent 239),
Node is already conquered. No children will be generated. *************************************************************************************** **** BaB tree node N° 241, parent N° 239, depth 145, 131 open nodes **** Local DB = 27234.3996, global bounds : [ 26871.7802 , Inf ], time = 113.03 sec. **** Branching constraint: y[3,17]<=0.0 *************************************************************************************** sync_solver Coluna.MathProg.Formulation{Coluna.MathProg.DwMaster} > MC_2000561. ERROR: LoadError: The index MathOptInterface.VariableIndex(-1) is invalid. Note that an index becomes invalid after it has been deleted
The subsolver is synchronized. We note that the variable
MC_2000561
is not added to the subsolver becauseMC_2000561
is not a variable of a parent node. However, we try to update its cost. Since the variable is not in the subsolver we get the error.Fix : do not update cost
Hi @guimarqu .
Did you reached the error using the default pricing solver by MIP in Coluna or the one in prod_plan_pricing_callback
?
Please consider to use the default from Coluna since I spotted a bug in my pricing callback that may lead to the addition of wrong columns. I will create a new merge request with the fixing soon.
This is the output using the default pricing MIP solver at the same point you reached the bug:
<it= 2> <et=273.81> <mst= 0.05> <sp= 0.09> <cols= 0> <al= 0.00> <DB=28925.9000> <mlp=28925.9000> <PB=Inf>
[ Info: Column generation algorithm has converged.
[ Info: No branching candidates found. No children will be generated.
***************************************************************************************
**** BaB tree node N° 239, parent N° 233, depth 125, 121 open nodes
**** Local DB = 28096.5641, global bounds : [ 26871.7802 , Inf ], time = 273.82 sec.
**** Branching constraint: v<=0.0
***************************************************************************************
<it= 1> <et=274.10> <mst= 0.12> <sp= 0.10> <cols= 5> <al= 0.00> <DB=28096.5641> <mlp=28110.7174> <PB=Inf>
<it= 2> <et=274.27> <mst= 0.04> <sp= 0.12> <cols= 3> <al= 0.00> <DB=28096.5641> <mlp=28110.7174> <PB=Inf>
<it= 3> <et=274.40> <mst= 0.04> <sp= 0.09> <cols= 1> <al= 0.00> <DB=28096.5641> <mlp=28110.7174> <PB=Inf>
<it= 4> <et=274.52> <mst= 0.02> <sp= 0.09> <cols= 0> <al= 0.00> <DB=28110.7174> <mlp=28110.7174> <PB=Inf>
[ Info: Column generation algorithm has converged.
***************************************************************************************
**** BaB tree node N° 240, parent N° 239, depth 126, 122 open nodes
**** Local DB = 28110.7174, global bounds : [ 26871.7802 , Inf ], time = 274.54 sec.
**** Branching constraint: v>=1.0
***************************************************************************************
<it= 1> <et=274.68> <mst= 0.03> <sp= 0.09> <cols= 0> <al= 0.00> <DB=28111.3429> <mlp=28111.3429> <PB=Inf>
[ Info: Column generation algorithm has converged.
***************************************************************************************
**** BaB tree node N° 241, parent N° 240, depth 127, 123 open nodes
**** Local DB = 28111.3429, global bounds : [ 26871.7802 , Inf ], time = 274.71 sec.
**** Branching constraint: v>=1.0
***************************************************************************************
<it= 1> <et=274.87> <mst= 0.05> <sp= 0.09> <cols= 0> <al= 0.00> <DB=28117.9000> <mlp=28117.9000> <PB=Inf>
[ Info: Column generation algorithm has converged.
[ Info: No branching candidates found. No children will be generated.
the variable is removed from the
var_buffer
but not removed fromchanged_cost
Ok, then one just needs to remove it from changed_cost
the variable is removed from the
var_buffer
but not removed fromchanged_cost
Ok, then one just needs to remove it from
changed_cost
Yes and test because we might have other hidden bugs.
@hhkramer The bug is independant from the pricing solver so the fix will work for both cases.
OK, thanks @guimarqu !
Just as an update, after fixing the bug in my pricing callback I could not reach the error again. It is running for more than an hour and 1500+ nodes were solved.
Describe the bug When using Coluna to solve my application I get the following error after some nodes of the branch-and-price tree:
ERROR: LoadError: The index MathOptInterface.VariableIndex(-1) is invalid. Note that an index becomes invalid after it has been deleted.
I am using the default
TreeSearchAlgorithm
params except that Primal Heuristics are disabled.To Reproduce Coluna parameters are set as follows:
Expected behavior
Environment: