CSPY for branch-bound-and-price algorithm

torressa / cspy

A collection of algorithms for the (Resource) Constrained Shortest Path problem in Python / C++ / C#

https://torressa.github.io/cspy/

MIT License

77 stars 24 forks source link

CSPY for branch-bound-and-price algorithm #98

Closed the-soomin-woo closed 1 year ago

the-soomin-woo commented 2 years ago

Hi, @torressa!

I am Soomin, and I have asked you a few questions already and I appreciate what you have done so far so much! I have some questions, which I have asked already for VRPY site as well: link.

I am applying cspy and vrpy for a branch-bound-and-price algorithm to solve a vehicle routing problem. My objective function is the total energy consumption and I have a heterogeneous fleet of hybrid electric vehicles. The edge costs are the energy consumptions, which depend on the current battery level of the vehicle. (At a low level, vehicle can't use battery and uses fuel instaed, consuming more energy. At a high level, battery is used with smaller energy. This is because a motor is more efficient than an engine.)

I have designed my own forward algorithm (no bidirectional or backward as the energy consumption should depend on the current battery level). On top of the monotonic, time, load, time windows, collect, and deliver resources, I designed some more resources. The resources include battery use, fuel use, battery/fuel level feasibility variables (=1 if infeasible, min and max at 0). Also, another resource is "the actual cost", which I made to pass through to the C++ code, because the current cspy uses a fixed edge cost (or edge weight). I made this arbitrarily so that I can calculate the edge weight to vary with the current battery level. I modified the C++ code in calculating the edge weight. Specifically, I changed from:

Label new_label( weight + adjacent_vertex.weight, ...
)

in the labelling.cc to

Label new_label( weight + new_resources[1] + adjacent_vertex.weight, ...
)

where new_resources[1] is the "the actual cost" that I pass through cspy forward algorithm and the 'weight' includes the dual values only. I kept the dominance check the same as it comes with cspy originally.

When I implement this, I have two odd results.

The column generation with a different initialization of the routes results in a different objective function value at the end with linear relaxation.
During the branch-bound-and-price search, I find that the objective function value with linear relaxation (the lower bound) sometimes improves with more bounding, i.e., the child node has a better objective value than the parent's node. For example, a parent node is the column generation without any additional constraints on an edge flow and its child node is the column generation with a constraint to visit the edge ('Source', 1). The child node has a better value than the parent node.

Because of these results, I believe that my version of the labeling algorithm solution is not optimal or exact. I am not using any other solver but cspy, and I am not ending the column generation at an arbitrary condition. I solve until no route is found with a negative reduced cost. (In the beginning, VRPY uses a 'heuristic' version of cspy that removes some unpromising edges from the graph. But VPRY only ends if no route is found with a negative reduced cost with the original graph.)

I am really stuck at this issue. Could you please help me or point me to where I might look into?

Thank you so much and I really appreciate your help.

Soomin

torressa commented 2 years ago

Hi again @soomin-woo!

Hmm, I'm not sure whether the standard dominance rules would work here. As your weight function depends on resources, not sure that we can use the weight criteria in the dominance check. I'll have a think about this, but in the mean time, you can check the following resources:

Irnich and Desaulniers (2005) general REFs and parts about piecewise linear costs and the SPPTWTC (very similar).
Ahmadi et al. (2021) very similar as well as they have a heuristic weight like in the A* shortest path algorithm. (The guys have some code linked in #67 if you want to have a look). I don't think they use the weight in the dominance checks but they have weight/cost bounding (which is also implemented here).

To play around with this just disable the weight checks in src/cc/labelling.cc function checkDominance. e.g. comment out these: https://github.com/torressa/cspy/blob/676c6aa03e0fc0fd1aa29d22681b343d0237a0e5/src/cc/labelling.cc#L164-L178

Also, just to be sure, please checkout #94. Maybe try and merge the elementary-dominance-check branch into yours to ensure that the elementary dominance check is correct. You might get some perfomance issues as now elementary=True takes much longer. I am working (slowly, but nearly there) on using non-elementary routes in VRPy which actually might be faster than before. I'll try and push a working branch this weekend.

I think with this, and the branching, we'll hopefully narrow down the weird behavior.

the-soomin-woo commented 2 years ago

Hi @torressa!

I think you're right, the dominance rule seems to be the most plausible explanation. I have checked out your recommended papers, and one more here, which proposes a labelling algorithm for the time-dependent travel time (edge costs).

I think this study is very similar to what I am trying to achieve, because their edge cost (travel time) depends on the resource state at the label (elapsed time). I can make my edge cost (energy consumption) to depend on the resource state at the label (current energy level). Their travel time function is piecewise linear, and I can make the energy consumption also a piecewise linear function of the current energy level. They point out some weaknesses of the conventional (plain? classic?) dominance rule for the problem. They design their custom dominance rules, which I hope to modify to fit to my problem!

Just FYI, I don't think the problem is the branch and bound algorithm. I checked again the bounding constraints and the graph structure, they should be quite alright. I also have the problem where I get a different column generation result with a different set of initial routes, so I suspect the labeling algorithm is behaving weirdly.

I will let you know when I propose a solution and have some progress.

By the way, do you know if cspy is buildable on Mac OS Monterey? If you remember, there was a problem with Python in Mac OS Big Sur and I had trouble building CSPY(link). I have been delaying my OS upgrade or another build of cspy just because of this (I'm not so CS savvy). I solved this problem by creating a python environment using pyenv last time (link). Do you have any recommendations on how to rebuild cspy for Monterey?

I can't tell you how much I appreciate your help, really. I am doing this research alone and I felt very stuck - your feedback is immensely helpful. I hope to deliver good news soon!

Best, Soomin

torressa commented 2 years ago

Cool! Will checkout the paper if I can.

Just FYI, I don't think the problem is the branch and bound algorithm. I checked again the bounding constraints and the graph structure, they should be quite alright.

Nice! Was just worth checking first, but yeh after I had that realisation, it is definitely the weight thing. Would be really cool if you could contribute it to VRPy when you can.

By the way, do you know if cspy is buildable on Mac OS Monterey?

I know for sure you can build it using the same steps as you outlined in https://github.com/torressa/cspy/issues/88#issuecomment-923426233, I've done it myself without any hassle (other than the installing the build deps: cmake, pyenv, etc). The only thing I found was cause I'm on an M1, check that the python versions you download work, as ow pyenv spits an error when trying to install.

I can't tell you how much I appreciate your help, really. I am doing this research alone and I felt very stuck - your feedback is immensely helpful. I hope to deliver good news soon!

Look forward to hearing more! Glad to hear I can help 😄 I know from experience it can be very tough.

the-soomin-woo commented 2 years ago

Hi @torressa!

I have found myself debugging almost from scratch, i.e., applying the state-varying edge cost. I wondered.. Is the python source code still valid? Do you know if it still works with vrpy? I don't know how to debug cspy in c++ and vrpy in python together. Even if the python cspy is slow, if I can get it working there and debug there faster, I will be able to implement in c++.

Thank you as always.

Soomin

torressa commented 2 years ago

The python code is still valid, what I do to run with vrpy:

Build the python package (as you already know)
Uninstall cspy from the machine

Tell vrpy where cspy is located. Before this line I add

import sys
sys.path.append("/full/path/to/cspy/build/python")

Then everytime you recompile the C++ and python code VRPy will be using the up-to-date version

the-soomin-woo commented 2 years ago

Hi @torressa, thank you! Just double checking again.. the bidrectional algorithm written in python uses the label extension from C++, right?

Best, Soomin

torressa commented 2 years ago

Just double checking again.. the bidrectional algorithm written in python uses the label extension from C++, right?

The only thing Python has control over is the REFCallbacks (if there are any registered) which simply return the resource array for a given edge, this is part of the extension.

Just for reference, the question has been asked on OR-SE and it's pretty clear we have a bug here.

To make sure that it is not an already known bug, can you please confirm you are using the branch I mentioned https://github.com/torressa/cspy/issues/98#issuecomment-1059073254.

Furthermore, can you also try removing the resource-based weight extension as discussed in https://github.com/Kuifje02/vrpy/issues/124#issuecomment-1060023118, this changes the structure of the pricing problems and is likely to be the source of the issue.

... the weight for each edge is being altered on the fly (dependant on resources), you may get a route which the final cost is falsely positive (or negative), which means that your linear relaxion is not (may not) being solved to optimality. ...

Give this a try and let's see if the problem with the weird behaviour still persists when branching. If it persists, you can also share this with me in some way we can try and narrow down the issue.

MariaSKAF commented 2 years ago

Hi @soomin-woo, did you manage to update the weights on the graph, while the algorithm is running?

torressa commented 2 years ago

Hi @torressa, I found that the bug was in the pricing problem. Below are the steps I followed to show that the bug was in the pricing problem, referring to the last comment by Ruslan Sadykov here.
1. At the root of branch-and-bound, I process the column generation until no negative reduced cost is found. Call the objective function value at the root as 'lower bound at root'

2. At the root, I save the dual values at the last iteration of the column generation (with linear relaxation). Call this 'root duals'. For these dual values, the pricing problem fails to find a negative column and the column generation terminates.

3. I branch on the root, ex) left branch to use ['Source', 1] and right branch to ban ['Source', 1], and solve the column generation for each branch.

4. I found that in one branch , one of the columns generated actually has a negative reduced cost with respect to dual at root. I calculate the negative reduced cost as cost of route - sum('root duals' for all nodes visited in the route). This is a bug because that column was not found at the root, though it has a negative reduced cost. Also, this branch has an objective function value _better_ than the lower bound at root. Note that the negative column in question was not selected in the decimal solution in that branch.
Currently, I am looking at the root node to figure out why the the negative column found after branching was not found at the root. For example, that negative column found after branching was ['Source', 14, 10, 12, 6, 1, 8, 2, 3, 9, 13, 5, 4, 7, 11, 'Sink']. (When it was found after branching, the duals were different to the root duals. Still, this route had a negative reduced cost w.r.t the root duals.) At the root, cspy only searches up to ['Source', 14, 10, 12, 6, 1, 8, 2] and doesn't explore further.

I am trying to trace the bug to the exact place in the cspy but it's a bit difficult as the cspy lives in C++. I'm calling 'logger.info' on many variables at each REF call back but I feel I could use a better debugging method. Do you have any suggestions?

We talked before that maybe it's because the edge cost is a function of the resource state at the last label. But I actually don't think so because that negative column was found for a battery electric vehicle type, where the edge cost is the same regardless of the resource state. (Edge costs change only for a hybrid EV type, because the energy cost depends on whether it uses expensive fuel or cheaper battery.)

I know this is a lot to unpack, but I'd love to hear your opinion.

Thank you as always!

Soomin

Originally from @soomin-woo from https://github.com/Kuifje02/vrpy/issues/124

torressa commented 2 years ago

@soomin-woo

I am trying to trace the bug to the exact place in the cspy but it's a bit difficult as the cspy lives in C++. I'm calling 'logger.info' on many variables at each REF call back but I feel I could use a better debugging method. Do you have any suggestions?

In the debug-logger branch I've just pushed a bunch of debugging statements for the process using spdlog. To really narrow it down you're going to have to dig a little bit into the C++ code as you might have to go into the label.checkDominance, label.checkFeasibility, or extend (all defined in labelling.cc).

Are you are using direction="both"?

If so, you're going to have to debug the joining procedure as well (joinLabels in bidirectional.cc). But this can get ugly, so I would suggest setting direction="forward" to begin with. Just add more statements as seen in the latest commit

SPDLOG_DEBUG("label={}, i={}, s={}", label.getString(), 10, "test");

See the spdlog repo for more info on this call.

First, you will have to merge the branch into yours (fixing any conflicts if they appear). Then compile the C++ library and link to Python (as discussed in https://github.com/torressa/cspy/issues/98#issuecomment-1065117175). Making sure that it is the correct version.

To be able to read this logging from C++, you will want to capture this output to a file. The easiest way of doing it from python is as

python3 myprog.py >> out.log

Then you can read the out.log file with any editor and start debugging.

Additionally, since you already know what partial path of interest, you can add statements in the C++ code to give you more verbose output for those labels. e.g. within checkDominance:

if (partial_path.back() == 2)
  SPDLOG_DEBUG("check something specific")

We talked before that maybe it's because the edge cost is a function of the resource state at the last label. But I actually don't think so because that negative column was found for a battery electric vehicle type, where the edge cost is the same regardless of the resource state. (Edge costs change only for a hybrid EV type, because the energy cost depends on whether it uses expensive fuel or cheaper battery.)

I am not sure about this, as adding columns hybrid EV columns will affect future columns of the EV type. Hence the problem may still be being caused due to these. In any case, to aid with debugging, I think you should try simplifying/removing this as we are trying to narrow down a problem in the pricing algorithm, this additional factor is only going to confuse things. In my experience, non-standard resource extensions (defined in a callback because they can't be defined on an edge basis a priori) are typically the source of all evil. As I'm sure you've already found, they are very hard to debug.

torressa commented 2 years ago

@MariaSKAF noticed that you managed! Please note that this may lead to weird results (optimal solutions not guaranteed).

torressa commented 1 year ago

Closing due to lack of response. Feel free to reopen.