REFCallback is receiving non existing nodes

tomatoes-prog commented 3 years ago

Describe the bug

When I am trying to use Bidirectional algorithm the REFCallback object is receiving nonexistent nodes as head and the partial path seems to be weird.

I am using the Algorithm to solve a RCSPP with capacity and time windows for a column generation problem with 1423 edges and 105 (107 with 'Source' and 'Sink') nodes. The problems arises when I try to build my own REFCallback class and using the forward direction. The REF_fwd function is receiving tail and head as numeric values even if all the nodes are strings as shown in next image:

I would assume that it is just cast problem but at some moment I receive the head 0 which does not exist in the Graph and I dont know if that zero is the Source or the sink, or just a wrong node

This didnt happened until I upgrade the package to v1.0.0 with pip, but I had to do this because the previous version was too slow to find a optimal solution which is unpractical for column generation. Am I doing something wrong?

To Reproduce Steps to reproduce the behavior. Include minimal working example if available.

I will include the practical example I am using with a pickle graph that you can load with networkx, all the neccesary code is down below

`
from networkx import DiGraph from cspy import BiDirectional, REFCallback import networkx as nx

  class MyCallback(REFCallback):
          def __init__(self, arg1, arg2, max_res):
              # You can use custom arguments and save for later use
              REFCallback.__init__(self) # Init parent
              self.G: arg1
              self._arg2: bool = arg2
              self.max_res = max_res

          def REF_fwd(self, cumul_res, tail, head, edge_res, partial_path,cumul_cost):
              res_new = list(cumul_res)
              #Creciente
              res_new[0] += edge_res[0]
              #Capacidad
              res_new[1] += edge_res[1]
              print('Partial path:' + str(partial_path) + '   head:' + str(head))
              new_time = max(res_new[2] + edge_res[2] + 15/60 , G.nodes[str(head)]['a'] )
              res_new[2] = new_time
              if new_time > G.nodes[str(head)]['b']:
                  res_new[3] = 1
              return res_new

          def REF_bwd(self, cumul_res, tail, head, edge_res, partial_path,cumul_cost):
              res_new = list(cumul_res) # local copy
              return res_new

          def REF_join(self, fwd_resources, bwd_resources, tail, head, edge_res):
              final_res = np.zeros(len(self.max_res))
              return fwd_res

  #Loads the pickle with networkx
  G = nx.read_gpickle('graph_test.cspy')
  resources = ['vstQty', 'Q', 'time', 'time-windows']
  min_res = [0, 0, 0, 0]
  max_res = [20, 30, 24, 1]
  alg = BiDirectional(G, max_res=max_res, min_res=min_res, REF_callback=MyCallback(G, True, max_res),
                    direction='forward', elementary=True)
  alg.run()

` You can download the graph here

graph_test.zip

Expected behavior A clear and concise description of what you expected to happen. The head,tail should be string as the nodes in the graph, the partial path should have the 'Source' and 'Sink' nodes Desktop (please complete the following information):

OS: [Windows 10 x64]
Version [v1.0.0]

Suggestions How would you suggest the problem be fixed? Maybe the problem could be because I am using non-positive costs? maybe this is a problem with the connection to C++ backend with cpp?

tomatoes-prog commented 3 years ago

I also noted that the version cspy==1.0.0a0 works as I expected, but i dont know what changed with the 1.0.0 version. I would like to know if there is a performance improvement on alpha version vs 1.0.0 version.

torressa commented 3 years ago

Hi @tomatoes-prog!

REF conversion

Sorry I should've been clearer in the docs and release notes, in v1.0 internally I run nx.convert_node_labels_to_integers before I pass the graph to algorithm (it's just easier than handling different types as I'm not using a custom graph implementation any more).

So, what I do, is before calling .run() I get the converted graph (and source and sink ids if you need them) and pass them to the callback. For example,

class MyCallback(REFCallback):
    def __init__(self, max_res):
        # You can use custom arguments and save for later use
        REFCallback.__init__(self)  # Init parent
        self.max_res = max_res
        # set later
        self._G = None
        self._source_id = None
        self._sink_id = None

    # Define functions appropriately (no need to define the ones you are not using)

my_callback = MyCallback(max_res)

alg = BiDirectional(G,
                    max_res=max_res,
                    min_res=min_res,
                    REF_callback=my_callback,
                    direction='forward',
                    elementary=True)

# set callback attributes
my_callback._G = alg.G
my_callback._source_id = alg._source_id
my_callback._sink_id = alg._sink_id

alg.run()

# Returns path using original labelling
p = alg.path()

This is what I did in vrpy (see here). In the REFCallback, you can query node (and edge) attributes normally (see here) using the REF arguments which is alright.

This is a work-around so would welcome a PR with a fix.

Performance difference

I haven't got a direct comparison from v1.0-alpha and v1.0 but you can see the evolution for the benchmarks in #65. In general direction="forward" and "both" should be faster, significantly faster in some cases (see same issue and #66)

I'd be happy to hear about your experience.

Also, if you're struggling with performance and your graph is not that big, it's probably due to the looseness or poor definition of the resources (e.g. you can try using 'Q' or 'time' instead of 'vstQty' as the first resource, as long as they are increasing). Another thing you can try are some pricing heuristics, you can read more on this in the vrpy docs they are implemented in the repo too (in vrpy/subproblem.py)

torressa commented 2 years ago

Closing for now

torressa / cspy

REFCallback is receiving non existing nodes #77

REF conversion

Performance difference