jamespayor / weighted-bipartite-perfect-matching

Fast C++ implementation of an O(NM) algorithm for the assignment problem.
MIT License
52 stars 11 forks source link

Big n #6

Open kmalinau opened 7 months ago

kmalinau commented 7 months ago

Hi @jamespayor , Thank you for developing this project. If it is still alive I have a suggestion and a question.

  1. When playing with big assignment problems I got a stack overflow due to arrays allocated on stack. I would suggest to replace them by std::vector allocating data in the heap.
  2. Do you think it is possible to modify the algorithm to support graph parts of different sizes $n_1 \le n_2$ ("rectangular matrix") without adding dummy nodes/edges? So that the perfect matching is the one covering all the nodes in the left part. In particular, I am interested in the case of $n_1 \ll n_2$ with big $n_2$ and wonder if we can get good performance if try to limit the number of edges.
jamespayor commented 7 months ago

Hi, thank you for the feedback!

Re (1), yeah that seems completely right, I'm not sure why I thought allocating those on the stack was a good idea.

Re (2), I'd bet it "just works" to switch to a general bipartite graph, and increase the matching until progress stops, at which point you'd have a min-weight maximal matching. This does seem better, especially since I think it should improve readability to differentiate both sides of the graph, and then inner loop will still do exactly the same work.

I'm busy right now but will hope to try these soon.

kmalinau commented 7 months ago

When playing with your code I realized that for the unbalanced problem case $n_1 \le n_2$ it is not only about changing the the right part bounds. I came up with more changes:

  1. Edge list initialization: remove rightEdgeCounts check from no solution detection.
  2. Node potential initialization: zeroing out all right potentials in the case $n_1 < n_2$, due to the fact that I considered zero-cost virtual dummy edges to $n_2 - n_1$ virtual dummy left nodes. Without doing that I got wrong results.

It seemed worked on my test cases but I did not try to prove it in general, just as I did not evaluate new complexity.