Error in augmenting flow calculation

suddenlyAstral commented 1 year ago

Hi!

I'm using this (with credit) for a benchmarking assignment and I noticed a bug in augmenting flow calculation.

Because flow_path is created as a zip it's an iterable. This means that in augment_flow, after being read for bottleneck calculation, flow_path is emptied and the for loop that would actually augment the path runs over an empty iterator and halts immediately. An easy fix is to list the zip before calling augment_flow

https://github.com/odubno/ford-fulkerson-max-flow/blob/e3fd2487cf2ef8e2218b634e11808dda241a4c9b/ford_fulkerson.py#L38 ... https://github.com/odubno/ford-fulkerson-max-flow/blob/e3fd2487cf2ef8e2218b634e11808dda241a4c9b/ford_fulkerson.py#L13-L15

I'm just leaving this here to clear confusion for future people

suddenlyAstral commented 1 year ago

Also, the algorithm as implemented is actually Edmonds-Karp, which is more efficient and a special case of FF

When calculating a path source->sink FF uses any algorithm, while EK uses BFS specifically (as is the case here)

odubno commented 1 year ago

Hi, thanks for the thoughtful review! Some thoughts below

This means that in augment_flow, after being read for bottleneck calculation, flow_path is emptied and the for loop that would actually augment the path runs over an empty iterator and halts immediately.

flow_path is overwritten at each while loop, so even if it is getting modified in-place, it doesn't matter, it will get overwritten anyways at the next loop. The goal of flow_path is to update graph.
1. could you link to where flow_path is getting emptied? u and v values are being read from flow_path to then update graph that's returned by augment_flow(), then the while loop continues to overwrite flow_path, other than that i don't see any direct updates to flow_path.
If we wrap the zip() in a list() i.e. list(zip(path[:-1], path[1:])), the problem you mention should still persist. But I don't see where flow_path is updated other than inside the while loop i.e. by getting overwritten.

the algorithm as implemented is actually Edmonds-Karp You're totally right, thanks for catching that!

I kinda wish I wrote a test for this, :( when I did put this together. Unfortunately, Universities are slow to instill a unit-testing culture

suddenlyAstral commented 1 year ago

Surprised you answered, and quite prompty too.

1) flow_path is being emptied multiple times within a single while loop. bottleneck = min([graph[u][v]['capacity'] for u, v in flow_path]) reads it the first time. for u, v in flow_path: ... reads the second.

looping on iterators is compiled/interpreted like so:

for element in iterator_instance:
      do_thing(element)

turns into:

try:
     while True:
           element = next(iterator_instance)
           do_thing(element)
except StopIteration:  # this is the error raised when calling next() on an empty iterator
     pass

so making bottleneck in the syntactic-sugar-for-loop empties the iterator flow_path, and then the normal for loop leaves the while True immediately without updating capacity, etc.

2) First, a primer on iterators and generators: The reason iterators work like that is so they could work with generators/lazy evaluation, a paradigm where you return the i'th element without ever calculating the i+1'th element. This is useful when any of the following apply:

It's possible we will stop before the last element, so no use calculating them all
There are unlimited elements, e.g: iterate over primes
All elements won't fit into memory at once
calculating each element is heavy, so you'd like to start processing element 0 and getting partial results before/while calculating element 1 in the background

The reason turning it into a list solves the issue is that you read it once into the list (so full list, empty iterator), and then each time you use it you iterate over a list, which doesn't empty it.

(If you wanna get really technical, iterating over list actually iters over iter(list) which is an iterator object created internally which just returns list[i]. This is why in the implied while loop the error is StopIteration and not OutOfIndex. Every time you iterate a list you create a new iterator.)

Here are 2 great short videos on the subject: generators iterators

odubno commented 1 year ago

You're totally right, I didn't realize that zip() is actually an iterator. fixed.

odubno / ford-fulkerson-max-flow

Error in augmenting flow calculation #1