Topology - possible performance improvements in graph searches

dlohmeier commented 5 years ago

I have had a look at the functions connected_component and connected_components from the topology package again and thought about some performance improvement. There were two things that crossed my mind:

In the method connected_components a large part of time is wasted on the cornercases of having notravbuses. I suggest the following small change:

def connected_components(mg, notravbuses=set()):
    nodes = set(mg.nodes()) - notravbuses
    while nodes:
        cc = set(top.connected_component(mg, nodes.pop(), notravbuses=notravbuses))
        yield cc
        nodes -= cc
    # the above does not work if two notravbuses are directly connected
    if len(notravbuses) > 0:
        for f, t in mg.edges(notravbuses):
            if f in notravbuses and t in notravbuses:
                yield set([f, t])

I simply added the notravbuses to mg.edges() in the second loop. This would usually reduce the time drastically, as it searches only some of the edges. Also the if-statement could be left out then, as in case the argument is empty, an empty iterator is returned, so nothing happens.

The function connected_component spends a large share of time in appending and reading from the created stack list. As the returned iterator cannot be expected to take a specific order, it might make sense to use some set functions. I have implemented a recursive method that collects visited components in a set. It is this way slightly faster than connected_component for the cases that I tested it on.
```
def find_connected(mg, bus, notravbuses=set(), visited=None):
    visited = set() if visited is None else visited
    if bus in visited:
        return visited
    visited.add(bus)
    if bus in notravbuses:
        return visited
    for vis in mg[bus]:
        visited = find_connected(mg, vis, notravbuses, visited)
    return visited
```

So testing the timings delivers the following results. I always compare the initial implementation with the improvement of adding notravbuses as suggested in (1.) and then using the recursive find_connected method.

    import pandapower.networks as nw
    import pandapower.topology as top
    import copy

    def find_connected(mg, bus, notravbuses=set(), visited=None):
        visited = set() if visited is None else visited
        if bus in visited:
            return visited
        visited.add(bus)
        if bus in notravbuses:
            return visited
        for vis in mg[bus]:
            visited = find_connected(mg, vis, notravbuses, visited)
        return visited

    def connected_components(mg, notravbuses=set()):
        nodes = set(mg.nodes()) - notravbuses
        while nodes:
            cc = set(top.connected_component(mg, nodes.pop(), notravbuses=notravbuses))
            yield cc
            nodes -= cc
        # the above does not work if two notravbuses are directly connected
        for f, t in mg.edges(notravbuses):
            if f in notravbuses and t in notravbuses:
                yield set([f, t])

    def connected_components2(mg, notravbuses=set()):
        nodes = set(mg.nodes()) - notravbuses
        while nodes:
            cc = find_connected(mg, nodes.pop(), notravbuses=notravbuses)
            yield cc
            nodes -= cc
        # the above does not work if two notravbuses are directly connected
        for f, t in mg.edges(notravbuses):
            if f in notravbuses and t in notravbuses:
                yield set([f, t])

    def check_correctness(components_correct, components_checked):
        component_copy = copy.deepcopy(components_checked)
        for cc in components_correct:
            c2 = [c for c in component_copy if c == cc]
            assert len(c2) == 1
            c2 = c2[0]
            component_copy.pop(component_copy.index(c2))

    net = nw.mv_oberrhein()
    mg = top.create_nxgraph(net)

    # first test with empty notrav_buses
    notrav_buses1 = set()
    connected1 = list(top.connected_components(mg, notrav_buses1))
    connected2 = list(connected_components(mg, notrav_buses1))
    connected3 = list(connected_components2(mg, notrav_buses1))
    check_correctness(connected1, connected2)
    check_correctness(connected1, connected3)

    # notravbuses filled
    notrav_buses2 = set(net.switch.loc[net.switch.type == "CB", "bus"])
    connected4 = list(top.connected_components(mg, notrav_buses2))
    connected5 = list(connected_components(mg, notrav_buses2))
    connected6 = list(connected_components2(mg, notrav_buses2))
    check_correctness(connected4, connected5)
    check_correctness(connected4, connected6)

    # timings with Ipython
    % timeit connected1 = list(top.connected_components(mg, notrav_buses1))
    % timeit connected2 = list(connected_components(mg, notrav_buses1))
    % timeit connected3 = list(connected_components2(mg, notrav_buses1))
    % timeit connected4 = list(top.connected_components(mg, notrav_buses2))
    % timeit connected5 = list(connected_components(mg, notrav_buses2))
    % timeit connected6 = list(connected_components2(mg, notrav_buses2))

    433 µs ± 8.53 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    436 µs ± 9.27 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    333 µs ± 5.02 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    571 µs ± 3.67 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    446 µs ± 5.23 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    342 µs ± 12.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

So in case notravbuses are given, the first suggestion decreases the timing from 571 to 446 µs, otherwise it is roughly the same. The recursive implementation of connected component can decrease the timing from ~440µs to ~335µs. Yet I can imagine that there are still better, clearer and maybe more performant ways to do this.

It might be worthwhile checking if this can be implemented, as it makes some difference in large scale analyses.

ascheidl commented 5 years ago

(1) good idea

(2) CPython has a recursion deph limit (default 1000?). A recursive implementation consequently breaks at graphs with larger components. So we should avoid recursion here.

Moreover, appending and popping elements at the stack list should not be the bottleneck. Indeed, I tried to use a deque instead (which has O(1) access to first and last element) and it didnt help a lot.

However, I reimplemented connected_component without using exceptions. This implementation is on-par performance wise with your recusive approach.

I will integrate it soon.

dlohmeier commented 5 years ago

I opened a pull request #267 to implement the changes with the for-loop and the call to mg.edges with notravbuses. I chose the deque implementation over the list, in my tests it was just about 3% faster. I had also thought about whether or not the iterator iter(mg[bus]) is really required, as networkx 2.x returns an AdjacencyView when calling mg[bus]. However, with respect to timing I did not find any difference really. It seems like memory usage is the same with or without iter, but I am not sure, as I didn't manage to run a memory profiler properly.

lthurner commented 5 years ago

Can this be closed now that #267 has been merged?

dlohmeier commented 5 years ago

Yes, definitely :-)

e2nIEE / pandapower

Topology - possible performance improvements in graph searches #256