One possibility is to remove any candidates that no one prefers.
input = [
{ "count":1, "ballot":{"A":9, "B":1, "C":1, "D":9, "E":9, "F":2, "G":9, "H":9, "I":9, "J":9 }},
{ "count":1, "ballot":{"A":3, "B":2, "C":3, "D":1, "E":9, "F":9, "G":9, "H":9, "I":9, "J":9 }},
{ "count":1, "ballot":{"A":9, "B":9, "C":9, "D":9, "E":1, "F":9, "G":9, "H":9, "I":9, "J":9 }}
]
output = SchulzeSTV.calculate_winner(input, 5, "ranking")
could become
input = [
{ "count":1, "ballot":{"A":9, "B":1, "C":1, "F":2, "G":9, "H":9, "I":9, "J":9 }},
{ "count":1, "ballot":{"A":3, "B":2, "C":3, "F":9, "G":9, "H":9, "I":9, "J":9 }},
{ "count":1, "ballot":{"A":9, "B":9, "C":9, "F":9, "G":9, "H":9, "I":9, "J":9 }}
]
output = SchulzeSTV.calculate_winner(input, 3, "ranking")
which takes a lot less time. It seems like a poor patch that hides the real underlying performance issues, though.
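For the mechanical part, something like the following would do the candidate removal (prune_candidates is a made-up helper for illustration; the rule for deciding which candidates to drop, and how many seats to subtract, is the open question here):

# Hypothetical helper: strip a set of candidates from every ballot before
# handing the input to SchulzeSTV. Which candidates to drop is left open.
def prune_candidates(ballots, to_remove):
    pruned = []
    for entry in ballots:
        ballot = {candidate: rank
                  for candidate, rank in entry["ballot"].items()
                  if candidate not in to_remove}
        pruned.append({"count": entry["count"], "ballot": ballot})
    return pruned

# Reproducing the reduced example above:
# reduced = prune_candidates(input, {"D", "E"})
# output = SchulzeSTV.calculate_winner(reduced, 3, "ranking")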
I think this chunk could be extracted:
# Generate the list of patterns we need to complete (requires itertools)
completion_patterns = []
for i in range(0, len(other_candidates)):
    for j in range(0, i + 1):
        # Distinct arrangements of (len(other_candidates) - i) twos, j ones and (i - j) threes;
        # the set() does the real deduplication (groupby only collapses adjacent duplicates)
        completion_patterns.append(list(set(
            pattern[0] for pattern in itertools.groupby(
                itertools.permutations([2] * (len(other_candidates) - i) + [1] * j + [3] * (i - j))
            )
        )))
completion_patterns = [item for innerlist in completion_patterns for item in innerlist]
It runs every time the function does, but always generates the same data.
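Since the generated patterns depend only on len(other_candidates), one option is to hoist the block into a memoized helper, e.g. (a sketch; completion_patterns_for and the cache dict are made-up names, and it assumes callers don't rely on the exact ordering of the flattened list):

import itertools

_completion_pattern_cache = {}

def completion_patterns_for(num_other_candidates):
    # Build the pattern list once per candidate count and reuse it afterwards.
    if num_other_candidates not in _completion_pattern_cache:
        patterns = []
        for i in range(num_other_candidates):
            for j in range(i + 1):
                multiset = [2] * (num_other_candidates - i) + [1] * j + [3] * (i - j)
                # set(permutations(...)) keeps each distinct arrangement exactly once.
                patterns.extend(set(itertools.permutations(multiset)))
        _completion_pattern_cache[num_other_candidates] = patterns
    return _completion_pattern_cache[num_other_candidates]

The function would then call completion_patterns_for(len(other_candidates)) instead of rebuilding the same list on every invocation.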
Testing with
clear; nosetests tests/test_schulze_stv.py; nosetests tests/test_schulze_stv_performance.py -s
to ensure the validity of the results while boosting performance.
Schulze's implementation in C works fine at 10 candidates, 5 winners (under 1 second to process). It chokes on 15 candidates, 7 winners though.
The most recent modification generates unique lexicographic completion patterns without generating all possibilities. 10 candidates / 9 winners runs fine now. 10 candidates / 5 winners runs in 8 seconds. Will continue to look for performance optimizations.
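For reference, the standard way to emit each distinct arrangement of a multiset exactly once is the next-lexicographic-permutation step; a sketch of that idea (not necessarily the exact approach used in __unique_permutations__):

def unique_permutations(elements):
    # Yield every distinct permutation of a multiset exactly once, in
    # lexicographic order, without generating duplicates first.
    current = sorted(elements)
    n = len(current)
    while True:
        yield tuple(current)
        # Find the rightmost position that can still be increased.
        i = n - 2
        while i >= 0 and current[i] >= current[i + 1]:
            i -= 1
        if i < 0:
            return
        # Swap it with the smallest larger element to its right, then
        # reverse the tail to get the next permutation in order.
        j = n - 1
        while current[j] <= current[i]:
            j -= 1
        current[i], current[j] = current[j], current[i]
        current[i + 1:] = reversed(current[i + 1:])

For [2, 2, 1, 3] this yields the 12 distinct orderings directly, whereas itertools.permutations would produce 24 tuples and leave the deduplication to a set.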
Tried shifting the adjacency matrix implementation over to pygraph's max flow algorithm (certainly better from a code-reuse standpoint). 10c5w runs in 24 seconds, roughly triple the previous 8-second runtime, and pretty much all of that time is spent inside their implementation of Edmonds-Karp.
Profiling the code (nosetests --with-profile) confirms that this is the major killer.
3516555 function calls (3216445 primitive calls) in 8.119 CPU seconds

   Ordered by: internal time, call count
   List reduced from 330 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
     9993    4.565    0.000    6.659    0.001  pygraph/algorithms/minmax.py:310(maximum_flow)
   519680    0.596    0.000    0.854    0.000  pygraph/mixins/labeling.py:66(edge_weight)
   316966    0.546    0.000    0.546    0.000  pygraph/classes/digraph.py:59(nodes)
     1843    0.369    0.000    0.386    0.000  schulze_stv.py:128(__proportional_completion_round__)
Reverting back to the capacity matrix implementation and running the profiling yields the following output:
809981 function calls (509871 primitive calls) in 2.932 CPU seconds

   Ordered by: internal time, call count
   List reduced from 332 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
    35777    1.480    0.000    1.480    0.000  schulze_stv.py:268(__bfs__)
     9993    0.346    0.000    1.910    0.000  schulze_stv.py:243(__edmonds_karp__)
   325075/30607  0.342  0.000  0.342  0.000  schulze_stv.py:93(__unique_permutations__)
     1811    0.337    0.000    0.353    0.000  schulze_stv.py:131(__proportional_completion_round__)
Could this just be a matter of the C implementation being faster than the Python one, or is there something algorithmically different in how I've coded it? Hrm...
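For comparison, the capacity-matrix approach profiled above amounts to a plain BFS-based Edmonds-Karp; a minimal sketch of that algorithm (not the repo's actual __edmonds_karp__/__bfs__ code, and assuming an n x n list-of-lists capacity matrix):

from collections import deque

def edmonds_karp(capacity, source, sink):
    n = len(capacity)
    residual = [row[:] for row in capacity]
    max_flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual network.
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:
            return max_flow
        # Find the bottleneck capacity along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while v != source:
            u = parent[v]
            bottleneck = min(bottleneck, residual[u][v])
            v = u
        v = sink
        while v != source:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        max_flow += bottleneck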
The edge-weight cutoff has produced a significantly faster algorithm as we don't need to waste time on max-flow iterations that will simply result in 0. I can't guarantee that this performance tweak will produce the same result in every case, but it passes all current functionality tests. Might want to consider bringing in Tideman's A0X set.
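One conservative form such a cutoff could take, reusing the edmonds_karp sketch above (an assumption about where the check sits, not the exact change): the max flow can never exceed the total capacity leaving the source or the total capacity entering the sink, so a zero on either side means the whole augmenting-path loop can be skipped.

def max_flow_with_cutoff(capacity, source, sink):
    # Cheap upper bounds: total capacity out of the source and into the sink.
    out_of_source = sum(capacity[source])
    into_sink = sum(row[sink] for row in capacity)
    if out_of_source == 0 or into_sink == 0:
        # The max flow is forced to 0, so skip Edmonds-Karp entirely.
        return 0
    return edmonds_karp(capacity, source, sink)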
Slows down to a crawl.