crflynn / voting

Diversity / (dis)proportionality measures, election quotas, and apportionment methods in pure Python.
https://voting.readthedocs.io/en/latest/
MIT License
15 stars 2 forks source link

jefferson and dhondt voting calculation is incorrect #1

Closed john-sandall closed 5 years ago

john-sandall commented 5 years ago

Nice looking package, hugely appreciate the work you've put into this so far. I was just trying it out for calculating some historical EU Parliamentary Election allocations, which all use d'Hondt, and I noticed the calculated numbers weren't accurate.

Specific example

In 2014 UK European Parliamentary election in the East of England we had results as documented by the BBC here. UKIP got 3 MEPs, Conservatives 3, Labour 1.

If we try those vote numbers, we get an incorrect result:

votes = [542812, 446569, 271601, 133331, 108010, 26564, 16497, 12465, 11627, 4870]
seats = 7
apportionment.dhondt(votes, seats)
# returns `[3, 2, 2, 0, 0, 0, 0, 0, 0, 0]`

The problem

I looked at your code for dhondt/jefferson and it looks like it's missing out on the Conservative's third seat which comes out of a second allocation, i.e. you should re-calculate the diffs every time you make an allocation.

The solution

I double checked the maths as well as using an online calculator, and then had a quick look at other implementations and discovered this implementation in the SciencesPo R package. It seemed nice and roughly replicates exactly my process when doing this manually, so I coded up a quick implementation in Python but it uses some numpy to replicate some of the in-built R functionality:

def dhondt(votes, seats):
    parties = np.arange(len(votes))
    dhondt_matrix = np.repeat(parties, seats)
    scores = (np.repeat(votes, seats).reshape(len(votes), seats) / np.arange(1, seats + 1)).ravel()
    dhondt_matrix = dhondt_matrix[(-scores).argsort()][:seats]
    allocated = np.repeat(0, len(votes))
    for i in dhondt_matrix:
        allocated[i] += 1
    return allocated

Next steps

I know you don't have numpy as a dependency for this library yet. If you're happy to add it, I'll send through a PR and update dependencies / tests as appropriate. If not, I can look at adapting your current code using only base Python. As far as I know, all of the above numpy is Python 2 compatible, but I'd test that anyway on submission of a PR.

crflynn commented 5 years ago

@john-sandall nice catch and thanks for pointing this out. I haven't looked at the fix you provided in detail but I will gladly accept a PR that corrects the algorithm.

I would prefer to use pure python here as I think the math in most cases is simple enough and I can't see a reason for this package to be used in any high performance use cases that would warrant including numpy.