Equal-Vote / starpy

Python implementation of the STAR Voting system
https://www.starvoting.org/
BSD 3-Clause "New" or "Revised" License

Floating point arithmetic can result in different election results #23

Open akazukin5151 opened 1 year ago

akazukin5151 commented 1 year ago

TLDR: the 1000th example of why floating point arithmetic is bad, in an excessively long post

Running the same election with starpy (Python) and Equal-Vote's star-core (Javascript) gives different results.

I am using these ballots from the star-core test cases, for STAR-PR:

https://github.com/Equal-Vote/star-core/blob/9a554932e4fc7ae9d6eb295fa1076437c812f5ce/src/Tests/pr.test.js#L105-L122

After spending an hour staring at two different debuggers simultaneously, I think the reason is that pandas in starpy somehow has different floating-point behavior, causing a different candidate to be elected in the first round. This cascades through to the end, and we get a different overall election result.

Steps to reproduce

The ballot data is copied from star-core (JS) tests as linked above

# Paste in everything from https://github.com/Equal-Vote/starpy/blob/main/starpy/Allocated_Score.py here
import pandas as pd  # needed for the DataFrame below
b = [
    [0., 0., 3., 2., 3., 0., 2., 1.],
    [3., 2., 4., 4., 4., 3., 4., 2.],
    [3., 3., 2., 1., 2., 4., 3., 0.],
    [4., 3., 1., 2., 0., 4., 4., 2.],
    [4., 2., 1., 0., 2., 2., 1., 0.],
    [1., 0., 2., 1., 1., 1., 4., 2.],
    [3., 0., 3., 0., 2., 2., 0., 4.],
    [2., 0., 2., 4., 1., 3., 0., 2.],
    [1., 4., 0., 1., 1., 1., 2., 4.],
    [2., 3., 4., 2., 0., 2., 3., 3.],
    [0., 0., 3., 3., 0., 0., 2., 3.],
    [1., 2., 3., 4., 3., 3., 1., 4.],
    [2., 3., 0., 0., 2., 4., 4., 3.],
    [2., 1., 1., 1., 2., 1., 3., 0.],
    [1., 0., 4., 3., 1., 3., 0., 0.],
]

n_cands = 8
n_voters = 15
ballots = pd.DataFrame(b)
n_seats = 3
assert ballots.shape[1] >= n_seats
winners = Allocated_Score(5, n_seats, ballots)
print(winners)

The answer I get is [2, 6, 7]. This is 0-indexed so candidates 3, 7, and 8 are elected. This does not match the JS tests, where candidates 1, 3, and 7 are elected:

https://github.com/Equal-Vote/star-core/blob/9a554932e4fc7ae9d6eb295fa1076437c812f5ce/src/Tests/pr.test.js#L129

What is happening

Let's look at the normalized ballots. Each row is a voter and each column is a candidate; there are 15 voters and 8 candidates in a 3-seat election:

      0    1    2    3    4    5    6    7
0   0.0  0.0  0.6  0.4  0.6  0.0  0.4  0.2
1   0.6  0.4  0.8  0.8  0.8  0.6  0.8  0.4
2   0.6  0.6  0.4  0.2  0.4  0.8  0.6  0.0
3   0.8  0.6  0.2  0.4  0.0  0.8  0.8  0.4
4   0.8  0.4  0.2  0.0  0.4  0.4  0.2  0.0
5   0.2  0.0  0.4  0.2  0.2  0.2  0.8  0.4
6   0.6  0.0  0.6  0.0  0.4  0.4  0.0  0.8
7   0.4  0.0  0.4  0.8  0.2  0.6  0.0  0.4
8   0.2  0.8  0.0  0.2  0.2  0.2  0.4  0.8
9   0.4  0.6  0.8  0.4  0.0  0.4  0.6  0.6
10  0.0  0.0  0.6  0.6  0.0  0.0  0.4  0.6
11  0.2  0.4  0.6  0.8  0.6  0.6  0.2  0.8
12  0.4  0.6  0.0  0.0  0.4  0.8  0.8  0.6
13  0.4  0.2  0.2  0.2  0.4  0.2  0.6  0.0
14  0.2  0.0  0.8  0.6  0.2  0.6  0.0  0.0

The first problem happens in column 2. We are summing to find the candidate with the highest score, so for column 2, we do this: 0.6+0.8+0.4+0.2+0.2+0.4+0.6+0.4+0.0+0.8+0.6+0.6+0.0+0.2+0.8.

https://github.com/Equal-Vote/starpy/blob/77e00d97ba6587b04a6c9aa0d83d7c5de38f4405/starpy/Allocated_Score.py#L21

In JS, this is 6.599999999999999. In my Python debugger and script, it is also 6.599999999999999. However when pandas actually sums it up, it becomes 6.6.

Here's what my debugger session looks like:

> /home/me/star_pr.py(187)Allocated_Score()
-> w = weighted_scores.sum().idxmax()
(Pdb) weighted_scores.sum()
0    5.8
1    4.6
2    6.6
3    5.6
4    4.8
5    6.6
6    6.6
7    6.0
dtype: float64
(Pdb) weighted_scores.sum()[2]
6.6
(Pdb) type(weighted_scores.sum()[2])
<class 'numpy.float64'>
(Pdb) float(weighted_scores.sum()[2])
6.6
(Pdb) type(float(weighted_scores.sum()[2]))
<class 'float'>

Hence it is not an artifact of pandas/numpy rounding for printing.

Compare the sums with the sums in JS (emphasis mine):

[
    5.800000000000002,
    4.6,
    6.599999999999999,  // <--- #2
    5.6000000000000005,
    4.800000000000001,
    6.6,  // <--- #5
    6.6000000000000005,  // <--- #6
    5.999999999999999
]

(Setting a breakpoint here: https://github.com/Equal-Vote/star-core/blob/9a554932e4fc7ae9d6eb295fa1076437c812f5ce/src/StarResults.js#L442)

For comparison, plain Python sums column 5 to 6.6 and column 6 to 6.6000000000000005, so this is a problem with pandas/numpy.
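The discrepancy can be reproduced outside the election code entirely. Here is a minimal sketch using column 2 of the normalized ballots above. (A caveat: the exact values you see may depend on your pandas/numpy versions and whether bottleneck is installed, since the summation algorithm differs between them; numpy documents that it often uses pairwise summation, which groups the additions differently from a left-to-right loop, and that alone is enough to land on a different, here coincidentally more accurate, float.)

```python
import pandas as pd

# Column 2 of the normalized ballots shown above
col2 = [0.6, 0.8, 0.4, 0.2, 0.2, 0.4, 0.6, 0.4,
        0.0, 0.8, 0.6, 0.6, 0.0, 0.2, 0.8]

naive = sum(col2)                       # plain left-to-right Python summation
pd_sum = float(pd.Series(col2).sum())   # pandas summation

print(repr(naive), repr(pd_sum))

# Both are within one ulp of 6.6, but they need not be the same float,
# because the order and grouping of the additions differ
assert abs(naive - 6.6) < 1e-9 and abs(pd_sum - 6.6) < 1e-9

# The classic one-liner behind all of this:
assert 0.1 + 0.2 != 0.3
```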

The consequence is that indices 2, 5, and 6 get the same value of 6.6 in Python, but in JS they are all different. In JS, the highest score belongs to candidate 6, but since there is a three-way tie in Python, it arbitrarily selects index 2.

Python elects index 2 then 6; JS elects index 6 then 2. By the final seat the error has cascaded too far: the out-of-order election resulted in different re-weightings. See the debugging values:

Value            Candidate   JS    Python
split_point      7           0.8   0.6
spent_above      7           0     3.0
weight_on_split  7           4     2.0
split_point      3           0.6   0.6
spent_above      3           2.0   3.0
weight_on_split  3           4.0   4

(transcribed from the two debugger sessions; hopefully correctly)

The sum of scores at the end of the 2nd round is completely different, causing Python to elect candidate 8 and JS to elect candidate 1.

What can we do about this?

Some projects, such as tallystick, offer the ability to use exact representations: rational fractions or fixed point decimals. (For example, Scottish councils use fixed point with 5 decimal places for STV.)
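For illustration, fixed point arithmetic with Python's standard `decimal` module might look like the sketch below, assuming 5-decimal-place quantization in the spirit of the Scottish STV rules (the truncating rounding mode is my reading of those rules; the helper name is made up):

```python
from decimal import Decimal, ROUND_DOWN

FIVE_DP = Decimal("0.00001")

def fix5(x):
    # Quantize to 5 decimal places, truncating the rest
    return x.quantize(FIVE_DP, rounding=ROUND_DOWN)

# e.g. a transfer weight of 1/3 becomes a reproducible fixed point value
weight = fix5(Decimal(1) / Decimal(3))
print(weight)  # 0.33333
```

Because every implementation truncates to the same 5 digits at the same points, two correct implementations can no longer drift apart the way the float sums above did.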

I'm not engaged in the STAR project, but if the code is supposed to be a "gold standard", it might be worth showing that it is rigorous and ready for production, real-world use.

I know it's not easy to change something so fundamental, and I totally understand, because I haven't bothered with exact decimals in my own project either. But at least people can see this issue and evaluate it for their own needs. I might not need this issue to be fixed, but hopefully it will save time for someone else who does.

(I stumbled onto this issue because I was going to use the JS tests to validate my own implementation. This has implications for #2 and #9, as the JS tests can't simply be copied to Python.)

Misc

This issue would fit either this repo or the star-core repo, but I think this repo is the most appropriate, because only pandas shows this behavior.

See also

penyuan commented 1 year ago

I really like how comprehensive your post is, @akazukin5151! Thank you! 👍🏼

I've used numpy before, but don't have the deep technical knowledge needed to fix the issue you described, hope others can chime in.

That said, I think a key, basic step we should take is to be upfront about this (possibly) rare issue in the documentation. Speaking of which, the README should probably be expanded into better documentation for this reference code. 😄

akazukin5151 commented 1 year ago

I don't think it's fixable in numpy. I did some searches for "numpy exact decimal", "exact fraction", and "exact rational" and found nothing relevant. Python does have decimal and fraction types in the standard library, but you lose the speed and terseness of numpy.

This is indeed a big change, so I agree that we should at least document it.

penyuan commented 1 year ago

Thanks, @akazukin5151 I think issues like this are particularly important for reference implementations like this repository.

Would you (and others) consider this to be a rare, but critical issue? Or might it happen quite often?

Yeah we probably can't directly fix this in numpy. But could there be a workaround for just this part using Python's internal functions? Would using Python itself (instead of an external library) make any difference?

akazukin5151 commented 1 year ago

I think it is not rare, but also not that critical. I have no evidence, but considering that you can observe floating point behavior with 0.1 + 0.2 = 0.30000000000000004, I suspect it is not rare. Averaged across many situations it is "probably good enough", though it depends on your use case: actual governments using this will need all edge cases covered, but not everyone is a government. This example only surfaced because there was a tie, which should be rare in the real world; yet all it takes is a single tie for mistakes to cascade. Electing the wrong candidate does sound pretty serious to me. After all, the point of investing in alternative electoral systems is better accuracy.

My understanding of floating point arithmetic is that if you really need exact guarantees, you cannot rely on floating point [1]. So there's no point in switching to "a different float"; it just makes different mistakes.

The decimal and fractions modules are in the Python standard library ("internal functions"/"using Python itself"), and they would fix this. But using plain Python floats instead of numpy floats won't make a difference.
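As a sketch of what the `fractions` route buys: the raw scores are small integers and normalization divides by the maximum score, so every quantity is an exact rational. Using candidate 2's raw scores from the ballots above:

```python
from fractions import Fraction

# Raw scores each voter gave candidate 2 (column 2 of the ballots above)
raw_col2 = [3, 4, 2, 1, 1, 2, 3, 2, 0, 4, 3, 3, 0, 1, 4]

# Normalize by the maximum score (5) with exact rationals instead of floats
normalized = [Fraction(s, 5) for s in raw_col2]

total = sum(normalized)
print(total)  # 33/5, i.e. exactly 6.6, with no rounding anywhere
```

With exact rationals, "tied" means genuinely tied, and two independent implementations must agree.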

Reading my post again, I think pandas/numpy is actually "more right" by arriving at 6.6 instead of 6.599999999999999. But ultimately the ideal solution is to use fixed or arbitrary precision arithmetic in both Python and JS.

At least Python has them in the standard library; JS would need an external library like decimal.js. On the other hand, the JS code is written with loops, so I think it would mostly be a drop-in replacement. Fixing it in Python is harder because the code is written in a vectorized pandas style, which would have to be rewritten as loops to use the decimal type.

I'm surprised by how much I've written. The code is not that long (a simple implementation was one of the factors in choosing STAR, after all), yet it's possible to spend a week discussing something that could be fixed in a few days. (I don't have time right now, but I might look at it later.)

[1] https://softwareengineering.stackexchange.com/questions/382736/do-i-still-need-to-use-fixed-point-to-guarantee-computers-get-the-same-result-fo

akazukin5151 commented 1 year ago

Voter satisfaction efficiency (VSE) appears to use floats as well, I wonder how much that affects results. Hopefully not significant enough!

endolith commented 1 year ago

First, floats are inherently imprecise and should not be used in applications where values need to be tested for equality, which is exactly what ties are. The scores, number of candidates, number of winners, and number of voters are all integers, so this algorithm is inherently based on rational numbers and ideally should calculate with them.

Second, the line:

        #Select winner
        w = weighted_scores.sum().idxmax()

doesn't have any way to deal with ties. If there are multiple candidates with the same score at this point, it will always pick the first one in the list(?), which could lead to a systemic bias if the candidates are arranged alphabetically. How should ties be handled?

If I print the results of each round, it looks like these are actually tied, even in the floating-point representation:

Weighted scores:
0    5.80000000000000159872
1    4.59999999999999964473
2    6.59999999999999964473 [tied]
3    5.59999999999999964473
4    4.80000000000000071054
5    6.59999999999999964473 [tied]
6    6.59999999999999964473 [tied]
7    5.99999999999999911182
dtype: object

Winner: 2

Weighted scores:
0    4.20000000000000017764
1    3.40000000000000079936
3    2.89999999999999991118
4    2.99999999999999955591
5    4.50000000000000000000
6    4.69999999999999928946 [highest, no ties]
7    3.79999999999999982236
dtype: object

Winner: 6

Weighted scores:
0    1.80000000000000004441
1    1.40000000000000013323
3    1.89999999999999991118
4    1.60000000000000008882
5    1.70000000000000017764
7    2.39999999999999991118 [highest, no ties]
dtype: object

Winner: 7
[2, 6, 7]

If I convert the code to use Python Fraction objects, the results are exact, but the tie still results in the same outcome:

Weighted scores 
0    29/5
1    23/5
2    33/5 [tied]
3    28/5
4    24/5
5    33/5 [tied]
6    33/5 [tied]
7       6
dtype: object

Winner: 2

Weighted scores 
0     21/5 = 4.20
1     17/5 = 3.40
3    29/10 = 2.90
4        3
5      9/2 = 4.50
6    47/10 = 4.70 [highest, no ties]
7     19/5 = 3.80
dtype: object

Winner: 6

Weighted scores 
0      9/5 = 1.80
1      7/5 = 1.40
3    19/10 = 1.90
4      8/5 = 1.60
5    17/10 = 1.70
7     12/5 = 2.40 [highest, no ties]
dtype: object

Winner: 7
[2, 6, 7]

But I think both issues need to be dealt with. Ties need to be handled somehow, and floats should not be used.

endolith commented 1 year ago

@akazukin5151

In JS, this is 6.599999999999999. In my Python debugger and script, it is also 6.599999999999999. However when pandas actually sums it up, it becomes 6.6.

Here's what my debugger session looks like:

> /home/me/star_pr.py(187)Allocated_Score()
-> w = weighted_scores.sum().idxmax()
(Pdb) weighted_scores.sum()
0    5.8
1    4.6
2    6.6
…

It's not really 6.6; it's actually 6.599999999999999644728632119950… It's just rounded for printing as a convenience (because floats are a "close enough" data type). If you need to print it more precisely, though, you can add digits:

print("\nWeighted scores:\n", weighted_scores.sum().map('{:.20f}'.format))
mikefranze commented 1 year ago

If there are multiple candidates with the same score at this point, it will always pick the first one in the list(?), which could lead to a systemic bias if the candidates are arranged alphabetically. How should ties be handled?

This hasn't been added to the python implementation yet, but how it's done in the JS version is we make a list of all candidates with the scores equal to the max score and pick randomly from that. Currently we don't have a better ties protocol for this method.

mikefranze commented 1 year ago

If I convert the code to use Python Fraction objects, the results are exact, but the tie still results in the same outcome:

Interesting, didn't know that existed. It looks like there's a Fraction library for JS too

endolith commented 1 year ago

If there are multiple candidates with the same score at this point, it will always pick the first one in the list(?), which could lead to a systemic bias if the candidates are arranged alphabetically. How should ties be handled?

This hasn't been added to the python implementation yet, but how it's done in the JS version is we make a list of all candidates with the scores equal to the max score and pick randomly from that. Currently we don't have a better ties protocol for this method.

Ok. In my single winner implementation I have three options:

I would suggest doing it the second way: Specify that the list of candidates be randomized before tabulating the election, and then favor the first one in the list in case of ties, so it's deterministic but fair?
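A minimal sketch of that second option (all names below are mine, not from either codebase): shuffle the candidate list once with a recorded seed before tabulation, then always break ties in favor of whichever tied candidate appears earliest in that shuffled order:

```python
import random

def make_tiebreak_order(candidates, seed):
    # One up-front shuffle; recording the seed makes the election reproducible
    order = list(candidates)
    random.Random(seed).shuffle(order)
    return order

def break_tie(tied, order):
    # Deterministic: earliest position in the pre-shuffled order wins
    return min(tied, key=order.index)

order = make_tiebreak_order(range(8), seed=20230601)
winner = break_tie([2, 5, 6], order)  # the three-way tie from this issue
```

Anyone given the seed (or the published shuffled list) can re-run the tabulation and get the identical result, which a per-tie random choice does not allow.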

endolith commented 1 year ago

Here I've got the version with Fraction objects: https://github.com/Equal-Vote/starpy/compare/main...endolith:starpy:I_got_98.99999_problems

tim-one commented 1 year ago

Fraction objects are the way to go. Then, e.g., you could add the very valuable (for both internal documentation purposes and for catching logic errors) assertion that, at the start of a round the sum of the ballot weights is exactly equal to the quota times the number of winners remaining to be picked.

IMO it would be good to get away from numpy/pandas here regardless. Their use greatly reduces the audience able to read the code, and buys approximately nothing worth having: the code is not complex and the computational expense is minor. Simple pure-Python loops are more than up to the task of specifying the exact computation clearly.

endolith commented 1 year ago

Using Fraction objects removes the vectorization anyway.

Though the ballots are inherently a 2D object which pure Python doesn't really support, so removing numpy might actually make it less readable?

larryhastings commented 1 year ago

I agree with tim-one.

I used Fraction objects in my implementation:

https://github.com/larryhastings/starvote

It does slow things down a bit; I repurposed one of the bigger test elections I could download from https://star.vote/ to test Bloc STAR and Allocated Score, and there's a perceptible pause before it produces results. I don't think anybody would actually care about the performance impact, especially if you tell them "it's a little slower because it's 100% accurate".

tim-one commented 1 year ago

Though the ballots are inherently a 2D object which pure Python doesn't really support, so removing numpy might actually make it less readable?

A ballot is a mapping of candidates to scores, which is very clearly expressed as a Python dict. Then, e.g., there's no added obfuscation from needing to remember which axis means "candidates" and which "scores". But rather than a bare dict, it's clearer still to use Ballot objects, which contain a candidate-to-score dict. Then also have a score(self, cand) method, which applies the ballot's current weight (a member of the Ballot object) to the dict's score. Sticking to that, it's also impossible to "forget" to use weighted scores instead of raw scores.

That's a bit slower than 2D float array indexing too, but IMO also clearer on all counts. BTW, note that a dict can be indexed directly by a candidate's name (as a string) - no need to map candidates to "little integers", another little layer of obfuscation needed when working with arrays.
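A minimal sketch of that Ballot design, assuming Fraction arithmetic (the attribute names here are illustrative, not taken from any of the implementations discussed):

```python
from fractions import Fraction

class Ballot:
    """A candidate-to-score dict plus the ballot's current weight."""

    def __init__(self, scores, weight=Fraction(1)):
        self.raw = dict(scores)  # never mutated after construction
        self.weight = weight     # reduced as the ballot is "spent"

    def score(self, cand):
        # Always weighted, so callers can't accidentally use raw scores
        return self.weight * self.raw.get(cand, Fraction(0))

b = Ballot({"Alice": Fraction(4, 5), "Bob": Fraction(2, 5)})
b.weight = Fraction(1, 2)  # e.g. after a reweighting round
assert b.score("Alice") == Fraction(2, 5)
assert b.score("Carol") == Fraction(0)  # unscored candidates count as zero
```

Because the dict is never mutated, a million ballots with identical scores could even share one dict object; only the per-ballot weight changes.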

larryhastings commented 1 year ago

[...] it's clearer still to use Ballot objects, which [...] also have a score(self, cand) method

FWIW, I didn't use Ballot objects in my starvote library. I didn't want to inflict a custom object on my users, and I felt like I didn't need the abstraction internally.

After all, the score method described above is a one-liner:

return self.ballot_dict.get(cand, 0)

And I have functions to compute scores on iterables of dicts. So I don't have that many places that directly access a score. I just manually operate on the dict and hand-inline the get method call as needed.

But I would say, either operate 100% of the time on dict objects, or operate 100% of the time on custom Ballot objects. Consistency is important--don't make your tabulation code handle both.

p.s.

it's clearer still to use Ballot objects, which contain a candidate-to-score dict

I'd actually opt for a Ballot that inherited from dict. The only difference I can think of between a Ballot object and a dict is the addition of the score method. I don't prefer "is-a" over "has-a" often but in this case it seems reasonable.

tim-one commented 1 year ago

After all, the score method described above is a one-liner:

return self.ballot_dict.get(cand, 0)

I don't understand how that can work unless you're also mutating the dict to keep on reducing the scores by the ballot's reweighting factor. My Ballot.score() method:

def score(self, cand):
    assert cand in candset
    return self.weight * self.raw.get(cand, _zero)

where _zero is a private module constant for Fraction(0). The dict (raw) is never mutated after construction, so, e.g., it's fine if a million ballots with the same scores all use the same dict object.

I'd actually opt for a Ballot that inherited from dict

Why? I'm aiming to maximize clarity, not cleverness :wink:. My Ballot objects also, e.g., have a merit(self, winners) method to score a ballot against a candidate set of winners, for use in a non-sequential RRV election method.

Ballots are fundamental to all election schemes - modeling them with a Ballot object is just screamingly natural to me.

endolith commented 1 year ago

BTW, note that a dict can be indexed directly by a candidate's name (as a string) - no need to map candidates to "little integers", another little layer of obfuscation needed when working with arrays.

For deterministic replication in case of ties, I think pre-randomizing the list of candidates and assigning integers to them would be more reliable/trustworthy. Especially since names can have special characters, multiple candidates can have the same name, etc. Python has ordered dicts now, so we could count on that ordering for the tiebreaker(?), but I think explicit numbering is more clear.

I'm aiming to maximize clarity, not cleverness 😉.

Inheriting from a conceptually-similar established type is good for both clarity and minimizing bugs, no?

Sort of related:

https://electowiki.org/wiki/Talk:Allocated_Score#Frustrating

What the page desperately needs is an example. Say, a simple fake election, with 3 to 5 candidates and 1000 voters (or scores adding up to 1000 maybe?), and going step by step through each round, showing the entire process.

Sadly, it doesn't have that. It has Python code, which I for one find totally unhelpful because it involves library functions and I don't know what they do. And anyway, one shouldn't have to read code to understand an electoral system. Not everybody can read code.

larryhastings commented 1 year ago

I don't understand how that can work unless you're also mutating the dict to keep on reducing the scores by the ballot's reweighting factor.

For electoral systems with weighting I internally create a second dict and keep the two paired with each other. I often need other random bits of per-ballot data, e.g. a weight; for RRV, a total of maximum_score + score_contributed_so_far; for SSS, the number of stars the ballot has yet to spend. Rather than festooning a Ballot object with all that, I lash those disparate objects together into a small array, the contents of which are ad hoc depending on the needs of the election system. This all seemed fine and was okay to work with, but if I got a PR "cleaning up" this approach in favor of a custom object or something, I suppose I'd consider it.

For deterministic replication in case of ties, I think pre-randomizing the list of candidates and assigning integers to them would be more reliable/trustworthy.

I don't think the integers thing is necessary, just pass in a list of candidates. My starvote library supports that; if you pass in a pre-randomized list of candidates to the predefined_permutation_tiebreaker, the election is 100% deterministic.

multiple candidates can have the same name

I had a "register candidate" step in an early prototype of my library but I abandoned it. I felt like a lot of that sort of data validation should live external to my tabulation code--the code compiling the ballots should do its own validation. And removing it made my library easier to use.

FWIW I'd handle that by creating a Candidate object, which contained both the name and a serial number, and make their __hash__ and __str__ incorporate both those values. Then I'd use those Candidate objects in the ballots.

tim-one commented 1 year ago

think pre-randomizing the list of candidates and assigning integers to them would be more reliable/trustworthy. Especially since names can have special characters, multiple candidates can have the same

I don't want to get too deep in largely irrelevant weeds here. Nobody is going to run an election with exactly the same names for multiple candidates. They're going to add some marking to distinguish them. Fine. A Ballot object can use the same markings in its string names. "Special characters" are also irrelevant - Python strings can hold anything Unicode can express.

This is my find-winner-and-break-ties code:

            c2s = {c : sum(b.score(c) for b in self.ballots)
                   for c in avail}
            highest = max(c2s.values())
            winners = [c for c, s in c2s.items() if s == highest]
            if len(winners) > 1:
                print("tied", winners)
            winner = min(winners, key=lambda c: cands.index(c))
            print("winner", winner)

cands is a list of all candidates. If there's a tie for highest score, the winner is whoever appears first in that list. It's up to the user to supply that list, in whatever order they want.

If they want, e.g., an election official to predetermine a preference order for ties, fine. If they want to use random.shuffle(), fine too. Or, in case of ties, they could temporarily pause the election, wait until the next business day, and seed a cryptographic hash function with the open, high, and low values reached by SPY (the most heavily traded security in the world) during the first trading hour. That can then be used in various ways to assign a unique totally ordered value, but wholly unpredictable in advance, to each candidate. Etc. :wink:.

Inheriting from a conceptually-similar established type is good for both clarity and minimizing bugs, no?

Depends on the context. In this context I emphatically do not want the scoring dicts to mutate after creation. Inherit from dict, and all the ways of spelling "mutate this dict" come along for the ride (unless explicitly overridden in the subclass). Yes, a ballot has a dict, but it also has a weight, and only weighted scores are of any use. Stick to using the score() method and it's not possible to make a mistake.

akazukin5151 commented 1 year ago

Seems like we've already decided on removing numpy (and pandas too?), but I'll toss in my two cents as well. There's always one good reason to remove a dependency in Python: it's one less thing for beginners to trip over, given the horrible dependency management in the ecosystem.


Sadly, it doesn't have that. It has Python code, which I for one find totally unhelpful because it involves library functions and I don't know what they do. And anyway, one shouldn't have to read code to understand an electoral system. Not everybody can read code.

I had this problem too: I had to run the code in a debugger to understand what pandas was doing, because it's really terse and the pandas docs are relatively sparse on examples. Terser code might make the method look simpler, but all it does is make it harder to understand.


The discussion about Ballot objects seems a bit off topic but I'll comment as well. I don't use Python for my personal needs, so I don't have a horse in this race. My general advice is to avoid inheritance in favour of composition. I would favour having a Ballot class/struct/record over a plain dict. But that's also because in the language I use, structs are "just data" with no overhead after compilation (not even heap allocation overhead - it's stored on the stack). It also gives type safety at compile time (ensuring an ill-formed dict cannot be passed in), but that involves boilerplate in Python.

Nevertheless, a class/struct signals intent that some data (and behaviour) is semantically grouped together. I think that's a net benefit as long as you aren't making a spaghetti mess.

tim-one commented 1 year ago

I would favour having a Ballot class/struct/record over a plain dict.

Because these are weighted ballots, a plain dict is useless unless you're endlessly mutating the dict's values to reflect the latest reweighting. So you either maintain a list of ballots and a list of corresponding weights "in parallel but implicitly", or you do the sane :wink: thing and recognize up front that, no, a ballot is not just another name for a dict. It's more than a dict.

To me, it makes no more sense to make a ballot a subclass of dict than to make it a subclass of Fraction (because it's equally as much "a rational" (weight) as it is "a dict").

Keep it simple and obvious. Larry said he "didn't want to inflict a custom object on my users", but nothing says the class needs to be exposed to users at all. As a developer, it's handy for me to construct example elections by creating Ballots by hand, but users will almost certainly read ballot data out of some kind of structured text file, and have no need to know anything about how the code stores them.

Terser code might make the method look simpler, but all it does is to make it harder to understand

Agreed, but with a qualification: it depends a lot on background. For example, one of the all-time champs for brevity is APL, which makes Pandas look grossly verbose. Actual APL experts find this brevity helpful, but APL code is flatly incomprehensible to everyone else, which is by far the bulk of the world.

It's not necessarily the case that the terser code is simpler, either. For example, to locate scores at the cusp, the pandas code does a running cumulative sum. But after sorting the ballots in order of the scores they give to the winner, it's simpler to iterate:

  1. If the list is empty, quit. Else look at the rightmost score. That's the highest (by definition of "sorted").
  2. Use very efficient binary search (Python's bisect.bisect_left()) to find the leftmost ballot with the same score.
  3. All the ballots between these (including the endpoints) are treated exactly alike. If the sum of their weights is less than or equal to the quota remaining, physically throw away all of them (constant-time in plain old Python to delete a suffix of a list). Then subtract the sum of their weights from the quota remaining, and go back to step 1. Else reweight them and quit.

That's "wordier" for sure, but each step is simple to understand on its own, and quite explicit. As a bonus, the more seats there are to fill, the more likely mounds of ballots will be thrown away in step 3, making the code faster and faster as the number of ballots remaining to look at decreases (whereas the Pandas code spends an ever-higher percentage of its time computing useless 0 scores for "conceptually - but not actually - discarded" ballots).
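Steps 1-3 can be sketched as follows, under assumed data shapes: the round's ballots are a list of `(score, weight)` pairs already sorted ascending by the score given to that round's winner (all names here are illustrative):

```python
from bisect import bisect_left
from fractions import Fraction

def spend_surplus(ballots, quota):
    """Spend ballot weight against the quota, highest scores first.

    Mutates `ballots` (a list of (score, weight) pairs sorted ascending
    by score) and returns it.
    """
    remaining = quota
    while ballots and remaining > 0:
        top = ballots[-1][0]            # step 1: highest score is rightmost
        scores = [s for s, _ in ballots]
        lo = bisect_left(scores, top)   # step 2: leftmost ballot with that score
        group_weight = sum(w for _, w in ballots[lo:])
        if group_weight <= remaining:   # step 3a: spend the whole group
            del ballots[lo:]            # cheap suffix deletion
            remaining -= group_weight
        else:                           # step 3b: reweight the group and stop
            keep = 1 - remaining / group_weight
            ballots[lo:] = [(s, w * keep) for s, w in ballots[lo:]]
            break
    return ballots

ballots = [(Fraction(1), Fraction(1)),
           (Fraction(2), Fraction(1)),
           (Fraction(2), Fraction(1))]
spend_surplus(ballots, Fraction(1))
# the two top-scoring ballots each keep half their weight
```

Each suffix deletion permanently shrinks the list, which is where the speedup over recomputing zeroed-out scores for every ballot comes from.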

akazukin5151 commented 1 year ago

Fixed by #303, closing

Edit: wait, that was for a different repo...whatever I'll use that

endolith commented 1 year ago

So this isn't fixed in starpy?

mikefranze commented 1 year ago

Sorry, yeah, this was fixed in the TypeScript implementation; I haven't gotten around to the Python one yet.

masiarek commented 1 year ago

I think the issue was closed by mistake, hence I am re-opening it.

See the "fractions" solution at https://github.com/larryhastings/starvote.

On a separate note (maybe I should open another issue): Larry's solution is much superior to our current one, and not only because of the fractions. I propose we request Larry's permission to refresh our starpy solution using his code as a foundation.