skidooesy / stv

Automatically exported from code.google.com/p/stv

Eliminate limit of 255 candidates for BLT files #16

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Currently, candidate numbers are stored as bytes to conserve memory for large
elections. I will keep this limit for TXT files but will remove it for BLT
files. If the number of candidates is greater than 255, then ints will be used
instead of bytes for storing candidate numbers.

Original issue reported on code.google.com by jeff.oneill on 6 Jul 2009 at 1:15

GoogleCodeExporter commented 9 years ago
Why should the # of candidates differ for text vs. BLT files?  Is the
description still current, given the separation of Ballots objects from
LoaderPlugin's?

Original comment by dan.keshet@gmail.com on 12 Jul 2009 at 5:04

GoogleCodeExporter commented 9 years ago
Yes, I still think the limit should apply to text files and not to BLT files.

The limit should rarely be an issue as 255 is a huge number of candidates for
most elections.  If you have a big election, you are better off with a BLT
file.  I think of text files as being more for people experimenting with
various voting methods who want a quick way to draft a ballot file.

For text files, you also don't know the number of candidates until you have
processed the entire file.  You'd have to process the entire file twice to
allow this as an option.  I don't think it is worth the effort to provide that
functionality.
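
To illustrate the two-pass cost described above, here is a minimal sketch. It
assumes a text format with one ballot per line and candidate names separated by
whitespace; `load_text_ballots` is a hypothetical helper, not part of the
project.

```python
from array import array

def load_text_ballots(lines):
    """Two-pass load: the typecode can only be chosen after every name is seen."""
    # Pass 1: number each distinct candidate name in order of first appearance.
    names = {}
    for line in lines:
        for name in line.split():
            names.setdefault(name, len(names))
    # Same threshold as discussed in this thread: ints only above 255 candidates.
    typecode = "I" if len(names) > 255 else "B"
    # Pass 2: build each ballot as a compact array of candidate numbers.
    ballots = [array(typecode, [names[n] for n in line.split()]) for line in lines]
    return names, ballots

names, ballots = load_text_ballots(["A B C", "B A"])
```

The input has to be held in memory or re-read from disk, because it is
traversed twice.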

In case the original post was not clear, I think the default for BLT files
should be to use bytes for candidate numbers, but if the first line of the
file indicates more than 255 candidates, then ints should be used instead.
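
A rough sketch of that rule, assuming the first line of a BLT file gives the
number of candidates followed by the number of seats; `typecode_for_blt` is a
hypothetical name used only for illustration.

```python
from array import array

def typecode_for_blt(first_line):
    """Pick the array typecode from a BLT header line such as '4 2'."""
    # Assumption: the header is "<number of candidates> <number of seats>".
    n_cand = int(first_line.split()[0])
    # Bytes by default; unsigned ints only when there are more than 255 candidates.
    return "I" if n_cand > 255 else "B"

# A ballot for a 300-candidate election needs the wider typecode.
ballot = array(typecode_for_blt("300 5"), [0, 299, 17])
```

Because the header comes first, the choice is made once, before any ballot is
read.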

Let me know if you think otherwise.

Original comment by jeff.oneill on 12 Jul 2009 at 4:41

GoogleCodeExporter commented 9 years ago
Jonathan suggested on-list that this might all be premature optimization; if
nobody has a problem with the amount of memory, then why impose any limit?
I'm not really sure; I haven't tried large datasets enough to know who's using
this that might be harmed by poor memory usage.

If we used this list to do what you say, text files with >255 candidates could
just have a mix of byte arrays and int arrays:

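  from array import array

  def createBallot(self, orderedCandidateNumbers):
    # Use unsigned ints only when the candidate count exceeds what a byte holds.
    if self.nCand > 255:
      b = array("I")
    else:
      b = array("B")
    for candidate in orderedCandidateNumbers:
      # Reject numbers outside the valid range 0 .. nCand - 1.
      if candidate < 0 or candidate > self.nCand - 1:
        raise RuntimeError("No candidate with number %d" % candidate)
      b.append(candidate)
    return b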

Original comment by dan.keshet@gmail.com on 12 Jul 2009 at 5:01

GoogleCodeExporter commented 9 years ago
I believe that in addition to conserving memory, it will also speed up the
program.  A 25 MB program should run faster than a 100 MB one.  Back when I
changed the storage from ints to bytes, I recall a big improvement.  It would
be easy to run such a comparison again.
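
One possible starting point for such a comparison is sketched below. The
ballot count and ballot length are made-up values, and it measures only the
stored candidate numbers, not per-object overhead or run time.

```python
from array import array

# Hypothetical workload: 10,000 ballots ranking 10 candidates each.
n_ballots, ballot_len = 10000, 10
as_bytes = [array("B", range(ballot_len)) for _ in range(n_ballots)]
as_ints = [array("I", range(ballot_len)) for _ in range(n_ballots)]

def payload_bytes(ballots):
    # Bytes used by the stored candidate numbers only (excludes object overhead).
    return sum(len(b) * b.itemsize for b in ballots)

print(payload_bytes(as_bytes), payload_bytes(as_ints))
```

The size of an "I" item is platform dependent, and each array object carries
fixed overhead, so the real difference in total memory will be smaller than
the ratio of the two printed numbers.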

Even if it is a relatively small improvement in size and speed, it seems that
it is a very self-contained and easy to implement feature (at least as
suggested by Dan above) so that it would be worthwhile.

Original comment by jeff.oneill on 12 Jul 2009 at 5:55

GoogleCodeExporter commented 9 years ago

Original comment by jeff.oneill on 20 Aug 2009 at 9:19