electorama / abif

The _Aggregated Ballot Information Format_ provides a concise, aggregated, text-based document to describe the ballots cast in range-based or ranked elections, as well as approval-based and choose-one balloting systems.

Should ABIF be text or serialized? #4

Closed brainbuz closed 3 years ago

brainbuz commented 3 years ago

Robla's cases proposed on electowiki are in a text format.

While this style of format may be more comfortable for human reading and hand editing, a serialized (JSON/YAML) format is easier for programmers to implement, because these formats load directly into data structures using tools available in every programming language. YAML and pretty-printed JSON can be as readable as the text format.

The downside is that different structures suit different ballot types, which makes the specification larger. Standard RCV is easily represented with an array, while range voting needs key-value pairs. RCV can also be expressed with key-value pairs by inverting the ordering into rank numbers (best is 1, and so on), which would support equal ranking in RCV as well.
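
To make the array versus key-value trade-off concrete, here is a minimal Python sketch. The helper name `ranking_to_ranks` and the sample ballots are made up for illustration and are not part of any ABIF specification.

```python
# Illustrative only: names and sample ballots are hypothetical.

def ranking_to_ranks(ranking):
    """Invert an ordered list of candidates into {candidate: rank}, best = 1."""
    return {candidate: position
            for position, candidate in enumerate(ranking, start=1)}

# A strict ranked ballot as an array: the order carries the preference.
strict_ballot = ["A", "B", "C"]

# The same ballot as key-value pairs (lower number = more preferred).
as_ranks = ranking_to_ranks(strict_ballot)

# Key-value pairs can also express a tie, which a flat array of names cannot.
tied_ballot = {"A": 1, "B": 1, "C": 2}

# A range (score) ballot has the same shape, but higher numbers are better.
range_ballot = {"A": 5, "B": 3, "C": 0}
```

The same dictionary shape serves both ballot types, but the meaning of the numbers flips between ranks and scores, so a spec built this way would still need to say which one a file contains.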

Another option is to define both a text and a serialized version of the format within the spec. The files could be differentiated either by file extension or by testing the first line: YAML files should begin with ---, JSON would begin with {, and the text format can be specified to begin with ABIF as the first line. Parsers would be required to check the first line.
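
A first-line check along those lines might look like the following Python sketch. The function name `sniff_format` and its return labels are hypothetical, and the `ABIF` marker line is only the convention proposed in this comment, not something the format is known to require.

```python
def sniff_format(text):
    """Guess the serialization from the first non-empty line of `text`."""
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip leading blank lines
        if stripped.startswith("---"):
            return "yaml"   # YAML document-start marker
        if stripped.startswith("{"):
            return "json"   # JSON object
        if stripped.startswith("ABIF"):
            return "text"   # proposed (hypothetical) text-format marker
        return "unknown"    # first real line matched none of the above
    return "unknown"        # empty input
```

One caveat: the `---` marker is optional in YAML, so a valid YAML file without it would come back as `"unknown"` here unless the spec made the marker mandatory.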

The dual spec would allow users to decide which they prefer. It would also allow programmers on all platforms to take advantage of external format converters to bring the data from the text format into a serialized format that they can load without writing an importer.

robla commented 3 years ago

I'm tempted to repeat much of what I wrote in response to /u/paretoman's comment back on May 28. In short, I don't believe that internal-software data model interoperability is as useful as a text interoperability format. The format should be interoperable both with many pieces of software, and with many human brains interacting with the raw text, and we might accidentally make it harder for software developers by micromanaging the data representation that they use with the format.

There have been many YAML-based and JSON-based formats over the past decades. I used to believe that a JSON-based format would foster interoperability (per my work on Electowidget), but I've come to realize from decades of hard-won experience that people who understand electoral systems can write text to describe electoral scenarios (and have been doing so on the election-methods mailing list for over 25 years), but that converting their thoughts into JSON or YAML or some other serialization format is obnoxiously difficult. I believe that the center-squeeze effect exhibited by some electoral tallying methods is simply expressed by the following ABIF:

35: A>B>C
25: B>A>C
40: C>B>A

Rather than asking people who understand game theory to also understand how to write valid JSON or YAML, it seems easier to teach programmers how to write a few basic regular expressions, and to ensure that the ABIF format doesn't have a lot of complicated one-off requirements that make parsing ABIF too difficult.
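
As a rough illustration of the "few basic regular expressions" idea, the Python sketch below parses lines of the shape shown above (`count: A>B>C`). It is not a full ABIF parser: the pattern `BALLOT_LINE` and the function `parse_ballot_lines` are hypothetical names, and any line that does not match that one shape is silently skipped.

```python
import re

# Matches "<count>: <ranking>", e.g. "35: A>B>C" (hypothetical pattern).
BALLOT_LINE = re.compile(r"^\s*(\d+)\s*:\s*(.+?)\s*$")

def parse_ballot_lines(text):
    """Return a list of (count, [candidates in preference order]) tuples."""
    bundles = []
    for line in text.splitlines():
        match = BALLOT_LINE.match(line)
        if match is None:
            continue  # not a "count: ranking" line; ignored in this sketch
        count = int(match.group(1))
        ranking = [name.strip() for name in match.group(2).split(">")]
        bundles.append((count, ranking))
    return bundles

example = """35: A>B>C
25: B>A>C
40: C>B>A"""

bundles = parse_ballot_lines(example)
total_voters = sum(count for count, _ in bundles)
```

For the center-squeeze example, `bundles` holds three entries and `total_voters` is 100. Equal rankings, ratings, and any other syntax are deliberately out of scope here.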

Writing an importer will be required for all software, whether it's an importer for a piece of JSON with an unfamiliar data structure or a piece of text with an unfamiliar structure.

carlschroedl commented 3 years ago

It is valuable to standardize how humans and machines exchange information. It is valuable to standardize how machines exchange information. The two standards can be different.

Since some well-resourced past efforts already produced official standards for exchanging many types of election data between machines, I suggest that we focus our sparse volunteer time on providing unique value through standardizing human-machine exchange.

robla commented 3 years ago

ABIF is decidedly a text format at this point. Moreover, it's decidedly a line-oriented text format. I think discussions about how the text format may (or may not) translate into data structures should move over to issue #15 ("Define a core data model for ABIF").