electorama / abif

The _Aggregated Ballot Information Format_ provides a concise, aggregated, text-based document to describe the ballots cast in range-based or ranked elections, as well as approval-based and choose-one balloting systems.

Quantity delimiter: Asterisk or colon? #3

Closed robla closed 3 years ago

robla commented 3 years ago

We need to pick a preferred delimiter between quantities and the ballot ordering/rating in each line of an ABIF file. Jan Šimbera suggested on the EM list that we should consider allowing asterisk ("*") as an optional replacement for colon (":") for Pivot compatibility. See [EM] Ballot Data Format for Jan's original message.

In my mind, we may possibly also include an optional delimiter. My current preference for what a compliant implementation needs to support, expressed in IETF terms:

| ... | reader/parser | writer |
| --- | --- | --- |
| colon | MUST | SHOULD |
| asterisk | MAY | SHOULD CONSIDER |
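
As an illustration of the compliance levels in the table above, here is a minimal Python sketch. The names `read_quantity`, `write_quantity`, and `allow_asterisk` are hypothetical and do not come from any published ABIF tool; the sketch only assumes that a line starts with an integer quantity followed by one delimiter.

```python
import re

# Hypothetical helpers for illustration only; not part of any ABIF library.
# A line is assumed to be: <integer quantity> <":" or "*"> <ballot text>.
_QTY_LINE = re.compile(r"^\s*(\d+)\s*([:*])\s*(.+?)\s*$")


def read_quantity(line, allow_asterisk=False):
    """Split a ballot line into (quantity, ballot_text).

    The colon is always accepted (the MUST row for readers).
    The asterisk is accepted only when the caller opts in (the MAY row).
    """
    match = _QTY_LINE.match(line)
    if match is None:
        raise ValueError(f"no quantity delimiter found in {line!r}")
    count, delimiter, ballot = match.groups()
    if delimiter == "*" and not allow_asterisk:
        raise ValueError("asterisk delimiter is not enabled for this reader")
    return int(count), ballot


def write_quantity(count, ballot):
    """Emit the colon form, which is what writers SHOULD produce."""
    return f"{count}:{ballot}"
```

A reader built this way can round-trip an asterisk-delimited line into the colon form by calling `write_quantity(*read_quantity(line, allow_asterisk=True))`.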

My concerns with using asterisk:

  1. I really want to be able to strip whitespace from this format for everything outside of square brackets (i.e. "[Jan Šimbera]" should be okay, but outside of square brackets, Jan as a candidate should not have spaces, so "JanŠimbera" might be an acceptable candidate token, and "J" almost certainly will be). For colon, it's easy to cram a line together ("27:DGM/5,SBJ/2,SY/1,AM/0") but for asterisk, it gets difficult to see what is happening ("27*DGM/5,SBJ/2,SY/1,AM/0").

  2. It's difficult (for me) not to automatically try to apply the order of operations when I see asterisk in software. An asterisk needs to be surrounded by spaces to make it clear that it's not intended to be used as a footnote (like the dagger † and double-dagger ‡ frequently are as well).  Many fonts display asterisk as superscripted, which makes it a difficult-to-read replacement for the multiplication symbol ("×") for non-programmers.
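
The whitespace rule described in the first concern can be sketched in Python as follows. This is only an assumption about how such stripping might work: `strip_outside_brackets` is a hypothetical name, and the sketch treats square brackets as non-nesting, which the thread does not specify.

```python
def strip_outside_brackets(line):
    """Remove whitespace outside [...] and keep bracketed text untouched.

    Assumption for this sketch: square brackets do not nest.
    """
    out = []
    inside = False
    for ch in line:
        if ch == "[":
            inside = True
        elif ch == "]":
            inside = False
        # Drop whitespace only when we are not inside a bracketed name.
        if not inside and ch.isspace():
            continue
        out.append(ch)
    return "".join(out)
```

For example, `strip_outside_brackets("27 : [Jan Šimbera]/5, SBJ/2")` keeps the space inside the bracketed name and removes the others.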

simberaj commented 3 years ago

I agree with preferring the colon for output and with the proposed IETF term classification; the reasoning above supports it very well.

brainbuz commented 3 years ago

I would stick with the colon. Allowing alternative notation uses up symbols that might be needed for something in the future and creates more work for programmers implementing the format.

carlschroedl commented 3 years ago

Hi! I am one of the main Pivot Libre contributors. I have no strong preference for asterisks over colons. I find them similarly intuitive delimiters. It wouldn't be hard to adjust Pivot Libre software to use a colon. I have a moderate preference for the standard to use one delimiter to make it easier to implement parsers.

simberaj commented 3 years ago

The votelib parser currently supports both, but as long as Pivot Libre compatibility is not threatened, I'm OK with sticking to the colon only.

endolith commented 3 years ago

See https://github.com/pivot-libre/pivot-libre.github.io/issues/7

robla commented 3 years ago

Per https://github.com/pivot-libre/pivot-libre.github.io/issues/7, we should probably resolve this reasonably quickly. I plan to completely make up my mind on this issue by EOD November 7 (end of this coming Sunday in Pacific Time, which will be "PST" rather than "PDT", so set your clocks accordingly). I may make my mind up sooner than that, and resolve this issue (#3) here on GitHub. My current preference is that colon (":") is the strongly preferred delimiter, and that others (spaces, tabs, asterisks, etc.) are optional. Just what "optional" means is left as an exercise for the reader. Speaking of reading, I'll spell out my wordy rationale below.

Rationale for strongly-discouraged optional features

Here's some robla-biased history of the W3C, IETF and XHTML which I could go on for a very long time about. One thing that the IETF used to frequently publish was "considered harmful" documents (but W3C published at least one). Basically, when the IETF published a "considered harmful" document about a feature, it meant "don't use it".

The IETF created a sarcastic taxonomy of compliance levels with specific features ("RFC 6919"), which I referenced in the description of this issue. They also published a more serious document ("RFC 2119"), which I also referenced. No one has published a ranking (or a set of ratings) associated with the following IETF capitalized keywords:

My sense of things is that an ABIF file REALLY SHOULD use colon (":"). However, "MUST" seems too strong, because I'm not going to come to your house and bust your kneecaps if you accidentally forget to put ":" in a message to EM-list with an "ABIF" example in it. Is leaving colons out bad? No, most readers will be able to see what's going on. If one is writing an ABIF implementation, should it not accept colon (":") as a delimiter between ballot quantities and ballot rankings (or ratings, as the case may be)? I would consider it "harmful" to say that such an implementation is "ABIF-compliant", for what it's worth. I've seen way too many text files with rankings in them over the years (which use ":" as the delimiter) and way too much software offers ":" as the delimiter between quantity and rankings.

What about spaces/tabs/whitespace?

I think that should be "STRONGLY DISCOURAGED". However, I guess there's some old-school files that were created before ABIF was a thing that ... well.... sure....whatevs. Don't worry about fixing those.

brainbuz commented 3 years ago

The colon has been used in a lot of the discussion, and I see no reason to prefer something else. I'm also proposing to use * for a different purpose.

I started writing a draft for use as a starting point for the first draft of the standard. It's not ready to be a PR, but I invite you to look at it. One place I did go out on a limb was to propose the * as punctuation to indicate a weighted ballot in place of the '/' for Cardinal Ballots.

You can see my draft for a draft in my fork of ABIF.

https://github.com/brainbuz/abif

robla commented 3 years ago

Thanks for highlighting one of the bigger substantive changes in your fork, @brainbuz. Keeping this issue focused on "what should be the delimiter between ballot quantity and ballot rankings/ratings for a group of ballots?": I think I agree that ":" should be strongly preferred over other delimiters. The only question is whether we break compatibility with Pivot Libre (per https://github.com/pivot-libre/pivot-libre.github.io/issues/7 ) to implement your suggestion. If Pivot Libre standardizes on colon (":") and deprecates use of asterisk ("*") for this delimiter, then it becomes easier to use asterisk for other purposes (like ratings, as you suggest). More food for thought.

brainbuz commented 3 years ago

As the author of another voting library that has formats which will be deprecated the moment ABIF becomes official, I do not think backwards compatibility with other formats should be a concern.

robla commented 3 years ago

> As the author of another voting library that has formats which will be deprecated the moment ABIF becomes official, I do not think backwards compatibility with other formats should be a concern.

Per my new thread (#21) over in "Discussions" titled "Backwards compatibility", I think backwards compatibility should be a concern. We can have that philosophical discussion over there.

Speaking specifically about asterisk ("*") vs colon (":"), I would love for ABIF to have mutual compatibility with @carlschroedl 's and @simberaj 's software. Both of them seem to be suggesting that compatibility with asterisk can be broken. I'd love to get their opinion as to whether it should be broken.

simberaj commented 3 years ago

I am in favor of removing the asterisk support. It does not seem to be widely used nor intuitive to me. If anything, I would consider having the colon optional.

carlschroedl commented 3 years ago

I now favor removing the asterisk as a quantity delimiter given there's a chance it could be used for something else. It wouldn't be hard for Pivot to update its code and data to accommodate.

robla commented 3 years ago

I'm going to resolve this. The spec that I write for ABIF is almost certainly going to use colon (:) and not allow asterisk (*) as a delimiter between ballot quantities and candidate rankings/ratings.
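
To make the colon-only outcome concrete, here is a rough Python sketch of a strict reader for a rated line such as the "27:DGM/5,SBJ/2,SY/1,AM/0" example earlier in this thread. The function name `parse_rated_line` is hypothetical, and the sketch assumes (based only on that example) that "," separates candidates and "/" separates a candidate token from an integer rating; the eventual spec may differ.

```python
def parse_rated_line(line):
    """Parse '<count>:<cand>/<rating>,...' into (count, {cand: rating}).

    Only the colon is accepted as the quantity delimiter, so a line
    written with an asterisk is rejected with ValueError.
    """
    count_text, sep, ballot = line.partition(":")
    count_text = count_text.strip()
    if not sep or not count_text.isdecimal():
        raise ValueError(f"expected '<count>:' at the start of {line!r}")

    ratings = {}
    for item in ballot.split(","):
        # Split on the last '/' so the candidate token keeps any earlier ones.
        token, slash, rating = item.strip().rpartition("/")
        if not slash or not token:
            raise ValueError(f"expected '<candidate>/<rating>' in {item!r}")
        ratings[token.strip()] = int(rating)
    return int(count_text), ratings
```

Rejecting the asterisk outright, rather than silently accepting it, leaves that symbol free for other uses, which is the reason given above for dropping it.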