Open robla opened 3 years ago
I echo @simberaj's call for a BNF (or similar) specification.
I have attempted to write a parser implementation fully compliant with the current ABIF test suite. It can be found at https://github.com/simberaj/votelib/blob/abif/votelib/io/abif.py; the test suite is at https://github.com/simberaj/votelib/blob/abif/tests/io/test_abif.py. The loaders produce Python dictionaries, used in the rest of the library to represent votes, that map preferences to counts. While implementing the spec, I made a couple of assumptions that seem natural to me and IMHO should be included in the spec; I'm listing those below. Wherever I was not so sure, I'm opening a new issue in this tracker.
- Only letters (`[A-Za-z]`) are allowed as candidate tokens, i.e. no spaces.

Any comments on the implementation (as opposed to the spec itself) are more than welcome in votelib's issue tracker at https://github.com/simberaj/votelib/issues/51!
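As an illustration of the dictionary representation and the candidate-token assumption mentioned in this comment, here is a minimal sketch. It is not votelib's actual loader: the function names `parse_vote_line` and `load_votes` are hypothetical, the `count: ranking` line shape and the use of one or more letters per token (`[A-Za-z]+`) are assumptions, and ties are represented as frozensets purely for the example.

```python
import re
from collections import Counter

# Assumption: a candidate token is one or more ASCII letters, no spaces.
TOKEN = re.compile(r"[A-Za-z]+")


def parse_vote_line(line):
    """Parse a hypothetical 'count: A > B = C' line into (ranking, count)."""
    count_text, _, ranking_text = line.partition(":")
    count = int(count_text)
    ranking = []
    for tier in ranking_text.split(">"):
        # Candidates joined by '=' share the same rank.
        tied = [token.strip() for token in tier.split("=")]
        for token in tied:
            if not TOKEN.fullmatch(token):
                raise ValueError(f"invalid candidate token: {token!r}")
        ranking.append(frozenset(tied) if len(tied) > 1 else tied[0])
    return tuple(ranking), count


def load_votes(lines):
    """Map each distinct ranking to the total number of ballots casting it."""
    votes = Counter()
    for line in lines:
        ranking, count = parse_vote_line(line)
        votes[ranking] += count
    return dict(votes)
```

Under these assumptions, `load_votes(["3: A > B", "2: B > A", "1: A > B"])` merges the two identical rankings into one key, and a token containing a space (such as `A B`) is rejected with a `ValueError`.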
I also wrote a parser using Lark. It took me a little longer than I would have liked, mostly because I got distracted by other matters rather than because of any problem with Lark. I'm happy with the direction this takes my work in. Lark is a Python library that uses EBNF as its input. Though Lark is specific to Python, EBNF itself is language-agnostic, which satisfies the request for a BNF specification associated with ABIF.
The first version of the EBNF is available here: https://github.com/electorama/abif/blob/main/abif-v0.01.ebnf
Hi @robla, I have finally gone through your EBNF. A great job indeed! It seems this could serve as an important part of the ABIF specification, but I have some questions regarding minor differences between it and my understanding as reflected in the votelib implementation:
- The EBNF has `=A: [Vít Rakušan]`, while votelib expects `[Vít Rakušan]: A`, which is IMHO simpler while retaining the benefit of being able to determine the line type from the first byte.
- `A, B > C = D`: is this desired, and if so, what is the intended semantics?

Some formal remarks:
- Rename the `abifline` token to something reflective of the fact that it might span multiple lines.

@simberaj - Thanks for the reminder to work on ABIF! I've let myself get distracted by other matters (e.g. I started doing some serious Perl development for the first time in years). I think I understand what you're suggesting, but I'd like to work through the following steps before making changes:
- [...] `abif.ebnf` file) so that running `pytest` from the top-level directory doesn't return any failures. Of course, there will be invalid ABIF files in the "testcases/" directory, but that's kind of the point of a good test suite.

Seriously, though, I'll take a closer look at things later. I've got a few other things to take care of in my personal life that are interfering with my Python-scripting time, but I'll hopefully have more time very soon. We should also work out how issue #16 is going to work, so that I can start accepting pull requests.
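To illustrate the point raised earlier in this thread about determining the line type from the first byte, here is a minimal sketch that recognizes both candidate-line conventions under discussion. The function name `classify_line` and the returned labels are hypothetical, and the rule that a vote line starts with a digit (a ballot count) is an assumption, not something taken from the EBNF or from votelib.

```python
def classify_line(line):
    """Guess the type of a line from its first non-blank character."""
    stripped = line.lstrip()
    if not stripped:
        return "blank"
    first = stripped[0]
    if first == "=":
        # Convention from the EBNF: =A: [Vít Rakušan]
        return "candidate (EBNF style)"
    if first == "[":
        # Convention expected by votelib: [Vít Rakušan]: A
        return "candidate (votelib style)"
    if first in "0123456789":
        # Assumption: vote lines begin with a ballot count.
        return "votes"
    return "unknown"
```

Either convention keeps the dispatch to a single character lookup, which is the benefit mentioned in the comment; the sketch says nothing about which of the two the specification should adopt.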
Folks who have been around the electoral reform community for a while seem pretty interested in this format, but as of 2021-06-06, there isn't really a specification. There are only a few wiki pages, some test cases and a few online discussions about the format. It's really similar to ad hoc formats that have been around for 25 years or so, but we should actually have a written record of what we're up to.