Closed AddressXception closed 4 years ago
Some additional thoughts on this, since what we're building is the near-mythical "public bulletin board" that shows up in every crypto-voting-scheme ever made.
We may have a million or more encrypted ballots, and we want "random access" to them (e.g., a web server that you give the ID of a ballot and it returns the ciphertext). This suggests that we cannot simply write out a single, enormous JSON file with a list of the ballots. However, we could definitely write out one file per ballot.
Rather than naming each file by its ballot ID alone (e.g., 123456.json), you could store them as 12/34/123456.json.
If we stick with JSON for the on-disk representation, which is perfectly reasonable, we probably want to compress it. Python has support for several different compression algorithms (https://docs.python.org/3/library/archiving.html). Security issues could crop up here if the compression code is written in C, so some advance auditing would be relevant. On the other hand, if we go with a binary format (msgpack, protobufs, etc.), then this issue goes away.
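To make the layout concrete, here is a minimal sketch of one-file-per-ballot storage with a two-level directory prefix and compressed JSON. The function names (ballot_path, write_ballot, read_ballot), the ballot dict shape, and the choice of gzip are all assumptions for illustration, not anything decided in this thread; note that CPython's gzip module sits on top of zlib, which is C code, so the auditing concern above applies to it.

```python
import gzip
import json
from pathlib import Path


def ballot_path(root: Path, ballot_id: str) -> Path:
    """Map a ballot ID to a sharded path, e.g. 123456 -> 12/34/123456.json.gz."""
    if len(ballot_id) < 4:
        # Assumption: IDs are at least 4 characters; shorter IDs would need padding.
        raise ValueError("ballot_id needs at least 4 characters")
    return root / ballot_id[:2] / ballot_id[2:4] / f"{ballot_id}.json.gz"


def write_ballot(root: Path, ballot_id: str, ballot: dict) -> Path:
    """Serialize one encrypted ballot as gzip-compressed JSON."""
    path = ballot_path(root, ballot_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    # sort_keys gives a stable JSON encoding for the same ballot contents.
    data = json.dumps(ballot, sort_keys=True).encode("utf-8")
    path.write_bytes(gzip.compress(data))
    return path


def read_ballot(root: Path, ballot_id: str) -> dict:
    """Random access: look up a single ballot by ID without scanning anything."""
    raw = ballot_path(root, ballot_id).read_bytes()
    return json.loads(gzip.decompress(raw))
```

A web server could call read_ballot(root, "123456") directly, or simply serve the file at the sharded path as static content.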
So those JSON files have individual encrypted ballots. We still need all the metadata. That probably goes in a "main" JSON file of some sort, which then includes SHA256 hashes of the individual encrypted ballot files. The main JSON file could itself then be digitally signed with conventional tools, or maybe the hash of the main file is published by the election officials and we're done. No need for digital signatures at all?
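A rough sketch of the "main" file idea, assuming a file named manifest.json at the root of the ballot tree that maps each ballot file's relative path to its SHA-256 hash. The file name, the manifest layout, and the helper names are hypothetical; the digest returned by write_manifest is the single value election officials would publish (or sign with whatever conventional tool is chosen).

```python
import hashlib
import json
from pathlib import Path

MANIFEST_NAME = "manifest.json"  # hypothetical name for the "main" file


def sha256_file(path: Path) -> str:
    """Hex SHA-256 digest of a file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def write_manifest(root: Path, metadata: dict) -> str:
    """Hash every ballot file under root, write the manifest, return its hash."""
    ballots = {
        p.relative_to(root).as_posix(): sha256_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file() and p.name != MANIFEST_NAME
    }
    manifest = {"metadata": metadata, "ballots": ballots}
    data = json.dumps(manifest, sort_keys=True, indent=2).encode("utf-8")
    (root / MANIFEST_NAME).write_bytes(data)
    # This digest covers the exact bytes written to disk.
    return hashlib.sha256(data).hexdigest()


def verify_ballot(root: Path, relative_path: str) -> bool:
    """Check one ballot file against the hash recorded in the manifest."""
    manifest = json.loads((root / MANIFEST_NAME).read_bytes())
    expected = manifest["ballots"].get(relative_path)
    return expected is not None and sha256_file(root / relative_path) == expected
```

A verifier would first compare the SHA-256 of manifest.json against the published value, then call verify_ballot for whichever ballots it fetched. With a million ballots the manifest itself gets large, so a hash tree might be a better fit; that is outside this sketch.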
Determine the proper format for representing output data from the system.
This was brought up as part of this: https://github.com/microsoft/electionguard-python/pull/1#discussion_r412531622
One consideration is the representation of ElementModP and ElementModQ, which will require some custom parsing to properly represent these values.
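As one possible shape for that custom parsing, the sketch below round-trips big-integer elements through JSON as tagged hex strings. The two classes here are simplified stand-ins that only wrap an int, not the real ElementModP and ElementModQ from the codebase, and the {"type", "hex"} encoding is an assumption. Hex strings are used because some JSON parsers (JavaScript's, for example) read numbers as 64-bit floats and would silently lose precision on values this large.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementModP:
    """Stand-in for the real class: wraps an int, no modulus checks."""
    value: int


@dataclass(frozen=True)
class ElementModQ:
    """Stand-in for the real class: wraps an int, no modulus checks."""
    value: int


_TAGS = {"ElementModP": ElementModP, "ElementModQ": ElementModQ}


class ElementEncoder(json.JSONEncoder):
    """Encode elements as {"type": <class name>, "hex": <hex digits>}."""

    def default(self, o):
        if isinstance(o, (ElementModP, ElementModQ)):
            return {"type": type(o).__name__, "hex": format(o.value, "x")}
        return super().default(o)


def element_hook(obj: dict):
    """object_hook for json.loads: rebuild elements from tagged dicts."""
    tag = obj.get("type")
    cls = _TAGS.get(tag) if isinstance(tag, str) else None
    if cls is not None and set(obj) == {"type", "hex"} and isinstance(obj["hex"], str):
        return cls(int(obj["hex"], 16))
    return obj  # leave every other JSON object untouched
```

Usage would look like json.dumps(data, cls=ElementEncoder) on the way out and json.loads(text, object_hook=element_hook) on the way in. A real version would also want to reject values outside the valid range for P or Q when parsing.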