Syrup is a lightweight and easy-to-develop (and reasonably easy to read) serialization of [[https://preserves.gitlab.io/preserves/][Preserves]]. You can also think of it as an extension of the ideas of [[https://people.csail.mit.edu/rivest/Sexp.txt][canonical s-expressions]] and [[https://en.wikipedia.org/wiki/Bencode][Bencode]].
For the sake of simplicity, let's look at examples in Python.
We've got bytestrings and strings and integers and booleans:
from syrup import syrup_encode, syrup_decode, Symbol
syrup_encode(b"a bytestring") b'12:a bytestring'
syrup_encode("a string") b'8"a string'
syrup_encode(Symbol('foo')) b"3'foo"
syrup_encode(42) b'42+' syrup_encode(0) b'0+' syrup_encode(-123) b'123-'
syrup_encode(123.456) b'D@^\xdd/\x1a\x9f\xbew'
syrup_encode(True) b't' syrup_encode(False) b'f'
+END_SRC
But maybe you want to combine things. So ok, we have lists, sets, and dictionaries:
syrup_encode(["foo", 123, True]) b'[3"foo123+t]'
syrup_encode({"species": "cat", ... "name": "Tabatha", ... "age": 12}) b'{3"age12+4"name7"Tabatha7"species3"cat}'
syrup_encode({"cookie", "milk", "napkin"}) b'#4"milk6"cookie6"napkin$'
+END_SRC
When reading, whitespace is ignored (but we recommend using the official [[https://preserves.gitlab.io/preserves/][Preserves]] syntax if you're trying to make things human-readable).
syrup_decode(b'{3"age 12+ 4"name 7"Tabatha 7"species 3"cat}') {'age': 12, 'name': 'Tabatha', 'species': 'cat'}
+END_SRC
Arbitrary nesting is allowed, including in keys and sets, if the platform supports it.
Sets and dictionaries are both ordered, dictionaries by keys. Items and keys are sorted by comparing them in sorted form. (This might make Tony Garnock-Jones say, "yuck!") But in this way, if two applications both agree to use Syrup, it is an acceptable form for canonicalization.
Maybe this seems almost good enough, but you need your own types. For this purpose, Syrup also provides a Record type:
from syrup import record
syrup_encode(record('isodate', '2020-05-01T14:08:11')) b'<4"date19"2020-05-01T14:08:11>'
syrup_encode(record('date', 2020, 5, 1, 14, 8, 11)) b'<4"date2020+5+1+14+8+11+>'
+END_SRC
Here's nearly everything you need to know, taken right from a comment in the Racket implementation:
;; Booleans: t or f
;; Single flonum: F
(Sorry, records look a bit confusing there, since =<>= are actually used rather than just a placeholder for a variable.)
There's only one other key detail. Writing out a Syrup structure should always canonicalize it. The good news: this is fairly easy to do via recursion, since only dictionaries and sets are unordered. Dictionaries are ordered by their keys, sets by their items (and dictionaries must not include the same key twice). Simply write out the keys/items first, then sort them by the bytes, from lower to higher.
That's it, really. Easy-peasy.
In comparison to [[https://people.csail.mit.edu/rivest/Sexp.txt][canonical s-expressions]], syrup uses similar syntax when limited to lists and bytestrings, except that in writing uses =[]= rather than =()=. But as you can see above, it also defines a lot of other types, too.
Syrup is also similar to [[https://en.wikipedia.org/wiki/Bencode][Bencode]], in that it supports integers, bytestrings, lists, and dictionaries. However its syntax for lists and dictionaries are different when syrup is written.
For funsies, the syrup decoders that ship in this repository will accept Bencode or canonical s-expression syntax (except for canonical s-expression "display hints" syntax, but those never made sense anyway... use records instead).
syrup_decode(b'd3:agei12e4:name5:Missy7:species3:cate') {b'age': 12, b'name': b'Missy', b'species': b'cat'}
+END_SRC
But, Syrup uses ={}= instead of =de= for dictionaries when encoding itself.
syrup_encode({b'age': 12, b'name': b'Missy', b'species': b'cat'}) b'{3:agei12e4:name5:Missy7:species3:cat}'
+END_SRC
Implementations in [[file:./impls/][impls/]] subdirectory:
External implementations:
Apache v2