Closed jessecarl closed 7 years ago
Consider the following:
- The Since value (using the time.Time.MarshalBinary() implementation) is 15 bytes
- As in the time.Time implementation, a version frame would be beneficial
- UnmarshalBinary()
- The time.Time implementation is nice, but are we willing to rely on it not changing? It would need to remain stable for other language implementations to work.
- time.Time is big-endian, and most network protocols are as well, so we should probably be big-endian in our implementation.

Need to consider the testing strategy we want to employ. I'm inclined to emulate the strategy employed by the time package: test Marshal to Unmarshal, test known bad Marshal, and test bad Unmarshal.
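To make the size and round-trip points concrete, here is a minimal sketch of the kind of check I have in mind. The helper names (sample, roundTrip, rejectsTruncated) are made up for illustration. It assumes a UTC timestamp; as far as I know, newer Go releases can emit a 16-byte form for zone offsets that are not whole minutes, which is exactly the kind of change we would be relying on not happening.

```go
package main

import (
	"fmt"
	"time"
)

// sample returns a fixed UTC timestamp (hypothetical test value).
func sample() time.Time {
	return time.Date(2017, time.March, 4, 5, 6, 7, 8, time.UTC)
}

// roundTrip marshals t, unmarshals the bytes again, and reports the
// encoded length and whether the decoded time equals the original.
func roundTrip(t time.Time) (n int, same bool, err error) {
	b, err := t.MarshalBinary()
	if err != nil {
		return 0, false, err
	}
	var got time.Time
	if err := got.UnmarshalBinary(b); err != nil {
		return len(b), false, err
	}
	return len(b), got.Equal(t), nil
}

// rejectsTruncated reports whether UnmarshalBinary refuses an encoding
// with its last byte cut off (a "bad Unmarshal" case).
func rejectsTruncated(t time.Time) bool {
	b, err := t.MarshalBinary()
	if err != nil {
		return false
	}
	var got time.Time
	return got.UnmarshalBinary(b[:len(b)-1]) != nil
}

func main() {
	n, same, err := roundTrip(sample())
	fmt.Println(n, same, err) // length is 15 for a UTC time
	fmt.Println(rejectsTruncated(sample()))
}
```

If we do depend on the 15-byte form, a length check like this in our own tests would at least tell us when the encoding changes underneath us.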
Let's consider using the following for the magic numbers in our format here: 0x90, 0xe9. I'm sure I'm not the only one to use two consecutive Fibonacci numbers like 144, 233, but I couldn't think of anything better.
My current thinking:
0:1 -- the magic bytes 0x90 0xe9
2 -- the version 0x00 for the first version
3 -- length of FriendlyName (max of 212 to fit the whole message in 256 bytes?)
4 -- XOR of the header so far
5:7 -- padding or offset
8:23 -- ID
24 -- Status
25:39 -- Since (using time.Time.MarshalBinary)
40:251 -- FriendlyName (utf8 encoded)
252:255 -- Adler-32 checksum of message so far (big-endian)

This is probably more costly than a naive implementation. The addition of magic bytes provides a quick check that the message is supposed to be a Resource. The version, which may be split in order to provide distinct actions (New, Save, Fetch, etc.), allows a decoder to know how to handle the rest of the message. Put the length of the FriendlyName in the header because it is the only variable-length property of the Resource. Do a simple XOR checksum of the header so far for a quick sanity check. Add some padding to the end of the header as another sanity check, allowing the message itself to start after exactly eight bytes. After the message data, add the Adler-32 checksum for a final sanity check and a maximum message size of 256 bytes.
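A rough sketch of how that framing could be written and checked, to see how much work it really is. Assumptions that are mine and not settled anywhere above: the Resource field types (16-byte ID, one-byte Status), zero bytes for the padding, and the checksum following the name bytes directly, so that it only lands at 252:255 when FriendlyName is at its maximum length. The function names encodeFramed and checkFramed are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/adler32"
	"time"
)

const (
	magic0     = 0x90
	magic1     = 0xe9
	version0   = 0x00
	nameOffset = 40  // FriendlyName starts after header, ID, Status, Since
	maxNameLen = 212 // keeps the whole message within 256 bytes
	sumLen     = 4   // trailing Adler-32
)

// Resource mirrors the fields named in the layout; the types are assumed.
type Resource struct {
	ID           [16]byte
	Status       byte
	Since        time.Time
	FriendlyName string
}

func sampleResource() Resource {
	return Resource{ID: [16]byte{1, 2, 3}, Status: 1,
		Since: time.Date(2017, time.March, 4, 5, 6, 7, 8, time.UTC), FriendlyName: "printer"}
}

func encodeFramed(r Resource) ([]byte, error) {
	if len(r.FriendlyName) > maxNameLen {
		return nil, errors.New("FriendlyName longer than 212 bytes")
	}
	since, err := r.Since.MarshalBinary()
	if err != nil {
		return nil, err
	}
	if len(since) != 15 {
		return nil, errors.New("unexpected time encoding length")
	}
	buf := make([]byte, 0, 256)
	buf = append(buf, magic0, magic1, version0, byte(len(r.FriendlyName))) // 0:3
	buf = append(buf, buf[0]^buf[1]^buf[2]^buf[3])                         // 4: XOR of header so far
	buf = append(buf, 0, 0, 0)                                             // 5:7 padding
	buf = append(buf, r.ID[:]...)                                          // 8:23
	buf = append(buf, r.Status)                                            // 24
	buf = append(buf, since...)                                            // 25:39
	buf = append(buf, r.FriendlyName...)                                   // 40:
	var sum [sumLen]byte
	binary.BigEndian.PutUint32(sum[:], adler32.Checksum(buf))
	return append(buf, sum[:]...), nil
}

// checkFramed runs the sanity checks in order of increasing cost.
func checkFramed(b []byte) error {
	if len(b) < nameOffset+sumLen {
		return errors.New("message too short")
	}
	if b[0] != magic0 || b[1] != magic1 {
		return errors.New("bad magic bytes")
	}
	if b[4] != b[0]^b[1]^b[2]^b[3] {
		return errors.New("bad header XOR")
	}
	if len(b) != nameOffset+int(b[3])+sumLen {
		return errors.New("length does not match FriendlyName length")
	}
	body := b[:len(b)-sumLen]
	if adler32.Checksum(body) != binary.BigEndian.Uint32(b[len(b)-sumLen:]) {
		return errors.New("bad checksum")
	}
	return nil
}

func main() {
	msg, err := encodeFramed(sampleResource())
	if err != nil {
		panic(err)
	}
	fmt.Println(len(msg), checkFramed(msg))
	msg[30] ^= 0xff // corrupt one byte of Since
	fmt.Println(checkFramed(msg))
}
```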
Again, this is probably overkill for most applications, but the binary format is most likely going to be used on less reliable hardware and networks in embedded devices.
Use CRC32 instead of the Adler-32 checksum, which is not ideal for small messages. With a message this small, speed shouldn't be much of an issue.
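For reference, swapping the trailer to CRC32 is a small change with the standard library. This is only a sketch: appendCRC32 and verifyCRC32 are hypothetical helper names, and I am assuming the IEEE polynomial and the same big-endian, 4-byte trailer as before.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

// appendCRC32 appends the big-endian CRC-32 (IEEE) of msg to msg.
func appendCRC32(msg []byte) []byte {
	var sum [4]byte
	binary.BigEndian.PutUint32(sum[:], crc32.ChecksumIEEE(msg))
	return append(msg, sum[:]...)
}

// verifyCRC32 recomputes the checksum over everything but the last
// four bytes and compares it with the stored trailer.
func verifyCRC32(framed []byte) bool {
	if len(framed) < 4 {
		return false
	}
	body := framed[:len(framed)-4]
	return crc32.ChecksumIEEE(body) == binary.BigEndian.Uint32(framed[len(framed)-4:])
}

func main() {
	framed := appendCRC32([]byte("123456789"))
	fmt.Printf("%x\n", framed[len(framed)-4:]) // cbf43926, the standard CRC-32 check value
	fmt.Println(verifyCRC32(framed))
	framed[0] ^= 0x01 // flip a single bit
	fmt.Println(verifyCRC32(framed))
}
```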
Step back a second.
We'll likely implement an adapter for the formats that require all this extra stuff. I'm going to pull this back again.
A second proposal:
0:1 – magic bytes
2 – version
3 – FriendlyName length
4:19 – ID
20 – Status
21:35 – Since
36: – FriendlyName utf8 bytes

I think this should provide a compact, easy to parse message.
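A minimal sketch of the second proposal as a MarshalBinary/UnmarshalBinary pair. The Resource field types are assumptions read off the byte ranges (16-byte ID, one-byte Status), and the 255-byte name limit is only what a one-byte length field allows, not something decided here.

```go
package main

import (
	"errors"
	"fmt"
	"time"
	"unicode/utf8"
)

// fixedLen covers magic+version+name length (4), ID (16), Status (1), Since (15).
const fixedLen = 4 + 16 + 1 + 15

// Resource uses assumed field types matching the byte ranges in the layout.
type Resource struct {
	ID           [16]byte
	Status       byte
	Since        time.Time
	FriendlyName string
}

func sampleResource() Resource {
	return Resource{ID: [16]byte{0xde, 0xad}, Status: 2,
		Since: time.Date(2017, time.March, 4, 5, 6, 7, 8, time.UTC), FriendlyName: "front door"}
}

func (r Resource) MarshalBinary() ([]byte, error) {
	if len(r.FriendlyName) > 255 {
		return nil, errors.New("FriendlyName does not fit a one-byte length")
	}
	if !utf8.ValidString(r.FriendlyName) {
		return nil, errors.New("FriendlyName is not valid UTF-8")
	}
	since, err := r.Since.MarshalBinary()
	if err != nil {
		return nil, err
	}
	if len(since) != 15 {
		return nil, errors.New("unexpected time encoding length")
	}
	buf := make([]byte, 0, fixedLen+len(r.FriendlyName))
	buf = append(buf, 0x90, 0xe9, 0x00, byte(len(r.FriendlyName))) // 0:3
	buf = append(buf, r.ID[:]...)                                  // 4:19
	buf = append(buf, r.Status)                                    // 20
	buf = append(buf, since...)                                    // 21:35
	buf = append(buf, r.FriendlyName...)                           // 36:
	return buf, nil
}

func (r *Resource) UnmarshalBinary(data []byte) error {
	if len(data) < fixedLen {
		return errors.New("message too short")
	}
	if data[0] != 0x90 || data[1] != 0xe9 {
		return errors.New("bad magic bytes")
	}
	if data[2] != 0x00 {
		return errors.New("unsupported version")
	}
	if len(data) != fixedLen+int(data[3]) {
		return errors.New("length does not match FriendlyName length")
	}
	if !utf8.Valid(data[fixedLen:]) {
		return errors.New("FriendlyName is not valid UTF-8")
	}
	var since time.Time
	if err := since.UnmarshalBinary(data[21:36]); err != nil {
		return err
	}
	// Only touch r once every check has passed.
	copy(r.ID[:], data[4:20])
	r.Status = data[20]
	r.Since = since
	r.FriendlyName = string(data[fixedLen:])
	return nil
}

func main() {
	in := sampleResource()
	b, err := in.MarshalBinary()
	if err != nil {
		panic(err)
	}
	var out Resource
	fmt.Println(len(b), out.UnmarshalBinary(b), out.FriendlyName)
}
```

Decoding into locals first and assigning at the end means a failed UnmarshalBinary leaves the receiver untouched, which makes the bad-data tests easier to write.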
For MarshalBinary (independent of UnmarshalBinary):
- FriendlyName
- 4 + 16 + 1 + 15 + len(FriendlyName)

For UnmarshalBinary (independent of MarshalBinary):

For MarshalBinary together with UnmarshalBinary:
Tests still need to be added to verify a few edge cases and invalid properties.
I completed a few additional test cases for bad data.
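For anyone following along, bad-data cases of this kind can be laid out as a table. This is an illustrative sketch, not the tests from the repository: checkHeader is a hypothetical stand-in that validates only the fixed part of the second proposal's layout, and the error values are made up.

```go
package main

import (
	"errors"
	"fmt"
)

// fixedLen covers magic+version+name length (4), ID (16), Status (1), Since (15).
const fixedLen = 4 + 16 + 1 + 15

var (
	errShort   = errors.New("message shorter than the fixed fields")
	errMagic   = errors.New("bad magic bytes")
	errVersion = errors.New("unsupported version")
	errLength  = errors.New("length does not match FriendlyName length")
)

// checkHeader validates framing only; it does not decode ID, Status or Since.
func checkHeader(data []byte) error {
	switch {
	case len(data) < fixedLen:
		return errShort
	case data[0] != 0x90 || data[1] != 0xe9:
		return errMagic
	case data[2] != 0x00:
		return errVersion
	case len(data) != fixedLen+int(data[3]):
		return errLength
	}
	return nil
}

// valid builds a message with a correct header and zeroed field bytes.
func valid(nameLen int) []byte {
	b := make([]byte, fixedLen+nameLen)
	b[0], b[1], b[2], b[3] = 0x90, 0xe9, 0x00, byte(nameLen)
	return b
}

type badCase struct {
	name string
	data []byte
	want error
}

func badCases() []badCase {
	wrongMagic := valid(3)
	wrongMagic[0] = 0x00
	wrongVersion := valid(3)
	wrongVersion[2] = 0x7f
	truncated := valid(3)[:fixedLen+2] // one name byte missing
	return []badCase{
		{"empty", nil, errShort},
		{"wrong magic", wrongMagic, errMagic},
		{"unknown version", wrongVersion, errVersion},
		{"truncated name", truncated, errLength},
	}
}

func main() {
	for _, c := range badCases() {
		got := checkHeader(c.data)
		fmt.Printf("%-16s ok=%v (%v)\n", c.name, got == c.want, got)
	}
}
```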
Implement the binary marshal and unmarshal methods. In order to send, receive, and store efficiently, a custom binary format would be useful. It would especially be useful for communicating with clients in other languages. It should be able to be parsed quickly in most languages.