Closed zsau closed 10 years ago
That's the problem with quick hacks: Commit b8ae7b2bda59de341153d18e976b8b54568a19bf introduced a more lenient parser for separated tokens in the codec repeated
: The last separator can be omitted. Sadly, this breaks the assumption, that the binary after the roundtrip "read->write" stays the same. The codec always writes a separator, even if it is the last element. That means given your example above the codec tries to write a 0 after writing "ghi". That results in 12 bytes overall. Since this is bigger than the padding, an error occurs.
The error message looks wrong, though. It should say "...attempting to write 1 bytes more...". I should also update the documentation of repeated
with :separator
to state, that the last separator is only optional when reading, not when writing.
The problem is: When reading for example zero ended strings, how do we know, that the last string has been read completely? Right now we just take the EOF as a signal to stop reading. When writing we could omit the last separator only if we are sure, that there is no other codec writing data afterwards. Because if there would be another one, the next read of this data will fail (this repeated
codec would continue reading past the written string boundary since there is no separator token).
In my case, I'm working with a field that has a fixed length and contains some number of null-separated strings (but no final null). I can parse it perfectly thanks to b8ae7b2bda59de341153d18e976b8b54568a19bf, but I can't write the same data with the same codec right now (and this field is buried several levels deep in the overall protocol, so it's not trivial to just swap in a different codec for writing the field).
Ideally, repeated
would omit the final separator when it runs out of space to write. Unfortunately, OutputStream
doesn't have any concept of capacity, so there's no way for padding
to communicate that limit to repeated
given the current API, right?
What if repeated
had an option to omit the final separator when writing? Alternatively, what if padding
had an option to truncate its contents when writing? Either one would allow round-tripping with this type of protocol.
Another option would be for repeated
to have two similar options, say :separator
and :terminator
, where the former means the byte comes between each pair of objects and the latter means it comes after each object.
Optionally truncating the bytes written by padding
seems to be the cleanest way, imo. Does this work for your case?
That's perfect. Thanks!
(Related to Issue #3)
I'm having trouble encoding a fixed-length string sequence while omitting the final separator:
The content should be exactly 11 bytes without the trailing null, but it seems the encoder doesn't like this.