microsoft / yardl

Tooling for streaming instrument data
https://microsoft.github.io/yardl/
MIT License

Serialized stream lacks terminating byte if you don't use a protocol Writer as a context manager in Python #137

Closed by naegelejd 3 months ago

naegelejd commented 4 months ago

In Python, the binary protocol Writer is meant to be used as a context manager, e.g.

with MyProtocolWriter(filename) as w:
    w.write...

It is also possible to use the class directly and manually call its .close() method when finished, e.g.

w = MyProtocolWriter(filename)
w.write...
w.close()

However, when the writer is used this way, the zero byte that normally terminates the stream is never written. Reading the stream later then fails unexpectedly, either with an early EOF or with an unexpected call to read a different protocol step.
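To make the difference between the two usage patterns concrete, here is a minimal sketch of how a writer could end up with this behavior. It is not yardl's generated code: `ToyStreamWriter`, `write_varint`, and the simplified unsigned encoding are all hypothetical, and the assumption that the terminator is emitted only on the context-manager exit path is just one way the symptom described above could arise.

```python
import io


def write_varint(buf, value):
    """Write an unsigned varint, 7 payload bits per byte (simplified)."""
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            buf.write(bytes([byte | 0x80]))  # more bytes follow
        else:
            buf.write(bytes([byte]))
            return


class ToyStreamWriter:
    """Hypothetical stand-in for a generated protocol writer."""

    def __init__(self, buf):
        self._buf = buf

    def write_xs(self, items):
        items = list(items)
        if items:
            write_varint(self._buf, len(items))  # block length prefix
            for item in items:
                write_varint(self._buf, item)

    def close(self):
        # Illustrates the reported symptom: close() alone does not
        # write the terminating zero byte.
        pass

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            write_varint(self._buf, 0)  # zero byte terminates the stream
        self.close()


# Manual close: no terminator is written.
manual = io.BytesIO()
w = ToyStreamWriter(manual)
w.write_xs(range(42))
w.close()

# Context manager: terminator is written on exit.
managed = io.BytesIO()
with ToyStreamWriter(managed) as w:
    w.write_xs(range(42))

# The two outputs differ only by the trailing zero byte.
assert managed.getvalue() == manual.getvalue() + b"\x00"
```

In this toy model, moving the terminator write into `close()` (and having `__exit__` simply call `close()`) would make both usage patterns produce identical bytes; whether that matches the actual fix is not stated here.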

Model:

MyProtocol: !protocol
  sequence:
    xs: !stream
      items: int

Example:

from issue.binary import BinaryMyProtocolWriter, BinaryMyProtocolReader

w = BinaryMyProtocolWriter("test.bin")
w.write_xs(list(range(42)))
w.close()

r = BinaryMyProtocolReader("test.bin")
xs = r.read_xs()
assert len(list(xs)) == 42
r.close()

Run it:

Traceback (most recent call last):
  File "/workspaces/yardl/joe/issue-#137/python/test.py", line 9, in <module>
    assert len(list(xs)) == 42
               ^^^^^^^^
  File "/workspaces/yardl/joe/issue-#137/python/issue/protocols.py", line 118, in _wrap_iterable
    yield from iterable
  File "/workspaces/yardl/joe/issue-#137/python/issue/_binary.py", line 971, in read
    while (i := stream.read_unsigned_varint()) > 0:
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/yardl/joe/issue-#137/python/issue/_binary.py", line 228, in read_unsigned_varint
    self._fill_buffer(1)
  File "/workspaces/yardl/joe/issue-#137/python/issue/_binary.py", line 299, in _fill_buffer
    raise EOFError("Unexpected EOF")
EOFError: Unexpected EOF
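
The traceback shows the reader looping on `stream.read_unsigned_varint()` until it sees a value of zero. The following self-contained sketch mimics that loop on an in-memory buffer to show why a missing terminator ends in `EOFError`. The helper names `read_unsigned_varint` and `read_stream`, and the simplified block layout (a length followed by that many unsigned varints), are assumptions for illustration and not yardl's actual `_binary.py`.

```python
import io


def read_unsigned_varint(buf):
    """Read one unsigned varint; raise EOFError if the buffer runs out."""
    result = 0
    shift = 0
    while True:
        chunk = buf.read(1)
        if not chunk:
            raise EOFError("Unexpected EOF")
        byte = chunk[0]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of this varint
            return result
        shift += 7


def read_stream(buf):
    """Yield items block by block until a zero block length is read."""
    while (count := read_unsigned_varint(buf)) > 0:
        for _ in range(count):
            yield read_unsigned_varint(buf)


# One block of three items: length 3, then the values 7, 8, 9.
body = bytes([3, 7, 8, 9])

# With the terminating zero byte, the stream reads cleanly.
terminated = list(read_stream(io.BytesIO(body + b"\x00")))

# Without it, the loop asks for another block length and hits EOF.
error = None
try:
    list(read_stream(io.BytesIO(body)))
except EOFError as exc:
    error = str(exc)
```

If the file contained a further protocol step after the stream, the same loop would instead consume that step's first bytes as a block length, which is consistent with the other failure mode mentioned in the report.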