microsoft / yardl

Tooling for streaming instrument data
https://microsoft.github.io/yardl/
MIT License

Extra \0 byte written in Python binary.StreamSerializer when value is empty #142

Closed naegelejd closed 3 months ago

naegelejd commented 3 months ago

If the user writes an empty iterable to a binary stream, the underlying StreamSerializer should not write a 0 byte, because the 0 byte that terminates a serialized stream is written elsewhere. Currently, yardl writes one anyway, so the output contains an extra \0 byte: https://github.com/microsoft/yardl/blob/7a0ab26b0a1050a2e9b2394486ea971d9e8a11d3/tooling/internal/python/static_files/_binary.py#L958-L959
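To illustrate the failure mode, here is a minimal, self-contained sketch. It is not yardl's code: it assumes a simplified format in which each write emits a varint item count followed by the items, and each stream ends with a single 0 byte. All function names (`write_block_buggy`, `write_block_fixed`, `close_stream`, `read_stream`, `roundtrip`) are hypothetical.

```python
import io


def write_varint(out, n):
    """Write a non-negative int as an unsigned LEB128 varint."""
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.write(bytes([byte | 0x80]))  # more bytes follow
        else:
            out.write(bytes([byte]))
            return


def read_varint(inp):
    """Read one unsigned LEB128 varint."""
    result = shift = 0
    while True:
        b = inp.read(1)[0]
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result
        shift += 7


def write_block_buggy(out, values):
    # Assumed behavior matching the report: the count is always written,
    # so an empty iterable emits a lone 0 byte.
    items = list(values)
    write_varint(out, len(items))
    for v in items:
        write_varint(out, v)


def write_block_fixed(out, values):
    # An empty iterable writes nothing; the terminator comes from close_stream.
    items = list(values)
    if not items:
        return
    write_varint(out, len(items))
    for v in items:
        write_varint(out, v)


def close_stream(out):
    """Write the 0 byte that terminates a stream (hypothetical helper)."""
    out.write(b"\x00")


def read_stream(inp):
    """Read blocks until the 0 terminator and return all items."""
    items = []
    while (count := read_varint(inp)) != 0:
        items.extend(read_varint(inp) for _ in range(count))
    return items


def roundtrip(write_block):
    """Write an empty stream followed by a non-empty one, then read both back."""
    out = io.BytesIO()
    write_block(out, range(0))   # first stream: empty write
    close_stream(out)
    write_block(out, [1, 2, 3])  # second stream: three values
    close_stream(out)
    inp = io.BytesIO(out.getvalue())
    return out.getvalue(), read_stream(inp), read_stream(inp)


buggy_bytes, buggy_first, buggy_second = roundtrip(write_block_buggy)
fixed_bytes, fixed_first, fixed_second = roundtrip(write_block_fixed)
```

Under these assumptions, the buggy writer produces two 0 bytes for the empty stream. The reader takes the second one as the terminator of the next stream, so that stream comes back empty. With the fixed writer, both streams round-trip.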

How to reproduce:

Change the Simple protocol round trip test to write a mixture of empty and non-empty streams. It currently writes values to every stream in the first part of the test and only "empty" iterables in the second part, so the mixed case is never exercised: https://github.com/microsoft/yardl/blob/7a0ab26b0a1050a2e9b2394486ea971d9e8a11d3/python/tests/test_protocol_roundtrip.py#L584-L608 Add the following validation:

    # mixed empty and non-empty streams
    with c() as w:
        w.write_int_data(range(0))
        w.write_optional_int_data([1, 2, None, 4, 5, None, 7, 8, 9, 10])
        w.write_record_with_optional_vector_data([])
        w.write_fixed_vector(([1, 2, 3] for _ in range(4)))

The test will fail.

Note: Adding this validation to test_simple_streams uncovered another, unrelated bug in NDJsonProtocolReader._read_json_line. That is a separate issue.