databento / dbn

Databento Binary Encoding (DBN) - Fast message encoding and storage format for market data
https://databento.com
Apache License 2.0
86 stars 8 forks source link

sync encoder does not encode using the symbol_cstr_len for DBN v2 (?) #40

Closed Hailios closed 9 months ago

Hailios commented 10 months ago

Bug Report

Expected Behavior

metadata encoder to use the metadata.symbol_cstr_len as length for the fixed length strings.

Actual Behavior

the metadata encoder picks a fixed length for the strings based on version, and ignores the symbol_cstr_len for this purpose, it does serialize symbol_cstr_len tho. to me it looks like if the user sets symbol_cstr_len to something which would be a optimal/good size for their strings, then the decoding would fail, because the decoder actually reads symbol_cstr_len and assumes that the strings are of that size.

Steps to Reproduce the Problem

  1. look here https://github.com/databento/dbn/blob/main/rust/dbn/src/encode/dbn/sync.rs#L255 to see that the size is fixed even for V2
  2. for decoding, one should note that the string size is first decoded, and then used to the code the strings https://github.com/databento/dbn/blob/main/rust/dbn/src/decode/dbn/sync.rs#L475

Specifications

threecgreen commented 10 months ago

Thanks for the bug report. I see how this could be an issue for forward compatibility.

threecgreen commented 9 months ago

Fixed in aede39b73c9858eb95ce2fd678ef0378ec38bc9e