ndsev / zserio

zero sugar, zero fat, zero serialization overhead
https://zserio.org/
BSD 3-Clause "New" or "Revised" License

Implement packed array diagnostics in zserio #475

Open MisterGC opened 1 year ago

MisterGC commented 1 year ago

Consider the following ideas to make the packed arrays feature better:

  1. Metrics or statistics to show compression efficiency: Provide information about the compression efficiency of individual objects. This would allow developers to make informed decisions about which objects should be packed and how.

  2. Option to disable array packing: for example, if it turns out that applying a generic compressor (like gzip) to the entire blob makes more sense in terms of runtime performance and space efficiency.

  3. Option to selectively pack arrays: Allow users to choose which arrays should be packed based on size and compression efficiency. This would provide better control over serialization performance and make packed arrays more efficient.
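To make the trade-off behind these three ideas concrete, here is a minimal Python sketch that compares a fixed-width encoding, a simplified delta estimate, and a generic compressor (zlib) for one integer array. The function names and the delta model are assumptions made for illustration only; this is not zserio's actual packing algorithm or wire format.

```python
import struct
import zlib


def fixed_width_bits(values, bits_per_element=32):
    """Size in bits when every element is stored at full width."""
    return len(values) * bits_per_element


def delta_packed_bits(values, bits_per_element=32):
    """Simplified delta estimate (illustrative model, not zserio's format):
    the first element is stored at full width, every following element as a
    delta using just enough bits for the largest delta plus one sign bit."""
    if len(values) < 2:
        return fixed_width_bits(values, bits_per_element)
    max_delta = max(abs(b - a) for a, b in zip(values, values[1:]))
    delta_bits = max_delta.bit_length() + 1  # +1 for the sign
    return bits_per_element + (len(values) - 1) * delta_bits


def zlib_bits(values):
    """Size in bits after zlib-compressing the raw 32-bit little-endian data."""
    raw = struct.pack(f"<{len(values)}i", *values)
    return len(zlib.compress(raw)) * 8


if __name__ == "__main__":
    ids = list(range(1000, 1100))  # slowly increasing values pack well
    print("fixed width:", fixed_width_bits(ids), "bits")
    print("delta model:", delta_packed_bits(ids), "bits")
    print("zlib       :", zlib_bits(ids), "bits")
```

Numbers like these, reported per array, are the kind of input a developer would need to decide whether to pack a given array, leave it unpacked, or compress the whole blob instead.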

After discussion, it probably makes sense to split this issue into multiple ones.

mikir commented 10 months ago

Currently, the most important thing is to improve packed array diagnostics (option 1). Options 2 and 3 would require a different schema, which would introduce binary incompatibility, so they are postponed.

One possible solution could be to implement a special JSON export for packed arrays using reflection, similar to the current JSON data export. For each packed array, this export would contain the uncompressed size, the compressed size, the calculated delta, and probably some more information.
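As a rough illustration of what such an export might contain, the following Python sketch builds one report entry per packed array and prints the result as JSON. All field names, array paths, and sizes are hypothetical, and the "calculated delta" is read here as the difference between the unpacked and packed size; none of this reflects an existing zserio API or output format.

```python
import json


def packed_array_entry(path, element_count, unpacked_bits, packed_bits):
    """Build one diagnostics entry; all field names are hypothetical."""
    saved_bits = unpacked_bits - packed_bits  # negative if packing made it larger
    ratio = packed_bits / unpacked_bits if unpacked_bits else 1.0
    return {
        "path": path,
        "elementCount": element_count,
        "unpackedBitSize": unpacked_bits,
        "packedBitSize": packed_bits,
        "savedBits": saved_bits,
        "ratio": round(ratio, 3),
    }


# Made-up sizes for two arrays; a real tool would measure them from the data.
report = [
    packed_array_entry("tile.nodes.ids", 100, 3200, 230),
    packed_array_entry("tile.nodes.flags", 8, 64, 71),
]

if __name__ == "__main__":
    print(json.dumps(report, indent=2))
```

The second entry has a negative `savedBits` value, which is exactly the case diagnostics should surface: an array where packing costs more space than it saves.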

Another solution could be to implement a new dedicated command-line application that inspects the data and prints diagnostic information about packed arrays to the screen.