Closed filipecosta90 closed 3 years ago
Thanks for the PR. I'll try to spend time on this sometime this week. I will try to port some of the test cases I have in the Python version that create a histogram from scratch, and verify that the encoded histograms are equivalent. Once the encoding is verified we can have a closer look at the decoding side. Clearly, decode+encode should be idempotent.
We should probably only support the V2 format; I don't think anybody is using V1 any more.
Looks good. Re-encoding a decoded histoblob that came from another implementation produces different output because the compression library implementations differ. As long as each implementation can decode the other's output properly, it should be fine.
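To illustrate the point about compression libraries, here is a minimal sketch in Go using only the standard library. It does not use the histogram code from this PR: `deflate` and `inflate` are hypothetical helpers, and the payload is an arbitrary byte pattern standing in for an encoded histogram. Two zlib compression levels stand in for two different compression implementations. The sketch assumes the blob is zlib-compressed and then base64-encoded.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"encoding/base64"
	"fmt"
	"io"
)

// deflate zlib-compresses payload at the given level and returns it as base64 text.
// (Hypothetical helper, not part of the API proposed in this PR.)
func deflate(payload []byte, level int) (string, error) {
	var buf bytes.Buffer
	w, err := zlib.NewWriterLevel(&buf, level)
	if err != nil {
		return "", err
	}
	if _, err := w.Write(payload); err != nil {
		return "", err
	}
	if err := w.Close(); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf.Bytes()), nil
}

// inflate reverses deflate: base64-decode, then zlib-decompress.
func inflate(encoded string) ([]byte, error) {
	raw, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return nil, err
	}
	r, err := zlib.NewReader(bytes.NewReader(raw))
	if err != nil {
		return nil, err
	}
	defer r.Close()
	return io.ReadAll(r)
}

func main() {
	// Arbitrary stand-in for an uncompressed histogram payload.
	payload := bytes.Repeat([]byte{0, 1, 2, 3, 0, 0, 0, 7}, 64)

	fast, err := deflate(payload, zlib.BestSpeed)
	if err != nil {
		panic(err)
	}
	best, err := deflate(payload, zlib.BestCompression)
	if err != nil {
		panic(err)
	}
	a, err := inflate(fast)
	if err != nil {
		panic(err)
	}
	b, err := inflate(best)
	if err != nil {
		panic(err)
	}

	fmt.Println("base64 strings equal:", fast == best)
	fmt.Println("decoded payloads equal:", bytes.Equal(a, b))
}
```

The base64 strings differ even though both decode to the same bytes, so a cross-implementation test is probably better off comparing the decoded contents than the encoded strings.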
This is a WIP PR to share the current status of compressed histogram V2 encoding and decoding. I wanted to pick your brain, @ahothan, on where I might be going wrong in the port.
APIs to be added
func Decode(encoded []byte) (rh *Histogram, err error)
- Decode returns a new Histogram by decoding it from a byte slice containing a base64-encoded compressed histogram representation.
func (h *Histogram) Encode(version int32) (buffer []byte, err error)
- Encode returns a snapshot view of the Histogram. The snapshot is a compact binary representation of the state of the histogram. It is intended to be used for archival or transmission to other systems for further analysis.
quick notes
Please note that:
current issue
The current issue to be discussed is that, taking encoded histograms from the JS, Python, etc. implementations as examples, we're able to decode them, but if we encode them again the resulting base64 output is different from the original. Here is a good example of it: