bblfsh / sdk

Babelfish driver SDK
GNU General Public License v3.0

Support persistent UAST hashes #410

Open dennwc opened 5 years ago

dennwc commented 5 years ago

The SDK currently defines the UAST hash as opaque: clients cannot recompute it, and the hash might change between SDK versions. This makes it hard to persist those hash values.

We may need to define a hash that is stable between versions and well-defined. It may be a good idea to add (stable) hashing support to the binary UAST serialization.
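As a rough illustration of what "well-defined" could mean here, the sketch below hashes a generic tree so that any client can recompute the same value. It is not the SDK's hashing code: the `StableHash` and `writeNode` names, the type tags, the choice of SHA-256, and the use of plain Go maps and slices as a stand-in for UAST nodes are all assumptions made for this example.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"hash"
	"sort"
)

// Hypothetical type tags. Tagging each value keeps, for example,
// the string "1" and the integer 1 from hashing to the same bytes.
const (
	tagNil byte = iota
	tagString
	tagInt
	tagBool
	tagArray
	tagObject
)

// StableHash returns a SHA-256 digest over a canonical encoding of n.
// n stands in for a UAST node and may be nil, string, int64, bool,
// []interface{} or map[string]interface{}.
func StableHash(n interface{}) [sha256.Size]byte {
	h := sha256.New()
	writeNode(h, n)
	var out [sha256.Size]byte
	copy(out[:], h.Sum(nil))
	return out
}

// writeUint64 writes v in a fixed byte order so the result does not
// depend on the platform.
func writeUint64(h hash.Hash, v uint64) {
	var buf [8]byte
	binary.BigEndian.PutUint64(buf[:], v)
	h.Write(buf[:])
}

func writeNode(h hash.Hash, n interface{}) {
	switch v := n.(type) {
	case nil:
		h.Write([]byte{tagNil})
	case string:
		// Length prefix: ["ab", "c"] and ["a", "bc"] must differ.
		h.Write([]byte{tagString})
		writeUint64(h, uint64(len(v)))
		h.Write([]byte(v))
	case int64:
		h.Write([]byte{tagInt})
		writeUint64(h, uint64(v))
	case bool:
		b := byte(0)
		if v {
			b = 1
		}
		h.Write([]byte{tagBool, b})
	case []interface{}:
		h.Write([]byte{tagArray})
		writeUint64(h, uint64(len(v)))
		for _, e := range v {
			writeNode(h, e)
		}
	case map[string]interface{}:
		// Go map iteration order is random, so keys are sorted first.
		keys := make([]string, 0, len(v))
		for k := range v {
			keys = append(keys, k)
		}
		sort.Strings(keys)
		h.Write([]byte{tagObject})
		writeUint64(h, uint64(len(keys)))
		for _, k := range keys {
			writeNode(h, k)
			writeNode(h, v[k])
		}
	default:
		panic(fmt.Sprintf("unsupported node type %T", n))
	}
}
```

The points that matter for persistence are the fixed key order, the fixed byte order, and the explicit type tags and length prefixes; any of the concrete choices above could be swapped as long as they are written down and kept stable.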

creachadair commented 5 years ago

To avoid the issue with persistent data, I think there are two possible solutions:

  1. Keep it opaque, but fix an algorithm and promise not to change it, or
  2. Include a marker in the data that can be used to version it.

To implement (2), for example, we could make the first byte of the hash an algorithm tag (either by clobbering the first byte of the hash value, or just extending the array by 1 byte).

I don't really see much point in changing the algorithm, since we want it as a fingerprint rather than for security purposes, but either approach could work.