Open dolio opened 1 year ago
They do indeed have different hashes:
2 | > blake2b_256 +1
⧩
0xs2dcf5ad733d105f557d6280a2f8202893f8219f6e2a88d06d71e2b7d35887adf
3 | > blake2b_256 1
⧩
0xs8f141eba4d9e62720169e2611ed21dcb8d03976f133ecaa66503794442a0f0c0
The Haskell runtime represents `Nat` and `Int` values as pseudo data types, with constructors that contain unboxed machine integers but refer to builtin `Reference` values, unlike real data types, which always contain hashes. This decision has influenced the serialization and hashing of runtime values, which (I believe) distinguish between these two sorts of values.

However, in the Scheme implementation, it's significantly more efficient to represent these values as plain Scheme numbers. This leaves it unclear which values are supposed to be positive `Int`s vs. `Nat`s (obviously negative `Int`s can be figured out). Many aspects of the runtime don't really care about keeping track of the distinction between these sorts of values, either. For instance, `Nat` and `Int` built-ins don't really need to check whether their arguments are represented in exactly the expected way, because they're just operating on the unboxed data.

So, it seems like the problem with the Scheme representation is just that decisions have been made based on the particular way the Haskell interpreter represents things. There doesn't really seem to be a fundamental problem with, for instance, an `Int` value having the same hash as the corresponding `Nat` value. It just doesn't right now, and the Scheme implementation would need to meet that specification.

So, instead, we should probably revise the specification of hashing/serialization so that it does not mandate exactly the representation that the Haskell interpreter uses. Otherwise, other runtimes must carry the same type information, which can have significant costs (at least, absent optimizations like worker/wrapper that allow for locally omitting the information).
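
To make the difference concrete, here is a rough Python sketch of the two hashing schemes discussed above. It is only an illustration: the tag bytes, the 9-byte signed encoding, and the function names `hash_tagged` and `hash_by_value` are all made up for this example and are not the actual Unison serialization format. The only real API used is `hashlib.blake2b` with a 32-byte digest.

```python
import hashlib

# Hypothetical type tags. The real runtime refers to builtin Reference
# values; these byte strings are stand-ins for illustration only.
NAT_TAG = b"Nat"
INT_TAG = b"Int"


def encode_number(n: int) -> bytes:
    """Encode the mathematical value of n (assumed encoding, not Unison's).

    Nine signed bytes cover both the 64-bit Int range and the 64-bit Nat
    range, so -1 and 2**64 - 1 get different encodings.
    """
    return n.to_bytes(9, "big", signed=True)


def hash_tagged(tag: bytes, n: int) -> str:
    """Representation-dependent scheme: the type tag is part of the input."""
    data = tag + b"\x00" + encode_number(n)
    return hashlib.blake2b(data, digest_size=32).hexdigest()


def hash_by_value(n: int) -> str:
    """Representation-independent scheme: only the number is hashed."""
    return hashlib.blake2b(encode_number(n), digest_size=32).hexdigest()


if __name__ == "__main__":
    # With tags, Nat 1 and Int +1 hash differently, as in the transcript.
    print(hash_tagged(NAT_TAG, 1) != hash_tagged(INT_TAG, 1))
    # Without tags, a runtime that only has a plain number can still
    # compute the hash, and negative Ints stay distinct from large Nats.
    print(hash_by_value(-1) != hash_by_value(2**64 - 1))
```

Under the second scheme a runtime never needs to know whether a positive number was an `Int` or a `Nat` to hash it, which is the property the proposed specification change is after.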