Open zslayton opened 4 years ago
I get the impression that the issue of inconsistent valid encodings also exists for positive zero.
The spec for binary decimals gives
The subfield [coefficient] should not be present (that is, it has zero length) when the coefficient’s value is (positive) zero.
If the value is 0. (aka 0d0) then L is zero, there are no length or representation fields, and the entire value is encoded as the single byte 0x50.
Since the first is a should rather than a must guidance, writers are permitted to write positive zero with or without a coefficient
subfield.
0d0
. The approach adopted to resolve https://github.com/amzn/ion-docs/issues/93 suggests that perhaps this also was not intended to be read as a must guidance.The binary documentation makes this explicit later on in the Timestamps section, and even uses 0d0
as the example of a decimal value being valid with implicit or explicit zero coefficient.
The fraction_exponent and fraction_coefficient denote the fractional seconds of the timestamp as a decimal value... A missing coefficient defaults to zero... The following hex encoded timestamps are equivalent: ...
69 80 0F D0 81 81 80 80 80 80 // The same instant with 0d0 fractional seconds and implicit zero coefficient 6A 80 0F D0 81 81 80 80 80 80 00 // The same instant with 0d0 fractional seconds and explicit zero coefficient
The Ion-Hash decimal spec indicates that the coefficient
subfield is required to not be present for positive zero values.
The coefficient subfield shall not be present (that is, it has zero length) when the coefficient’s value is (positive) zero.
And requires the special compact encoding of 0d0
.
If the value is 0. (aka 0d0) there is no representation subfield.
While we can/should decide on a canonical format for our own libraries to emit, we should also clarify in the spec that writers MUST emit that same format if that's required to enable ion-hash.
Notably, changing to MUST guidance was not the approach taken for the substantially similar issue https://github.com/amzn/ion-docs/issues/93.
It does not seem here that changes to the Ion specification are necessary to enable ion-hash. For both positive and negative zero, the issue of inconsistent valid representations could be resolved by requiring conversion to a particular canonical form before hashing, in the same way that the ion-hash spec resolves the issue of padded UInt/Int/VarUInt/VarInt
fields.
Any UInt, Int, VarUInt, or VarInt fields are encoded in the manner defined in the Amazon Ion Binary Encoding specification with one difference: such fields shall encoded in minimal fashion – no padding is allowed.
the issue of inconsistent valid representations could be resolved by requiring conversion to a particular canonical form before hashing, in the same way that the ion-hash spec resolves the issue of padded UInt/Int/VarUInt/VarInt fields.
This sounds right to me.
Originally from: https://github.com/amzn/ion-js/issues/474
Surfacing this here since we haven't discussed it recently and are adding more implementations of
ion-hash
.While
ion-js
's output does differ fromion-java
's, both libraries are writing valid representations of equivalent Ion values. See this example from the binary format section of the Decimals spec:While we can/should decide on a canonical format for our own libraries to emit, we should also clarify in the spec that writers MUST emit that same format if that's required to enable
ion-hash
.In the meantime, I'll look into adding special cases for
ion-js
.Originally posted by @zslayton in https://github.com/amzn/ion-js/issues/474#issuecomment-546487916