Closed joseluis closed 1 year ago
// FIXME: but it seems displaced -15, +15 integral units.
I think this is the difference between 0.000000000000000×10^−383 and 0000000000000000×10^−398, and between 9.999999999999999×10^384 and 9999999999999999×10^369. Since the decimal encodes an integer, the actual range of the exponent encoded is reduced by the number of fractional digits. You'll find most functions in this library refer to the _integer exponent_, which is this shifted value.
`dec::Decimal64::from_str("10e-383").unwrap()`

This example encodes fine as `1.0e-382`, but this one:

`dec::Decimal64::from_str("10e384").unwrap()`

becomes infinity.
I'm mostly puzzled about the differences with the dec library... our binary representation varies wildly from theirs. Doesn't that mean there's something that needs to be fixed or at least better understood / documented?
Also, do you know what the constants `MIN_10_EXP`, `MAX_10_EXP`, `MIN_EXP`, `MAX_EXP` should be set to in our case then?
Hmm, yeh, on little-endian architectures the binary representation of this library and `dec` should agree, unless `dec` is rounding, which this library doesn't do. I think `dec` encodes in arrays of 32-bit integers instead of bytes, but as far as I know it doesn't actually specify any particular binary encoding.

Have you got any examples handy of where they don't agree? From the ones above, the only differences I can see are where `dec` is rounding to infinity (which is the all-zeros-except-the-last-byte-is-`120` cases).
> Also, do you know what the constants `MIN_10_EXP`, `MAX_10_EXP`, `MIN_EXP`, `MAX_EXP` should be set to in our case then?
The max exponent you can write in a string depends on the number of fractional digits, so I'm not sure we can actually define these as useful constants. The minimum and maximum exponent values that can actually be encoded in a decimal of a given precision are:

- `MIN_EXP`: `(1 - emax) - precision_digits + 1`
- `MAX_EXP`: `emax - precision_digits + 1`

But since the exponent is adjusted based on fractional digits, if you have a decimal point then the exponent you see can be larger, by up to `precision_digits - 1`.
Say we're encoding into a `decimal64`. The following values are equivalent:

- `10e369`
- `1.0e370`

but this will overflow:

- `10e370`
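That overflow rule can be sketched numerically. This is a hedged sketch, not this library's actual API: the constants and the `fits_without_clamping` helper are hypothetical names, derived from the decimal64 parameters discussed above (emax = 384, 16 precision digits).

```rust
// Hypothetical constants, derived from the IEEE 754 decimal64 parameters.
const EMAX: i32 = 384;
const PRECISION_DIGITS: i32 = 16;
const MIN_EXP: i32 = (1 - EMAX) - PRECISION_DIGITS + 1; // -398
const MAX_EXP: i32 = EMAX - PRECISION_DIGITS + 1; // 369

/// Whether an integer significand with `digits` digits and integer
/// exponent `exp` can be encoded without clamping or rounding
/// (hypothetical helper, for illustration only).
fn fits_without_clamping(digits: i32, exp: i32) -> bool {
    digits <= PRECISION_DIGITS && MIN_EXP <= exp && exp <= MAX_EXP
}

fn main() {
    assert_eq!(MIN_EXP, -398);
    assert_eq!(MAX_EXP, 369);
    assert!(fits_without_clamping(2, 369)); // 10e369: fits
    assert!(!fits_without_clamping(2, 370)); // 10e370: overflows here
    // 1.0e370 has one fractional digit, so its integer exponent is
    // 370 - 1 = 369, which fits:
    assert!(fits_without_clamping(2, 369));
    println!("ok");
}
```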
I'm not much of an expert on this stuff either, so may be entirely mistaken here, but that's my understanding of it all 🙂
Yes, I think that's why `MIN_10_EXP` and `MAX_10_EXP` are defined as the minimum and maximum normalized exponents base 10, so currently for `Bitstring64` they should be set to -398 and +369, respectively.
I don't think the binary exponents make sense in the case of decimal numbers, so we can leave them out.
What I was mostly wondering is whether the library should ideally support the documented exponent range of 10e-383 (or 10e-382) and 10e384 (or close enough)... But if that's too complicated or too much work I can just create these constants set to the current limits. What do you think?
I think we should, and do, support the range -383 to 384 when encoding using scientific notation in the same way as `dec`, so I don't think there's anything we actually need to change in the source. Perhaps we set `MIN_10_EXP` and `MAX_10_EXP` to the normalized range, so -398 to +369, and in our library docs note that the exponent range changes when the number is fractional, basically the same way Wikipedia describes it. I'd be happy to take a crack at writing that.

What do you think?
Ah, I just caught up on what `MIN_10_EXP` and `MAX_10_EXP` are 🙂 They're the base-10 exponents, not the exponent range for `N` in `10eN`.

I think we could simply ignore them?
We could ignore them, but aren't they equivalent to the ones for f32/f64? They can be useful for making sure exponents stay within the supported range while working with them, or for clamping them...

I'm still learning about decimals and floating point, so I'm not feeling super confident about these topics, but by default it makes the most sense to me to mimic what the Rust standard library is doing.
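For comparison, here's how the analogous constants behave for the built-in `f64`. The `f64::MIN_10_EXP` / `f64::MAX_10_EXP` constants are real `std` items; the clamping helper is just a sketch of one way such constants get used, not anything from this library:

```rust
// f64::MIN_10_EXP / f64::MAX_10_EXP are the minimum/maximum base-10
// exponents of *normal* f64 values. A Decimal64 type could mirror them
// (with -398 / 369, or -383 / 384, depending on which range is chosen).
fn clamp_exp_f64(exp: i32) -> i32 {
    // Sketch: keep an exponent within the supported base-10 range.
    exp.clamp(f64::MIN_10_EXP, f64::MAX_10_EXP)
}

fn main() {
    assert_eq!(f64::MIN_10_EXP, -307);
    assert_eq!(f64::MAX_10_EXP, 308);
    assert_eq!(clamp_exp_f64(-400), -307);
    assert_eq!(clamp_exp_f64(100), 100);
    println!("ok");
}
```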
It took me a while but I'm finally grokking how both complementary exponent ranges represent the same range of numbers, but using a different significand.
I think we should set them to -398 and +369, because these values are consistent with the behavior of the actual implementation of the `from_str` function, which is where they'd be used.
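The two complementary ranges convert into each other via the number of significand digits. A minimal sketch of that shift (the function name is hypothetical, not part of either library):

```rust
/// Convert a scientific-notation exponent (d.ddd… × 10^e, one digit
/// before the point) into the integer-significand exponent
/// (ddd… × 10^(e - (digits - 1))). Hypothetical helper for illustration.
fn to_integer_exponent(sci_exp: i32, digits: i32) -> i32 {
    sci_exp - (digits - 1)
}

fn main() {
    // 9.999999999999999e384 == 9999999999999999e369 (16 digits)
    assert_eq!(to_integer_exponent(384, 16), 369);
    // 1.000000000000000e-383 == 1000000000000000e-398 (16 digits)
    assert_eq!(to_integer_exponent(-383, 16), -398);
    println!("ok");
}
```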
> Have you got any examples handy of where they don't agree? From the ones above the only differences I can see are where `dec` is rounding to infinity (which is the all-zeros except the last byte is `120` cases).
You are right. I've made more checks and they agree except when rounding to infinity, so there's no issue.
There seems to be a displacement between the accepted min/max exponents in the spec and the ones implemented.
I've made an example to showcase it and compare it with `dec`:

its output: