basvandijk / scientific

Arbitrary-precision floating-point numbers represented using scientific notation
BSD 3-Clause "New" or "Revised" License
73 stars 40 forks source link

base10Exponent does not respect equality. #63

Closed ksvanhorn closed 6 years ago

ksvanhorn commented 6 years ago

Two objects that are == should be functionally indistinguishable -- that's what equality MEANS. Yet Scientific values do not obey this rule:

ghci> x = scientific 100 3
ghci> y = scientific 1 5
ghci> x == y
True
ghci> base10Exponent x
3
ghci> base10Exponent y
5
basvandijk commented 6 years ago

But if x /= y then either x < y or x > y which both don't make sense.

For this reason equality of Scientifics is defined on the numeric value of the Scientific and not on the textual representation.

ksvanhorn commented 6 years ago

The problem isn't == itself; the problem is base10Exponent. Either base10Exponent and coefficient need to first normalize their argument, so that they always returns == values for == inputs, or all Scientific values need to be kept in normalized form.

ksvanhorn commented 6 years ago

To explain further: the users of this package shouldn't have to worry about the internal representation you use. That should be abstracted away.

Hmm, come to think of it, the user really wants to think of these numbers as having the form x * 10^e, where 1 <= |x| < 2. Perhaps base10Exponent and coefficient should be replaced in the public interface by a function or pair of functions returning x and e. (BTW, the usual term for x would be the mantissa.)

basvandijk commented 6 years ago

I see your point and I'll agree it would be better if coefficient and base10Exponent return normalized values.

I don't want to normalize Scientific values at construction because it's not always needed; One of the most common uses of scientific is converting Scientific to Doubles (i.e. toRealFloat) which doesn't require normalization.

Although I have to admit I don't know how expensive normalization really is. It's probably a good idea to do a benchmark to see the difference.

basvandijk commented 6 years ago

Another idea I have is to normalize Scientifics on construction by default but also provide the unsafe smart constructor:

unsafeDoNotNormalizeScientific :: Integer -> Int -> Scientific
unsafeDoNotNormalizeScientific = Scientific

which can be used when you know the input has been normalized already.

Then we can change the parsers (scientificP and the one in attoparsec) to handle trailing '0' digits directly which is probably cheaper than normalizing the resulting Scientific.

phadej commented 3 years ago

i'll prepare a release soon but leave this out for now. I feel this is too big change to sneak in into minor version. I'm also not 100% sure normalisation on construction is good for cases where scientific is used.

I have to think if moving base10exponent and coefficient into Unsafe module is less complicated (In current version we don't maintain any invariant of Scientific which makes things simpler to reason about).

Note that e.g. case-insensitive has original, unordered-containers has toList, these are IMO pragmatic violations of Eq substantivity. (EDIT: though I think they belong in .Unsafe module).