dfinity / motoko-base

The Motoko base library
Apache License 2.0

Float.toText only displays 6 fractional digits - we probably want a representation that would roundtrip as a literal. #88

Open crusso opened 4 years ago

crusso commented 4 years ago

see rts/float.c

crusso commented 4 years ago

Using the specifier "%.16g" instead of "%f" yields a better result, but I don't know if that's good enough.

rts/float.c

export as_ptr float_fmt(double a) {
  extern int snprintf(char *__restrict, size_t, const char *__restrict, ...);
  char buf[50]; // corresponds to 150 bits of pure floatness, room for 64 bits needed
  //  const int chars = snprintf(buf, sizeof buf, "%f", a);
  const int chars = snprintf(buf, sizeof buf, "%.17g", a);
  return text_of_ptr_size(buf, chars);
}

crusso commented 4 years ago

@rossberg any advice here?

rossberg commented 4 years ago

Yes, %.17g is enough to roundtrip, I've used that in the Wasm interpreter, and it works for all the many corner cases in the test suite that exercise bit-precise rounding behaviour (at least OCaml's implementation of the formatting, which I assume is inherited from C).

However, in practice, users may not want to see umpteen zeroes all the time. We'll have to add some richer formatting functions; toText should probably default to %f.

There's also %h, of course.
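To make the round-trip claim concrete, here is a minimal C sketch (not taken from the RTS; the helper names are made up for illustration). It formats a double with "%f", "%.16g" and "%.17g", parses the text back with strtod, and compares the result bit for bit with the original value.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if `text` parses (via strtod) to exactly the same bits as x. */
static int parses_back_to(const char *text, double x) {
  const double y = strtod(text, NULL);
  return memcmp(&x, &y, sizeof x) == 0;
}

/* Format x with "%f" (6 fractional digits) and check the round trip. */
int roundtrips_f(double x) {
  char buf[400]; /* "%f" of the largest double needs over 300 characters */
  const int n = snprintf(buf, sizeof buf, "%f", x);
  assert(n > 0 && (size_t)n < sizeof buf);
  return parses_back_to(buf, x);
}

/* Format x with "%.16g" (16 significant digits) and check the round trip. */
int roundtrips_16g(double x) {
  char buf[50];
  const int n = snprintf(buf, sizeof buf, "%.16g", x);
  assert(n > 0 && (size_t)n < sizeof buf);
  return parses_back_to(buf, x);
}

/* Format x with "%.17g" (17 significant digits) and check the round trip. */
int roundtrips_17g(double x) {
  char buf[50];
  const int n = snprintf(buf, sizeof buf, "%.17g", x);
  assert(n > 0 && (size_t)n < sizeof buf);
  return parses_back_to(buf, x);
}
```

For the value 0.30000000000000004 (the double nearest to 0.1 + 0.2), roundtrips_f and roundtrips_16g return 0, because both print text that parses back to the neighbouring double 0.3, while roundtrips_17g returns 1.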

ggreif commented 4 years ago

Something like toText(f : Float, prec : ?Nat8) would be easily realisable (in the compiler; for the interpreter it is probably hairier).

rossberg commented 4 years ago

@ggreif, I think it should be a separate function from toText, though. But then we'd probably want something like SML's realfmt type, maybe extended with a hex case for %h.

In the interpreter that ought to be trivial (using OCaml's Printf).

ggreif commented 4 years ago

@rossberg In OCaml we can't build the format string dynamically, can we?

# let fmt = "%.9g";;
val fmt : string = "%.9g"
# Printf.sprintf fmt 3.14159;;
Error: This expression has type string but an expression was expected of type
         ('a -> 'b, unit, string) format =
           ('a -> 'b, unit, string, string, string, string)
           CamlinternalFormatBasics.format6

rossberg commented 4 years ago

Following SML's basis library, here is what I propose: a function Float.toFormattedText : (Float, Format) -> Text, where the type Float.Format is defined as follows:

type Format = {
  #fix : Nat;  // like "%.*f"
  #exp : Nat;  // like "%.*e"
  #gen : Nat;  // like "%.*g"
  #hex : Nat;  // like "%.*h" a.k.a. "%.*a"
  #exact;  // like "%.17g"
}

(Making precision non-optional, since that doesn't buy you much in this form of description.)

rossberg commented 4 years ago

@ggreif, no, but a simple switch over the 4 or 5 formats is enough. The precision can be turned into a dynamic argument by using *.

ggreif commented 4 years ago

That's what I mean by "hairier".

rossberg commented 4 years ago

How is that "hairy"?

ggreif commented 4 years ago

It has more hair :-)

ggreif commented 4 years ago

But we'll still need some supporting primitives for this to fly, right? (I'll come up with a PR in dfinity/motoko soon.)

rossberg commented 4 years ago

Perhaps just extend float_fmt with a mode and a precision arg?
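A rough sketch of what such an extension could look like in C, assuming one mode tag per case of the proposed Format variant. The enum, the function name float_fmt_mode and the caller-supplied buffer are all hypothetical; the real float_fmt returns an as_ptr via text_of_ptr_size, which is left out here so the sketch compiles on its own. The `*` in each format string tells snprintf to take the precision from the next int argument, so no format string has to be built at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical mode tags, one per case of the proposed Format variant. */
enum float_mode { FMT_FIX, FMT_EXP, FMT_GEN, FMT_HEX, FMT_EXACT };

/* Sketch of a float_fmt that takes a mode and a precision.
   Writes at most `size` bytes (including the terminating NUL) into `buf`
   and returns snprintf's result: the length the full output would have.
   A return value >= size therefore means the output was truncated. */
int float_fmt_mode(char *buf, size_t size, enum float_mode mode, int prec,
                   double a) {
  /* A negative precision would be treated by printf as if it were omitted,
     so the caller is expected to range-check before calling. */
  assert(prec >= 0);
  switch (mode) {
    case FMT_FIX:   return snprintf(buf, size, "%.*f", prec, a);
    case FMT_EXP:   return snprintf(buf, size, "%.*e", prec, a);
    case FMT_GEN:   return snprintf(buf, size, "%.*g", prec, a);
    case FMT_HEX:   return snprintf(buf, size, "%.*a", prec, a);
    case FMT_EXACT: return snprintf(buf, size, "%.17g", a); /* prec unused */
  }
  return -1; /* unknown mode */
}
```

With a sufficiently large buffer, float_fmt_mode(buf, sizeof buf, FMT_FIX, 2, 3.14159) writes "3.14", and FMT_EXACT applied to 0.1 writes "0.10000000000000001". Checking the return value against the buffer size matters most for FMT_FIX, where large magnitudes can need several hundred characters.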

crusso commented 4 years ago

Great, thanks guys. I'll leave that in your hands then @ggreif.

ggreif commented 4 years ago

I totally misunderstood what "%.*f" means. I thought that the * is purely meta-notation. Another thing learned...

rossberg commented 4 years ago

Now I see why you considered it hairy. :)

ggreif commented 4 years ago

> Now I see why you considered it hairy. :)

Precisely.

ehoogerbeets commented 3 years ago

Could we skip toFormattedText entirely and instead make it very clear in the documentation that a simple toText is not meant for displaying values to users? Formatting floating point numbers for human consumption is locale-dependent: different cultures use different thousands separators, digit groupings, decimal separators, rounding rules, etc. A separate number formatter class that takes a locale would be ideal and would keep floats and natural numbers clean and small.