NixOS / nix

Nix, the purely functional package manager
https://nixos.org/
GNU Lesser General Public License v2.1
12.12k stars 1.47k forks source link

toJSON truncates floats #5733

Open pennae opened 2 years ago

pennae commented 2 years ago

Describe the bug

toJSON unnecessarily truncates floats to six significant figures.

Steps To Reproduce

nix-repl> builtins.toJSON 1.000001
"1"

nix-repl> builtins.toJSON 1048576.1
"1.04858e+06"

Expected behavior

nix-repl> builtins.toJSON 1.000001
"1.000001"

nix-repl> builtins.toJSON 1048576.1
"1048576.1"

nix-env --version output

tested 2.4 and 2.5pre20211126_55275fc

Additional context

found while investigating https://github.com/NixOS/nixpkgs/issues/148824

mopleen commented 2 years ago

This is the place https://github.com/NixOS/nix/blob/master/src/libutil/json.cc#L34 Could be fixed like this str << std::setprecision(9) << n; but maybe it's better to use fmt or boost formatting to output floating point numbers 🤔

pennae commented 2 years ago

this also affect toString in a different way. toString always produces six fractional digits, rounding small floats like 1e-9 to 0.

flokli commented 1 year ago

This can't be changed, because it would change hashes across Nix versions.

srhb commented 1 year ago

According to my bisect, this was "fixed" in https://github.com/NixOS/nix/pull/7313 despite the hash changes. We had some scary math issues as a result.

roberth commented 1 year ago

This can't be changed, because it would change hashes across Nix versions.

I'd like a more configurable toJSON2 primop that allows the precision to be extended, or optionally disables the collapsing of "outPath" fields.

flokli commented 1 year ago

Displaying floats in decimal is apparently pretty complicated science, and there's multiple competing algorithms to come up with the representation, and even more ways to format it.

It's not entirely well-defined how Nix does it either. If you're curious, check the commit message of https://cs.tvl.fyi/depot/-/commit/1facd889bba724cf20ea14422ee1e57440b3e761, which describes the challenges while trying to mimic Nix as much as possible.

I personally think supporting floats to be printed as strings (and by this, allowing that representation to be leaked into output hashing) was a mistake.

I don't think we should add more builtins treating this differently, or allowing to configure the behavior, but rather discourage the usage of floats in output alltogether. We can't remove it from the language, (even though nixpkgs uses it very little), but we at least could print warnings whenever someone converts a float to string, so they know they produce output that might differ across implementations / architectures / compiler versions.

roberth commented 1 year ago

I agree. Until we explicitly formalize and test the semantics of non-integral numbers in the Nix language, the current implementation ("whatever some platform code does") should be avoided. Such a project should also include the removal of native floating point operations

Nixpkgs/NixOS could sidestep the Nix floats by postprocessing something like mkFloat = value: { _type = "float"; inherit value;} etc. It already postprocesses the toJSON output in a derivation (in pkgs.formats), so that seems within reason. That also solves the precision limitations, as long as users don't need to compute with floats.

andersk commented 1 year ago

Some prior art for deterministically specifying float ↔ string conversions can be found in the ECMAScript specification: Number::toString, StringNumericValue. For historical reasons, that specification has two points where an implementation-defined choice can be made:

The Ryū, Grisu-Exact, and Dragonbox algorithms all satisfy this stricter deterministic specification; their difference is in performance and not observable rounding behavior.