processing / p5.js

p5.js is a client-side JS platform that empowers artists, designers, students, and anyone to learn to code and express themselves creatively on the web. It is based on the core principles of Processing. http://twitter.com/p5xjs
http://p5js.org/
GNU Lesser General Public License v2.1

nf() produces problematic string-formatting of very large or small numbers #5710

Open golanlevin opened 2 years ago

golanlevin commented 2 years ago

Most appropriate sub-area of p5.js?

p5.js version

1.4.1

Web browser and version

Google Chrome 103.0.5060.53 (Official Build) (arm64)

Operating System

MacOS 12.4

Steps to reproduce this

Steps:

  1. Extremely large or small numbers, when formatted with nf(), lose all exponent information. This could be misleading, confusing, or problematic.
  2. For example, I was displaying a variable x which I knew was between 0 and 1, using nf(x,1,2);. Unbeknownst to me, the number was something like 0.000000000001234, for which I would have expected to see something on the screen like 0.00 (because of rounding) — but instead I saw 1.23, which was very confusing. I was concerned that my variable had somehow gone outside the range of 0-1!

Snippet:

Here's some code:


  var smallNum = 1.0 / 700000000.0;
  print("A small number: " + smallNum);
  print("A confusingly-formatted small number: " + nf(smallNum,1,2));

  var bigNum = pow(5.0, 123.456);
  print("A big number: " + bigNum); 
  print("A confusingly-formatted big number: " + nf(bigNum,1,2));

This produces the following output:

A small number: 1.4285714285714286e-9 
A confusingly-formatted small number: 1.42 
A big number: 1.9590289562203644e+86 
A confusingly-formatted big number: 1.95 

Recommendation:

Some possible fixes are:

In the examples above, I feel the desired result should be the strings 1.42e-9 and 1.95e+86.

Qianqianye commented 2 years ago

Thanks @golanlevin! Adding the Math and Utilities Stewards @limzykenneth, @jeffawang, @AdilRabbani, and @kungfuchicken to take a look at this issue.

jeffawang commented 2 years ago

@golanlevin, you said you'd "...expected to see something on the screen like 0.00...," and this would have been my expectation too. My opinion is that changing nf() to behave that way would be a good fix.

For small numbers, when we use nf() and specify a number of digits to the right, we already opt in to rounding and loss of precision. Switching to a different behavior at some browser-chosen point seems confusing, and if the user wanted the output of nf() to fit within a specific container or number of characters, then that could break.

Here's an example where we show that we've already had a loss of precision via rounding for exponents -3 to -6.

  for (let i = 0; i < 10; i++) {
    const n = pow(10, -i);
    console.log(i, nf(n, 1, 2));
  }

0 "1.00" 
1 "0.10" 
2 "0.01" 
3 "0.00" 
4 "0.00" 
5 "0.00" 
6 "0.00" 
7 "1e-7.00" 
8 "1e-8.00" 
9 "1e-9.00" 

Very large numbers might warrant a separate discussion. My browser switches to scientific notation at 1e+21.00, which is quite large. That number is larger than Number.MAX_SAFE_INTEGER, which means that precision has been lost already due to the limitations of floats.

I think the current behavior of just showing the scientific notation without the exponent should certainly change though. Maybe it's okay in this case to show the exponent part, since the number is already at least 21 characters long.

@golanlevin, beyond fixing nf(), you'd also mentioned adding or changing other functions. Thanks for coming up with some solutions there! I don't have a fully formed opinion on it yet. On the one hand, it could be nice to have a function that can format with exponents. On the other hand, we already have multiple non-composable functions, and trying to support everything seems like it would result in a combinatorial mess of functions. Either way, I believe that fixing nf first is a higher priority.

jeffawang commented 2 years ago

@Qianqianye I'd like to add the discussion label to this issue, since it seems to have some judgment calls in the decision-making. I don't have permission to do it though.

Qianqianye commented 2 years ago

Thanks @jeffawang. Just added the discussion label. We are working on optimizing the labeling process, and we will make sure the stewards can add labels in the future.

golanlevin commented 2 years ago

Thanks @jeffawang.

The behavior of the nf() function is stranger than I expected. Here's some code, which takes negative powers of TWO_PI:

function setup() {
  for (let i = 0; i < 12; i++) {
    var n = pow(TWO_PI, -i);
    console.log(i +"\t" + nf(n, 1, 5));
  }
}

...and here's what it prints out. Check out the unexpected behavior in line 8!

0   1.00000 
1   0.15915 
2   0.02533 
3   0.00403 
4   0.00064 
5   0.00010 
6   0.00001 
7   0.00000 
8   4.11681 
9   6.55211 
10  1.04280 
11  1.65966 

jeffawang commented 2 years ago

Ah yes, I think that is similar behavior, in that the change happens when JavaScript switches to scientific notation. I believe mine had the e in it because there was no decimal point in my whole numbers, whereas yours does have a decimal point. I think it's reasonable to expect line 8 and beyond in your example to be 0.00000.
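
To see the switch in isolation, here is a small plain-JavaScript sketch (no p5.js required). The helper name `splitOnDot` is hypothetical, and splitting on "." is only an assumed stand-in for how a string-based formatter might separate integer and fractional digits; it is not taken from the p5.js source.

```javascript
// Hypothetical helper: stringify a number, then split at the decimal point.
function splitOnDot(n) {
  return String(n).split(".");
}

console.log(String(1e-6)); // "0.000001" (plain decimal notation)
console.log(String(1e-7)); // "1e-7" (exponent notation starts below 1e-6)
console.log(String(1e21)); // "1e+21" (and again at 1e21 and above)

// There is no "." in "1e-7", so the exponent stays in the "integer" part.
console.log(splitOnDot(1e-7)); // [ "1e-7" ]

// With a "." present, the exponent trails the fractional digits and is
// lost if only the first few of those digits are kept.
console.log(splitOnDot(1.4285714285714286e-9)); // [ "1", "4285714285714286e-9" ]
```

Under that assumption, padding "1e-7" with two fractional zeros would give 1e-7.00, and keeping only two fractional digits of the second result would give 1.42, which is consistent with the outputs reported above.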

limzykenneth commented 2 years ago

I found this issue from long ago, #1879, where the lack of support for scientific notation was noted. The nf series of functions has been the source of many problems before, and incidentally it is also one of my least used functions, mainly because I don't know what I would use it for.

I think a clear expectation of what these functions should do, along with clear examples, can help us understand how to implement and fix them for the long term.

golanlevin commented 2 years ago

"mainly because I don't know what I would use it for"

Hi @limzykenneth, I am very glad to discuss the nf() function — it is one of my favorite functions, and I use it routinely in my work. The purpose of my reply here is to demonstrate three different things I use it for, to help illustrate why these functions deserve attention.

1. Debugging!

For me, nf() is a crucial debugging tool. When creating real-time interactive art, I frequently print numbers to the console for debugging purposes. Printing these numbers without nf() produces distracting formatting issues (such as varying indentation) that can make it very difficult to compare and assess trends properly. Here's a simple example: a program that prints the current time and a random number:

function setup() {
  createCanvas(400, 400);
  randomSeed(5);
}

function draw() {
  var m = millis();
  var x = random(width);
  print(m + ", "  + x); 
}

This produces the following output at the console:

[Screenshot: console output]

A simple change makes it much easier to 'read' the debugging results:

function setup() {
  createCanvas(400, 400);
  randomSeed(5);
}

function draw() {
  var m = millis();
  var x = random(width);
  print(nf(m,5,1) + ", "  + nf(x,3,2)); 
}

This produces:

[Screenshot: console output with nf() formatting]

Not only do the numbers line up nicely, but they now also eliminate irrelevant digits that distract from showing trends in the data. (In this sense, repairing nf() to properly handle numbers with exponentials could even be considered an accessibility or ergonomics issue.)

2. Sensible user-facing numeric displays and documents

The nf() function formats numbers in ways that make sense for user-facing displays and computationally-generated documents. Consider the following example:

var bankBalance2021 = 123.45;
var interestRate = 2.25; // Current US federal interest rate 
var bankBalance2022 = bankBalance2021 * (1.0 + interestRate/100.0);
print ("Unformatted: $" + bankBalance2022); 
print ("Bank balance: $" + nf(bankBalance2022, 1,2)); 

This produces the following results in the console — it's clear which one would be preferable to show a user/reader:

Unformatted: $126.227625 
Bank balance: $126.22 

3. Generating output files with a small memory footprint

I do a lot of generative artwork with computer-controlled plotters, which take SVG vector graphics. When computationally generating SVG files with thousands or millions of vertices, it can be desirable for the resulting files to have a small memory footprint, by reducing the precision of numbers. It simply isn't necessary (or even technically possible) to physically execute vector designs that are accurate to a trillionth of an inch!

Here's a short program fragment that generates an SVG path connecting 4 points:

print("<path fill=\"none\" stroke=\"red\" stroke-width=\"1px\" d=\""); 
beginShape();
for (var i=0; i<4; i++){
  var x = random(width); 
  var y = random(height); 
  vertex(x,y);

  if (i==0){
    print("\tM" + x + " " + y); // moveto
  } else {
    print("\tL" + x + " " + y); // lineto
  }
}
endShape(); 
print("Z\" />");

This produces the following output (which would be saved to a text/SVG file):

<path fill="none" stroke="red" stroke-width="1px" d=" 
    M94.73723107948899 183.98977555334568 
    L75.58012185618281 296.7598518356681 
    L286.80395456030965 46.89168855547905 
    L82.31999790295959 188.9366129413247 
Z" /> 

SVG units are 1/96th of an inch, so it's clear that there's an enormous amount of unnecessary precision here, and it bloats the file. Instead, using nf() to keep just 2 decimal places (still accurate to ~1/10,000th of an inch) reduces the file size immensely. This also makes a practical difference in how much time is required for the plotter drivers to process the file.
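
As a rough illustration of the saving, here is a plain-JavaScript sketch that builds the same path data at reduced precision. It uses the four points printed above and substitutes the built-in toFixed() for nf() so that it runs without p5.js; the function name `pathData` is hypothetical.

```javascript
// The four vertices from the output above, at full precision.
const points = [
  [94.73723107948899, 183.98977555334568],
  [75.58012185618281, 296.7598518356681],
  [286.80395456030965, 46.89168855547905],
  [82.31999790295959, 188.9366129413247],
];

// Hypothetical helper: build an SVG path "d" attribute, rounding every
// coordinate to the given number of decimal places.
function pathData(pts, decimals) {
  return pts
    .map(([x, y], i) => {
      const cmd = i === 0 ? "M" : "L"; // moveto first, lineto afterwards
      return `${cmd}${x.toFixed(decimals)} ${y.toFixed(decimals)}`;
    })
    .join(" ") + " Z";
}

console.log(pathData(points, 2));
// M94.74 183.99 L75.58 296.76 L286.80 46.89 L82.32 188.94 Z
```

Each coordinate shrinks from as many as 18 characters to at most 6, which adds up quickly over thousands or millions of vertices.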

limzykenneth commented 1 year ago

Circling back to this after a couple of months, I'm glad to say I've found a use for nf() in one of my projects and have also run into the problem described above. After looking into it, there seem to be a few different scenarios to address. The main source of the problem is that nf() calls toString() on the number directly, so when the number is stringified in exponent notation, that notation is carried into the output as well.

nf() with very small number

If the expectation is for something like 1.23e-20 to output 0.00 from nf(), we can use Number.prototype.toFixed(). It converts exponential notation to full decimal notation reasonably well, and we wouldn't have to parse the number manually as we do now in nf().
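
A minimal sketch of that idea in plain JavaScript follows. The function name `formatFixed` and its (n, left, right) parameters are hypothetical and only loosely modelled on nf(); this is not the p5.js implementation.

```javascript
// Hypothetical nf()-like formatter built on toFixed().
// left:  minimum number of integer digits (zero-padded)
// right: number of digits after the decimal point
function formatFixed(n, left, right) {
  // toFixed() returns plain decimal notation for |n| < 1e21.
  const [intPart, fracPart] = Math.abs(n).toFixed(right).split(".");
  const sign = n < 0 ? "-" : "";
  const padded = intPart.padStart(left, "0");
  // With right === 0 there is no fractional part to append.
  return fracPart === undefined ? sign + padded : `${sign}${padded}.${fracPart}`;
}

console.log(formatFixed(1.23e-20, 1, 2));          // "0.00"
console.log(formatFixed(1.0 / 700000000.0, 1, 2)); // "0.00"
console.log(formatFixed(3.14159, 3, 2));           // "003.14"
```

One difference to keep in mind: toFixed() rounds, whereas the outputs quoted earlier in this thread (1.42, $126.22) look truncated, so the last digit could change for some inputs.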

nf() with very large number

Number.prototype.toFixed() also works with large numbers until the value reaches 1e+21, at which point it returns the number in exponent notation as well. However, by this point the value will already be larger than Number.MAX_SAFE_INTEGER (which is only about 9e+15), and the normal behaviour of Numbers may not apply anymore. We could check whether the number being passed in is smaller than MAX_SAFE_INTEGER, print a FES message if it is not, and maybe suggest using BigInt.
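
The threshold and the proposed check can be sketched in plain JavaScript as below. The helper `isSafelyFormattable` is hypothetical, and console.warn() is only a stand-in for whatever message the FES would print.

```javascript
console.log((1e20).toFixed(2)); // "100000000000000000000.00"
console.log((1e21).toFixed(2)); // "1e+21" (exponent notation from here on)
console.log(Number.MAX_SAFE_INTEGER); // 9007199254740991

// Hypothetical guard: true when the magnitude is within the safe integer range.
function isSafelyFormattable(n) {
  return Number.isFinite(n) && Math.abs(n) <= Number.MAX_SAFE_INTEGER;
}

const big = 1.9590289562203644e86;
if (!isSafelyFormattable(big)) {
  // Stand-in for a friendlier message.
  console.warn("Value exceeds Number.MAX_SAFE_INTEGER; consider BigInt.");
}
```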

Getting nf() values for very small and very large numbers that include exponent info

For this we can probably go with @golanlevin's suggestion of implementing nfe(), so that some degree of accuracy can be preserved for these numbers.
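
One possible shape for such a function, sketched in plain JavaScript on top of the built-in toExponential(). The name `nfeSketch` and its signature are assumptions for illustration only; no nfe() design has been settled in this thread.

```javascript
// Hypothetical exponent-preserving formatter.
// right: number of digits after the decimal point of the mantissa
function nfeSketch(n, right) {
  return n.toExponential(right);
}

console.log(nfeSketch(1.4285714285714286e-9, 2)); // "1.43e-9"
console.log(nfeSketch(1.9590289562203644e86, 2)); // "1.96e+86"
```

Note that toExponential() rounds the mantissa, so this gives 1.43e-9 and 1.96e+86 rather than the truncated 1.42e-9 and 1.95e+86 mentioned at the top of the issue. Which of the two is preferable is part of the open design question.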

aditya-shrivastavv commented 1 year ago

please assign this to me, i think i can take this forward

aditya-shrivastavv commented 1 year ago

Is there something remaining I can help with to close this issue?

limzykenneth commented 1 year ago

There is only the question of nfe(), which would be a new feature. As such, I would like to leave the issue open to get more comments from the community before actually moving to implementation.

Qianqianye commented 11 months ago

Thank you all for working on this issue. I am inviting the current Math stewards to this discussion @limzykenneth, @ericnlchen, @ChihYungChang, @bsubbaraman, @albertomancia, @JazerUCSB, @tedkmburu, @perminder-17, @Obi-Engine10, @jeanetteandrews. Would love to hear what y'all think. Thanks!