FEniCS / ffcx

Next generation FEniCS Form Compiler for finite element forms
https://fenicsproject.org

Float precision more in line with reality #608

Closed · chrisrichardson closed 1 year ago

chrisrichardson commented 1 year ago

May be a fix for issue #607, @mscroggs?

chrisrichardson commented 1 year ago

np.finfo(np.float32).precision gives 6, but 8 seems to be a better number to use when formatting output. We have been using 16 for float64. Here I have put a lower bound of 8. An alternative approach would be to add 2 instead of 1 to the finfo precision.
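For reference, a minimal numpy sketch of the figures in play (assuming standard IEEE float32/float64; the 8-digit case is the lower bound proposed here):

```python
import numpy as np

# Decimal digits guaranteed to survive a round trip, per dtype
print(np.finfo(np.float32).precision)  # 6
print(np.finfo(np.float64).precision)  # 15

# Formatting a float32 value with 6 vs 8 significant digits
x = np.float32(1.0) / np.float32(3.0)
print(f"{x:.6g}")  # 0.333333
print(f"{x:.8g}")  # 0.33333334 (closer to the stored single-precision value)
```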

garth-wells commented 1 year ago

Maybe we should use a hexadecimal string? We could put the decimal version in a comment.
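A sketch of what that could look like for one table entry, using a hypothetical helper (not ffcx's actual code generator); the hex literal is valid C99 and the decimal value goes in an adjacent comment:

```python
def format_entry(x: float) -> str:
    # Exact hex float literal for the generated C, decimal value as a comment
    return f"{x.hex()} /* {x!r} */"

print(format_entry(10.1))  # 0x1.4333333333333p+3 /* 10.1 */
```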

chrisrichardson commented 1 year ago

Maybe we should ask whether we need a precision option at all? If we just always write out 17 digits, that is fine for both float and double. For long double we would need more, but then the tables themselves in numpy need to be more precise too.
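A quick check of the 17-digit claim for float64 (a minimal sketch; 17 significant decimal digits are enough to round-trip any IEEE double):

```python
x = 0.1 + 0.2         # stored as 0.30000000000000004
s = f"{x:.17g}"       # 17 significant digits
assert float(s) == x  # exact round trip for a float64
```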

michalhabera commented 1 year ago

> Maybe we should ask whether we need a precision option at all? If we just always write out 17 digits, that is fine for both float and double. For long double we would need more, but then the tables themselves in numpy need to be more precise too.

Agree. This precision option is a bit useless. You'd go for lower precision to save memory, but that is controlled by the tables' dtype anyway, so providing artificially truncated values in a double-precision array is inefficient. In the other direction, C will automatically downcast.

garth-wells commented 1 year ago

> Maybe we should ask whether we need a precision option at all? If we just always write out 17 digits, that is fine for both float and double. For long double we would need more, but then the tables themselves in numpy need to be more precise too.

> Agree. This precision option is a bit useless. You'd go for lower precision to save memory, but that is controlled by the tables' dtype anyway, so providing artificially truncated values in a double-precision array is inefficient. In the other direction, C will automatically downcast.

See https://github.com/FEniCS/ffcx/pull/608#issuecomment-1711095857 - just write hexadecimal. Minimal file size, no approximation.
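A small check of the "no approximation" point in Python: float.hex() and float.fromhex() round-trip exactly, independent of any decimal precision setting.

```python
import math

x = math.pi
s = x.hex()                   # '0x1.921fb54442d18p+1'
assert float.fromhex(s) == x  # bit-exact round trip
```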

chrisrichardson commented 1 year ago

How does this look when casting from a hex int to float/double in C? I guess it is less portable: endianness, IEEE, etc.

garth-wells commented 1 year ago

> How does this look when casting from a hex int to float/double in C? I guess it is less portable: endianness, IEEE, etc.

Hex floats are in the C99 and C++17 standards:

garth-wells commented 1 year ago

In [1]: x = 10.1

In [2]: x.hex()
Out[2]: '0x1.4333333333333p+3'

michalhabera commented 1 year ago

Sure, hex works, but why would we do it? We only need to print the numbers to the precision we know them to; it does not make much sense to print a number exactly, with many digits that are just base-2 floating-point artefacts. According to https://docs.python.org/3/tutorial/floatingpoint.html, repr() or str() should work. Or maybe better: find the precision of the number being printed (Python's float, i.e. float64, or the numpy dtype) and print it as f"{x:.{p}}" (or p + 1).
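A sketch of that alternative, with a hypothetical helper (not ffcx's actual formatting code), assuming the input is either a Python float or a numpy scalar:

```python
import numpy as np

def format_decimal(x) -> str:
    # repr() of a Python float is the shortest string that round-trips;
    # for numpy scalars, use the dtype's decimal precision plus one guard digit.
    if isinstance(x, np.floating):
        p = np.finfo(x.dtype).precision + 1
        return f"{float(x):.{p}g}"
    return repr(float(x))

print(format_decimal(0.1))              # 0.1
print(format_decimal(np.float32(0.1)))  # 0.1 (7 significant digits)
```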

garth-wells commented 1 year ago

> Sure, hex works, but why would we do it? We only need to print the numbers to the precision we know them to; it does not make much sense to print a number exactly, with many digits that are just base-2 floating-point artefacts. According to https://docs.python.org/3/tutorial/floatingpoint.html, repr() or str() should work. Or maybe better: find the precision of the number being printed (Python's float, i.e. float64, or the numpy dtype) and print it as f"{x:.{p}}" (or p + 1).

Reasons for doing it are that there would be no debate over the number of decimal digits, and we wouldn't have things like 1.00000000000000. Also, the generated code would be consistent with the corresponding Basix elements.