mpusz / mp-units

The quantities and units library for C++
https://mpusz.github.io/mp-units/
MIT License
1.09k stars 86 forks source link

Rounding error in double rounding when converting nJ to J #608

Closed psychon closed 1 month ago

psychon commented 1 month ago

Hi there,

the following program....

#include <iostream>
#include <mp-units/systems/si.h>
#include <mp-units/ostream.h>

using namespace mp_units::si::unit_symbols;

int main(int argc, char **argv) {
    mp_units::quantity<mp_units::si::nano<mp_units::si::joule>> a = 15. * W * s;
    mp_units::quantity<mp_units::si::joule> b = a;
    auto difference = a - b;
    std::cout << ((a == b) ? "equal" : "not equal");
    std::cout << " with a difference of " << difference;
    std::cout << std::endl;
    return 0;
}

...outputs with mp-units 2.2.0 (see on godbolt)%3B%0A++++std::cout+%3C%3C+%22+with+a+difference+of+%22+%3C%3C+difference%3B%0A++++std::cout+%3C%3C+std::endl%3B%0A++++return+0%3B%0A%7D'),l:'5',n:'1',o:'C%2B%2B+source+%231',t:'0')),k:48.72062663185378,l:'4',n:'0',o:'',s:0,t:'0'),(g:!((h:executor,i:(argsPanelShown:'1',compilationPanelShown:'0',compiler:clang1810,compilerName:'',compilerOutShown:'0',execArgs:'',execStdin:'',fontScale:14,fontUsePx:'0',j:1,lang:c%2B%2B,libs:!((name:mp-units,ver:'220')),options:'-std%3Dc%2B%2B23',overrides:!(),runtimeTools:!(),source:1,stdinPanelShown:'1',wrap:'1'),l:'5',n:'0',o:'Executor+x86-64+clang+18.1.0+(C%2B%2B,+Editor+%231)',t:'0')),k:51.27937336814621,l:'4',n:'0',o:'',s:0,t:'0')),l:'2',n:'0',o:'',t:'0')),version:4):

not equal with a difference of -1.90735e-06 nJ

The same program with mp-units 2.1.0 outputs zero (you have to change the second include from mp-units/systems/si.h to mp-units/systems/si/si.h) (see on godbolt)%3B%0A++++std::cout+%3C%3C+%22+with+a+difference+of+%22+%3C%3C+difference%3B%0A++++std::cout+%3C%3C+std::endl%3B%0A++++return+0%3B%0A%7D'),l:'5',n:'1',o:'C%2B%2B+source+%231',t:'0')),k:48.72062663185378,l:'4',n:'0',o:'',s:0,t:'0'),(g:!((h:executor,i:(argsPanelShown:'1',compilationPanelShown:'0',compiler:clang1810,compilerName:'',compilerOutShown:'0',execArgs:'',execStdin:'',fontScale:14,fontUsePx:'0',j:1,lang:c%2B%2B,libs:!((name:mp-units,ver:'210')),options:'-std%3Dc%2B%2B23',overrides:!(),runtimeTools:!(),source:1,stdinPanelShown:'1',wrap:'1'),l:'5',n:'0',o:'Executor+x86-64+clang+18.1.0+(C%2B%2B,+Editor+%231)',t:'0')),k:51.27937336814621,l:'4',n:'0',o:'',s:0,t:'0')),l:'2',n:'0',o:'',t:'0')),version:4):

equal with a difference of 0 nJ

A git bisect pointed to https://github.com/mpusz/mp-units/commit/840b0bd54e05bec9423fbdb539cac2eda13284c4 and specifically the if constexpr (std::is_floating_point_v<multiplier_type>) in there. Turning that into if constexpr (false) gets rid of the rounding issue (because it effectively reverts the commit).

Thus, this is down to floating point math not being associative, i.e. value * val(num) / val(den) * val(irr) vs value * (val(num) / val(den) * val(irr)). The pre-computed constant introduces a tiny rounding difference.

Now, my first question is: Am I just "holding it wrong" or am I right in assuming that there should be no rounding error here?

My second question would be what to do about this. A random idea (if you want to keep this optimisation to a single floating point op) would be something like if (num > den) result = value * (num / den); else result = value / (den / num);, i.e. generating a division instead of a multiplication when the ratio is smaller than one. In my specific case: Instead of multiplying with 1e-9, the code would divide by 1e9. If this seems worthwhile, I might give this idea a proper try, but first I want to give you a chance to say "nah".

Edit: P.S.: Another idea might be to introduce a compile-time option for this. Something like "fast" vs "exact".

chiphogg commented 1 month ago

You may find this discussion page interesting. It goes into the considerations for how our units library, Au, applies magnitudes to values. It depends on the nature of the magnitude (integer, inverse-integer, rational, or irrational), as well as the properties of the type (primarily, integral vs. floating point). Here are the two points that seem most relevant to me:

  1. Users of floating point rep should expect what we commonly call the "usual floating point error", typically a few "units of least precision" (ULPs). It's generally not worth trying to improve a computation that already meets this bar, unless there really aren't any tradeoffs associated.
  2. For the specific category of inverse-integer magnitudes (such as this one!), there is a no-tradeoff approach: we can just divide by the inverse magnitude. This will improve accuracy without compromising speed. Really, the only downside is slightly more complex code, but if constexpr should make even that cost very small.

See also #599, which seems very related.

mpusz commented 1 month ago

@chiphogg, thanks for sharing your docs! This is a really nice description of the problem and possible solutions. I think we could benefit from something like this in our ISO paper as well. Do you want to add a chapter about it?

chiphogg commented 1 month ago

Chapter or section, yeah, I could do that. :+1:

psychon commented 1 month ago

Thank you for the quick reaction and the fix!