Closed mpusz closed 1 month ago
@chiphogg, do you have some ideas on how to address that?
Sorry for my late responses! I have been a little swamped lately. I've tried to knock out the "easy stuff" (removing exp
; trivial doc updates; renaming as_magnitude
), but I've let the more complicated stuff slide (more detailed doc updates; this issue).
Here are some sundry thoughts and comments to get us started.
First: I find that in practice, I rarely see Magnitudes show up in compiler errors in au
. Almost every "unit" type is a strong type, whose name in compiler errors is identical to what the user sees in their code. (However, this could still happen if a user multiplies a Unit by a Magnitude without creating a new strong type.)
Your code example on godbolt was very helpful. I couldn't make a 1:1 comparison with our library, because we're still using C++17 (so we don't have concepts), but I think I did still identify a relevant difference between the libraries. Our prefixed units are strong type definitions, so the Magnitude is hidden behind that strong type. I wrote a test with this code to do a similar conversion (with the main difference being that I had to choose a specific unit for the area conversion---man, those concepts would be nice! :grin:):
(1 / kilo(meters)(50.0)).as(squared(meters));
Some points about the error message:
static_assert
with the message, "Can only convert same-dimension units"
. Using static_assert
gives the user a very direct indication of the problem. (That said, I'm not sure whether this strategy translates to the case of using a concept.)au::Pow<au::Kilo<au::Meters>, -1>
, and au::Pow<au::Meters, 2>
, i.e., no Magnitude
anywhere. The squared meters might be a red herring, since the mp-units case uses a concept and doesn't specify the target unit. But that also wasn't where the problem was: the relevant unit was inverse km, and we have Kilo<Meters>
instead of something with a Magnitude.tl;dr: the overwhelming source of power-of-10 magnitudes showing up in compiler errors is likely to be SI prefixes. Perhaps an alternative strategy is to somehow make prefixed units strong types which hide the magnitude? (Disclaimer: I don't know the fine details well enough to see whether this strategy could work.)
Second, is this example helped very much by a change-of-basis that replaces 5 with 10? Let me put the error messages next to each other:
required for the satisfaction of 'Area<units::quantity<units::unknown_dimension<units::exponent<units::isq::si::dim_length, -1, 1> >, units::scaled_unit<units::magnitude<units::base_power<long int>{2, units::ratio{-3, 1, 0}}, units::base_power<long int>{5, units::ratio{-3, 1, 0}}>(), units::unknown_coherent_unit>, double> >'
required for the satisfaction of 'Area<units::quantity<units::unknown_dimension<units::exponent<units::isq::si::dim_length, -1, 1> >, units::scaled_unit<units::magnitude<units::base_power<long int>{10, units::ratio{-3, 1, 0}}>(), units::unknown_coherent_unit>, double> >'
This looks like a character reduction of about 16%: somewhat middling.
If we reformat the error message so we can really understand the template hierarchy, it looks something like this:
required for the satisfaction of '
Area<
units::quantity<
units::unknown_dimension<
units::exponent<units::isq::si::dim_length, -1, 1>
>,
units::scaled_unit<
units::magnitude<
units::base_power<long int>{2, units::ratio{-3, 1, 0}},
units::base_power<long int>{5, units::ratio{-3, 1, 0}}
>(),
units::unknown_coherent_unit>,
double
>
>
So, the change-of-basis replaces two base_power
lines with one: certainly an improvement, but the error is still pretty challenging at the end of the day, because quantity
has a lot going on.
In principle, choosing a different basis should work. It's a vector space! :slightly_smiling_face: In practice, I'm curious how many places implicitly assume that "integer basis elements are prime numbers". If we went the change-of-basis route, I think success is more likely than not, but far from guaranteed.
Finally: we've identified a real cost to Magnitude which we hadn't realized when we started on this journey. This should cause us to question: is Magnitude really the best solution? Do the benefits outweigh the costs?
I think the answer is yes. Our "unit size" solution needs to be some representation of positive real numbers that fulfills certain requirements, including:
These requirements had previously manifested as three seemingly-unrelated bug reports. The case for the "vector space representation", as outlined in #300, still seems correct to me: I still don't know any other solution which actually meets these requirements.
The main cost we've identified is that it does make some compiler errors worse. In particular, power-of-10 errors are significantly worse, because we're comparing those two-base_power
-lines to a simple ratio
(and not just one base_power
line). The caveat here is that this only happens when our "unit size" abstraction "escapes" all the way to user-visible type names. One strategy to handle this would be to make this escape less common. Another would be to change the basis to shorten the errors.
So I think this cost makes it worth raising the question, but it looks to me like the answer is still yes.
These are my thoughts. You know the deep library internals far better than I---is there anything to the idea of hiding the magnitude for SI-prefixed units behind a strong type?
static_assert
is not a good solution to specify types in interfaces similarly to assert()
not being good for specifying function argument values in runtime. Both are about the quality of implementation rather than interfaces. Among others static_assert()
does not influence the name lookup and overload resolution.
Another issue with a static_assert
is that often it is hard to understand the source of the issue. Even though the text message seems user-friendly (i.e. "Can only convert same-dimension units"
) some compilers will still not provide what actually caused the issue (i.e. what the actual units were being converted).
Said that the readability issue is not connected to concepts at all. See: https://godbolt.org/z/YM3TYEc65.
Our prefixed units are strong type definitions, so the Magnitude is hidden behind that strong type.
The same is in case of mp-units. But sometimes user will face an undefined unit or dimension. It can be caused either by a bug in the source code (i.e. invalid quantity arithmetic) or just using a temporary intermediate variable that does not represent a correct quantity type (i.e. https://mpusz.github.io/units/use_cases/unknown_dimensions.html#temporary-results).
I think that it is much easier for the user to reason about units::scaled_unit<units::magnitude<units::base_power<long int>{10, units::ratio{-3, 1, 0}}>(), units::unknown_coherent_unit>
rather then units::scaled_unit<units::magnitude<units::base_power<long int>{2, units::ratio{-3, 1, 0}}, units::base_power<long int>{5, units::ratio{-3, 1, 0}}>(), units::unknown_coherent_unit>
. It is not about the number of characters (long text) but about understanding what is going on. In my talks, I often referred to another units library as a bad example of hard to understanding type:
error: static_assert failed due to requirement 'traits::is_convertible_unit<unit<ratio<3600000, 1>, base_unit<ratio<1, 1>,
ratio<0, 1>, ratio<1, 1>, ratio<0, 1>, ratio<0, 1>, ratio<0, 1>, ratio<0, 1>, ratio<0, 1>, ratio<0, 1> >, ratio<0, 1>,
ratio<0, 1> >, unit<ratio<5, 18>, base_unit<ratio<1, 1>, ratio<0, 1>, ratio<-1, 1>, ratio<0, 1>, ratio<0, 1>, ratio<0, 1>,
ratio<0, 1>, ratio<0, 1>, ratio<0, 1> >, ratio<0, 1>, ratio<0, 1> > >::value' "Units are not compatible."
Don't get me wrong. I ❤️ magnitude
! 😃It solves so many issues and opens the door to implementing new important features.
Now when we have the feature proven to work, we can think if we can simplify this for a typical end-user that does not have to be an expert in the field or a mathematician.
is there anything to the idea of hiding the magnitude for SI-prefixed units behind a strong type?
All user predefined types will always hide magnitude. However, when the dimensional analysis will end up with a unit that was not predefined the magnitudes will always be visible. In most cases, it will be a bug in users' code, but this library's main purpose is to actually detect and fix those errors. If there were no errors this library would not be needed 😉
Ah---I think I can drill down a bit on a key distinction between the libraries, which explains why the Magnitude surfaces in mp-units but not au (au is the second incarnation of Aurora's units library).
I initially thought the problem was that kilometers
wasn't a strong-type in mp-units. Now I see that's not what's going on. It's the 1 /
which turns the (known) unit of kilometers
into an unknown unit: remove the 1 /
, and you get a much more direct compiler error with no magnitudes! And this shows me the relevant difference between the libraries' approaches.
When faced with an unknown unit:
au::Pow<au::Kilo<au::Meters>, -1>
, with no magnitudes: we express our units as products of powers of the named units which formed them.In au, the only time a magnitude should "escape" to the type name is if we do an ad hoc product of a (presumably-named) unit, and a magnitude. For example, (feet * mag<3>())(10)
would give a Quantity which is equivalent to 10 yards, but whose Unit is essentially "feet times 3" (and the "3" would show up in the type name). This unit is not the same as yards, but is quantity-equivalent to yards, and quantities of each can be freely converted back and forth with zero cost.
Does this analysis sound right---that the recognition of "base units" in mp-units ultimately leads to more magnitudes showing up in unit type names?
Yes, you are right. mp-units framework always works on the list of exponents and their base units and only after that it tries to find a strong type with a downcasting facility. We can discuss that if you have any ideas on how to improve it,
I think that au behaves similarly to: https://github.com/mpusz/units/issues/351?
I think there are some similarities with #351, in that the named units on the input are the same as the ones that show up on the output. The main difference is that au
avoids repeating templates such as unknown_unit
in a nested fashion: instead, the types are more symmetric.
The key is that I invested heavily in generic "products-of-powers" metaprogramming infrastructure when rewriting the library. It handles both the "obvious" operations (i.e., products and powers), but also makes it very easy to define strict total orderings on participating types (for canonicalization), and automatically catch the most common failure cases (i.e., distinct types which compare equal). On this foundation, we build not only both Magnitude and Dimension (with minimal code), but also compound units.
In the case of units, IIRC (I'm not at my work computer right now), the type for the example discussed there would be something like UnitProduct<Kilo<Meters>, Pow<Hours, -2>>
, rather than something like unknown_unit<unknown_unit<kilometre, per<hour>>, per<hour>>
.
I think a better analogy for au
is #383 (which didn't exist when you wrote your last comment :slightly_smiling_face:). You know I was a huge downcasting fan this time last year when we spoke, but my thinking has evolved since then. I've learned to distinguish between the two component parts that give mp-units its enviably readable compiler errors: opaque typedefs (which make concise user-visible types possible) and downcasting "proper" (which creates a global registry of the "preferred" unit for each unit equivalence class).
Working through the downcasting implementation was one of my favorite experiences in C++: I loved the elegance, and the feeling when it "clicked" for me! But I've become suspicious of the goal itself of a global unit registry, no matter how it's implemented. I think we can retain the main benefits of downcasting, while fixing the problems that have emerged, via a combination of:
This way, the ad hoc "compound types" will tend to occur as intermediate computation results, and thus be seen by users somewhat less often---but when they are seen, their type names will be easily understood. To take the example in #383, freq = 1 / dt
: the 1 / dt
part will have units of inverse seconds (and if an error occurs there, somehow, that's the type the user will see), while it can be seamlessly assigned to freq
, at which point it will have units of hertz.
This is only a sketch of my current ideas on the matter. Unfortunately, it's my turn to be too busy to contribute much to mp-units! For now, anyway. So I can't commit to an effort to bring "generic products-and-powers of parameter packs" foundations to the library anytime soon, but I can tell you that this strategy has worked very well for us so far, and I'm really happy with the types it produces.
@chiphogg, thanks a lot for your comments and contributions!
Yes, as I wrote some time ago, more and more issues emerge with the usage of the downcasting facility, and I am bending towards removing it soon. But first, I came back to work on #281, which I see as a high priority that will resolve some existing issues and open the doors to new features that are not possible right now. Also, I see this as an important stone to be able to remove the downcasting facility in the current library (note that UNITS_DOWNCAST_MODE = 0
still does not work in the library) at all.
But we diverged here too much from the original subject of this issue which is to optimize the engine to prioritize magnitudes of 10
, 2
, and pi
before going for primes representation of any other possibility (that should be really rare).
But we diverged here too much from the original subject of this issue which is to optimize the engine to prioritize magnitudes of
10
,2
, andpi
before going for primes representation of any other possibility (that should be really rare).
I'm so easily distracted---I was focused only on replying to your most recent comments, and I had forgotten which issue this was! :sweat_smile: :laughing:
I still don't know any other solution which actually meets these requirements.
How about a symbolic algebra system? The same preservation logic au
applies to units, applied to the numbers.
So rather than converting the magnitude to a vector space representation with prime numbers as the basis, you keep its representation just like you'd write it down on paper. For comparison:
required for the satisfaction of 'Area<units::quantity<units::unknown_dimension<units::exponent<units::isq::si::dim_length, -1, 1> >, units::scaled_unit<units::magnitude<units::base_power<long int>{2, units::ratio{-3, 1, 0}}, units::base_power<long int>{5, units::ratio{-3, 1, 0}}>(), units::unknown_coherent_unit>, double> >'
required for the satisfaction of 'Area<units::quantity<units::unknown_dimension<units::exponent<units::isq::si::dim_length, -1, 1> >, units::scaled_unit<units::exponentiation<10, -3>, units::unknown_coherent_unit>, double> >'
This isn't necessarily a replacement to the current implementation. It could still be reached for when one of the std::intmax_t
values in a type of the SAS would overflow.
The current implementation ensures that mag<1000>()
and mag_power<10, 3>()
are equal (same type). This should not be lost when writing the SAS.
I've been working on a SAS. It's a handful of a problem.
To keep good error messages, perhaps it'd be simpler to wrap the magnitude such that units::exponentiation<10, -3>::value == units::magnitude<units::base_power<long int>{2, units::ratio{-3, 1, 0}}, units::base_power<long int>{5, units::ratio{-3, 1, 0}}
.
And just special case powers of 10 and 2, and keep using std::intmax_t
and units::ratio
. When symbolic bases come into play, units::magnitude
can resurface.
https://gcc.godbolt.org/z/c66dze1zP is my attempt. For rational exponents, what I use is
struct radical; // $ⁿ√x$.
I think it satisfies the points of https://github.com/mpusz/units/issues/377#issuecomment-1201526579, but not one of https://github.com/mpusz/units/issues/300:
- Representations must be unique for each real number (we don't want a proliferation of types for the same magnitude).
That's what mag
does, so it can be used instead of radical
.
Thanks @JohelEGP, I do not have much time to look into it now but definitely this is an open subject and area for improvement.
BTW, I really encourage you and everyone else to play a bit with the V2 as it starts to be usable now (at least some basic features work) and suggest all possible improvements you can find. Hopefully, I will be done with the entire library rewrite in two weeks from now (so before the Kona meeting). Any help with the rewrite is also welcomed as a lot of unit tests, examples, and docs have to be rewritten as well.
Most of the magnitudes is the user's code will be the power of
10
, only some will be the power of2
, and only a few a power ofpi
. Other magnitudes probably will nearly not exist. We should optimize for this case so the user does not end up with an error like:Also please note that this error comes from gcc10. gcc11 and gcc12 tend to shorten it making it impossible to know what we are dealing with:
This impacts user experience a lot.
See the code in: https://godbolt.org/z/E5b1c54zv.