Closed mpusz closed 1 year ago
As mentioned elsewhere, I'm actually not a big fan for .number()
anyway. Can we have a number_in<units>()
accessor instead/additionally? That said, the lvalue accessor is definitely the worst of the pack, so you have my blessing to make it go (first) :-).
First, please let me thank you for a lot of great feedback you provided in the last few days! That helps a lot and shines the light on interesting problems. Thanks!!!
As I just stated in #239, representation-type conversion may introduce data truncation. If number_in<units>
for that specific unit was truncating the data, it probably should not be allowed without an explicit cast. As the quantity_cast
is still needed here, I am not sure if we are winning much:
quantity_of<isq::distance[m]> auto q = get();
auto num1 = q.number_in<q.unit>(); // give me what you have
auto num2 = q.number_in<mm>(); // convert in a safe direction
// auto num3 = q.number_in<km>(); // compile-time error if not a floating-point representation
auto num3 = quantity_cast<km>(q).number_in<km>(); // now OK
Now the question is if we want such a feature and if it should replace the current number()
or if it should be an additional overload set.
I like that overload set you provided, with exactly the semantics you describe. To me, q.number_in<mm>()
is a gain over quantity_cast<mm>(q).number()
, because nothing prevents you from moving the .number()
call away from the cast.
The fact that the third example requires a "double cast" is really a plus to me: I can get assurance that q.number_in<km>()
without a cast does not truncate, even if I do not know the exact unit that q
was specified in. In generic code, a compile-time error may be what I want. I can then deal with the situation with a double cast, which in many cases may well be rather quantity_cast<float>(q).number_in<km>()
than quantity_cast<km>(q).number_in<km>()
. After all, if my generic code is supposed to simultaneously deal with quantities spanning a wide dynamic range, I should probably really use floating point arithmetic.
Given that q.number_in<q.unit>()
is guaranteed to be equivalent to .number()
I believe this could indeed replace the latter completely, but I don't care too much. (with #411, we currently anyway have public access to number_
).
(with https://github.com/mpusz/units/issues/411, we currently anyway have public access to number_
That is right, but it should be considered an implementation detail rather than an official interface. This is the same as for std::array
. You have a public member, but you probably never used it, right? If the definition of a structural type will change to be less restrictive we will use a private member again.
I even consider renaming the member object to something like __private__implementation__number__
so no one dares to use it directly 😉 The only problem is that it would be undefined behavior, but who cares... 😜
I agree. Because of q.number_in<q.unit>()
, I still tend to vote for the removal of .number()
, even if number_
is inaccessible or an implementation detail. Though as I said, no strong feelings here.
OK, so the last thing to note here. number()
returns a (const
) reference, whereas number_in<>()
will always require returning by value which may involve a copy/move operation. Some people may be unhappy because of that. Should we care about that?
If the number type is heavy, then we may care. Numbers usually shouldn't be, but who knows, there could be a use-case. So you convinced me. Let's keep it.
I just realized one more thing. number_in<>()
is a member function template whose template parameter is not being deduced. This means that in generic code, for dependent names, we will have to always provide a template
disambiguator(imagine
std::get<>()being a member function of
std::tuple). Do we want such an interface? Or maybe we should provide a non-member function (like
std::get<>()`)?
Ah, good point. As we seem to move towards preferring values over types (see boost::hana), the "unit specification" (cast specification restricted to units) that we supply could actually be a value rather than a type (though I don't remember off the top of my head if that is the case in current V2 for units). Then, q.number_in(unit_spec)
could be an alternative? That would be my preferred form, but I would be willing to accept a freestanding function or even typename
everywhere too.
the "unit specification" (cast specification restricted to units) that we supply could actually be a value rather than a type (though I don't remember off the top of my head if that is the case in current V2 for units).
Yes, it is. Users typically do not notice because of the fact that we name a type and an instance with the same identifier.
To make it explicit:
quantity_cast<quantity<isq::speed[km/h]>>(q); // takes type
quantity_cast<km/h>>(q); // takes value
quantity_cast<isq::speed>>(q); // takes value
quantity_cast<isq::speed[km/h]>>(q); // takes value
quantity_cast<int>(q); // takes type
In general, units, quantity_spec, and references are always passed as NTTP objects.
Then, q.number_in(unit_spec) could be an alternative?
Yes, I like this idea!
To summarize:
number()
number_in(Unit auto)
member funcitonDone! :-)
However, this might be a good place to discuss one more issue. Currently, quantity
and quantity_point
behaves like std::optional
and std::variant
. Namely, they return an rvalue reference for rvalue reference specified member accessor functions. Recently many people say that this is a bad design and may lead to dangling references (i.e. see the range-for safety discussion https://wg21.link/p2644).
Alternatively, we could return by values/copies for those, which, on the other hand, could be perceived as a broken overload set as sometimes you get a reference (for const lvalues) or an rvalue (for rvalues).
To make it clear. Here is what we have now:
And here is a possible alternative:
[[nodiscard]] constexpr const rep& number() const& noexcept { return number_; }
[[nodiscard]] constexpr rep number() && noexcept { return std::move(number_); }
[[nodiscard]] constexpr rep number() const&& noexcept { return std::move(number_); }
What is the right way to do here?
I'm using this library in production, but I've avoided constructors and accessors entirely, wrapping everything in a pair of templated functions. Their impelmentations are icky, but the user-facing part is
millimeter_t foo = fromCount<millimeter_t>(42.0); // 42.0 mm
fTakingMillimeters(getCount<millimeter_t>(foo)); // Calls fTakingMillimeters(42.0).
Being strict about only using fromCount<Unit>
and getCount<Unit>
protects me against things like this: If we have
using Dist_mm = millimeter_t;
using Vol_mm3 = decltype(pow<3>(std::declval<Dist_mm>()));
using Vol_mL = millileter_t;
then
-Vol_mm3 vol;
+Vol_mL vol;
...
fTakingMM3(static_cast<double>(vol)); // The change to Vol_mL silently breaks this line.
So instead I have
-Vol_mm3 vol;
+Vol_mL vol;
...
fTakingMM3(getCount<Vol_mm3>(vol)); //< Compile error: vol is not a Vol_mm3.
@BenFrantzDale, thanks for sharing!
I do not believe that fTakingMM3(static_cast<double>(vol));
actually compiles as we do not provide a conversion operator to the representation type, but I see your point. I think that q.number_in(Unit)
should be a good solution for you in this case?
Yes. number_in<units>()
sounds great.
I guess it's the other way around. I'd encourage you to consider dropping the explicit
c'tor and replace it with a factory function. It's so easy to explicitly construct things (std::vector<millimeters_t> sizes; sizes.emplace_back(42.0);
).
Related: right now getting the count of a percentage gives the scalar value. So if you add that, be sure that (25_pct).number_in<pct_t>() == 25.0
not == 0.25
.
Closely related (maybe I should make a separate ticket): I think ceil
is a hole in the unit safety and should be removed for dimensioned types. After all, what's the ceiling of my height?
I could see ceil_in<units>(x)
as shorthand for fromCount<units>(ceil(x.number_in<units>()))
.
Yeah, I also heard this in the ISO C++ Committee from one person. On the other hand, the rest of the room would like to have a traditional way of doing things (via constructors).
Yeah, I also heard this in the ISO C++ Committee from one person. On the other hand, the rest of the room would like to have a traditional way of doing things (via constructors).
I believe it, but so far, I've found that any holes in unit safety let bugs through. With a safe getter and factory function and literals, it's pretty darn solid.
maybe I should make a separate ticket
Please do
In the V2 framework you will be able to do something like that std::vector<quantity<si::milli<si::metre>>> sizes; sizes.emplace_back(42.0 * mm);
which is terse and safe. If that will work, I think we can consider removing the constructor taking a raw number. But this should be a separate issue as well 😉
Related: right now getting the count of a percentage gives the scalar value. So if you add that, be sure that (25_pct).number_in
() == 25.0 not == 0.25.
I am not sure what you mean by that. Could you please provide more info? The following works:
constexpr dimensionless<one> q{0.25};
static_assert(quantity_cast<percent>(q).number() == 25);
@BenFrantzDale --- I think your getters/setters use the correct unit-safe pattern, but how do you prevent people from calling the constructors?
OK, so the last thing to note here.
number()
returns a (const
) reference, whereasnumber_in<>()
will always require returning by value which may involve a copy/move operation. Some people may be unhappy because of that. Should we care about that?
An alternative would be to give the "direct access" version of the function a different name. In Au, our value-semantic copying accessor (to get the value of a quantity q
in units units
) is spelled q.in(units)
. This will perform a conversion if units
is not equivalent to the units of q
, and if it's safe to do so. And the underlying-number accessor is spelled q.data_in(units)
. This will not compile if the units are inequivalent, but it gives a (const or mutable, as appropriate) handle to the underlying stored variable if they are equivalent. See aurora-opensource/au#28 for more details.
This means it's OK to do the following, for example:
auto x = inches(3);
++(x.data_in(inches));
EXPECT_EQ(x, inches(4));
Stepping back, and giving the bigger picture on this issue: I can share some real-world implementation experience.
When we were first writing Aurora's units library, folks were worried that it would be too burdensome to have to name the unit at the callsite. I said I'd consider adding a "just give me the underlying value!" function if I got a compelling use case. Since then, we have had years of experience with widespread use in production, and I have yet to hear a single complaint. The pattern works, and it makes code both safer and more readable.
So I think it'll be great to remove all instances of .number()
, and replace them with .number_in(units)
. (You can add something like .stored_number_in(units)
if you want a safe handle, similar to Au's .data_in(units)
.)
By the way: mp-units is the very first units library I have seen (other than Au) which takes these unit-safe interfaces seriously. This is thrilling! I am heartily encouraged by this change.
I said I'd consider adding a "just give me the underlying value!" function if I got a compelling use case. Since then, we have had years of experience with widespread use in production, and I have yet to hear a single complaint.
Am I correct that Aurora works only with C++ arithmetic (fundamental) types? Those are cheap to copy. For me, the most important reason to leave a raw accessor is to be able to access the underlying storage by const reference. It might be important if someone uses some untrivial representation type.
Yep --- only arithmetic (fundamental) types for now. Expanding this is in-scope for us, however; we're tracking it in aurora-opensource/au#52.
(BTW, nice work in mp-units on explicitly specifying a concept for the representation type!)
nice work in mp-units on explicitly specifying a concept for the representation type
Thanks, but still, we need to do a better job here. See: https://github.com/mpusz/units/blob/622b3e3cbd21506d53fba4d97204fac6ab108aef/src/core/include/mp_units/bits/quantity_concepts.h#L105-L108.
Coming back to accessing raw data (I just asked myself how to do it in code using the library).
I'm not able to oversee the consequences, but the most natural thing from the users perspective would seem (for me), using operator/
for this:
I use operator*
to form the bond between scalar and unit, e.g:
constexpr auto tencm = 10 * cm;
Consequently, removing the unit would be the reverse operation:
static_assert( 10 == tencm / cm);
... which, when written on paper this way, would read aloud as "value of variable tencm in cm". Seeing it this way, I'd also intuitively expect:
static_assert( 100 == tencm / mm);
Interestingly enough, I can already do something quite similar:
static_assert( 10 == tencm / (1*cm));
, then again
static_assert( 100 == tencm / (1*mm));
will not work.
As already stated, the consequences this would have are completely in the mist for me, but I leave you with this food for thought: IMO code would be easily readable when it looks like this:
// altitude being a mp-units type using e.g. units::isq::si::fps::foot
double alt_in_feet = altitude / ft;
Coming back to accessing raw data (I just asked myself how to do it in code using the library). I'm not able to oversee the consequences, but the most natural thing from the users perspective would seem (for me), using
operator/
for this: I useoperator*
to form the bond between scalar and unit, e.g:constexpr auto tencm = 10 * cm;
Consequently, removing the unit would be the reverse operation:
static_assert( 10 == tencm / cm);
... which, when written on paper this way, would read aloud as "value of variable tencm in cm". Seeing it this way, I'd also intuitively expect:
static_assert( 100 == tencm / mm);
Interestingly enough, I can already do something quite similar:
static_assert( 10 == tencm / (1*cm));
, then again
static_assert( 100 == tencm / (1*mm));
will not work.
As already stated, the consequences this would have are completely in the mist for me, but I leave you with this food for thought: IMO code would be easily readable when it looks like this:
// altitude being a mp-units type using e.g. units::isq::si::fps::foot double alt_in_feet = altitude / ft;
On the one hand I agree, and I do see the elegance of 10.0_mm/s
. On the other hand, when it’s not literals, I really like having “I have a double
that is in mm” and “I have strong mm, please give me the double
” be distinct from “Take x
and multiply it by mm
” and “Take y
and divide it by mm
.” I really like the error messages that distinction produce. I started using / 1.0s
to unpack chrono durations and found it to be unintuitive. Especially if the thing I’m trying to unpack isn’t what I think it is, then I could get foo/s
as the result type not an immediate error.
Food for thought.
Expanding on @BenFrantzDale's comment: what is (72 * cm) / m
? Many people's intuition is that this should be 0.72
, but I think that's wrong. It should be 72 * (cm / m)
, which is equivalent to percent(72)
. 0.72 has two contributions: the original value of 72, and a division by 100. The "divide by 100" information is carried in the type, and we should leave it there until somebody makes an explicit request to perform a conversion. (For one thing, we might perform other operations later, and keeping it in the type prevents us from losing information.)
What's really interesting is that the notion of "value" becomes ambiguous for every dimensionless unit, except the "unit one". There are two perfectly reasonable candidates for "the" value of such a quantity. A consequence which follows is that rather than try to guess which one the user meant, this kind of implicit conversion should be forbidden, as I explained here: https://github.com/nholthaus/units/issues/301#issuecomment-1381297235
The upshot is that while the "dividing method" really is creative, elegant, and intuitively appealing, I don't think it works out in practice as an API... whereas .number_in(units)
doesn't suffer from this problem.
Yes, this is exactly what I wanted to write here. The fact that the std::chrono::duration / std::chrono::duration
returns std::common_type_t<Rep1,Rep2>
is a huge problem with value precision. Consider dividing a floating-point duration in picoseconds by one in seconds. There will be a huge data truncation there. mp-units behaves differently. In such a case, it does not return a raw representation type but a dimensionless quantity with a proper unit assigned.
Also, to extend on the @chiphogg comment. mp-units provides implicit conversions for dimensionless quantities only if they are expressed in a unit one
:
As @JohelEGP noticed in https://github.com/mpusz/units/issues/411#issuecomment-1362809717, current accessors remove all encapsulation. Should we remove lvalue overload so the only remaining would be: