Support abbreviated symbols

aurora-opensource / au

A C++14-compatible physical units library with no dependencies and a single-file delivery option. Emphasis on safety, accessibility, performance, and developer experience.

Apache License 2.0


Support abbreviated symbols #43

Closed: chiphogg closed this issue 9 months ago

chiphogg commented 1 year ago

We never migrated these upstream! It'd be nice to have them in place before we announce the library.

We could include the UDL inside each "au/units" file, but that makes me wary.

Having split out units into individual files makes the cost of adding a new unit virtually zero. Literals could change that, because we would have to worry about literal collision (e.g., farads and fahrenheit might each be tempted to use _F).
Some literals correspond to units that don't have an "au/units" file, e.g., _MPH, _nm, _kg.
We've been doing really well on the "don't pay for what you don't use" front, and I'd like to keep that up.
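To make the collision concern concrete, here is a minimal sketch with toy stand-in types. `Farads`, `Fahrenheit`, and both namespaces are hypothetical and not part of Au's API. Two `_F` operators with the same parameter type cannot share a scope, so something has to keep them apart. The sketch uses namespaces for that, because a single snippet cannot show separate header files.

```cpp
// Toy stand-ins for illustration only; these are NOT Au's real types.
struct Farads { long double value; };
struct Fahrenheit { long double value; };

// Declaring both `_F` operators in one scope would be an error, because they
// would differ only in return type. Separate namespaces make each one opt-in.
namespace farads_literals {
constexpr Farads operator""_F(long double x) { return Farads{x}; }
}  // namespace farads_literals

namespace fahrenheit_literals {
constexpr Fahrenheit operator""_F(long double x) { return Fahrenheit{x}; }
}  // namespace fahrenheit_literals

// Each function opts in to exactly one meaning of `_F`.
constexpr Farads capacitance() {
    using namespace farads_literals;
    return 1.5_F;
}

constexpr Fahrenheit boiling_point() {
    using namespace fahrenheit_literals;
    return 212.0_F;
}

static_assert(capacitance().value == 1.5L, "farads literal selected");
static_assert(boiling_point().value == 212.0L, "fahrenheit literal selected");
```

Either way, the cost is that every new unit with a literal has to be checked against the suffixes already taken.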

So here's a proposal.

Define the UDLs inside an "au/units/literals" file (or, perhaps, simply "au/literals"), exactly corresponding to the "au/units" file. So, either "au/units/literals/meters.hh", or "au/literals/meters.hh".
The literals file would automatically include the units file. So, "au/units/literals/miles_per_hour.hh" would give "au/units/miles.hh" and "au/units/hours.hh".
The single-file script would gain a new --literals argument which would act just like --units, but for the literals folder.
The pre-built single-file packages would include literals.

chiphogg commented 1 year ago

We should interpret this issue more broadly, and not wed ourselves to UDLs as the solution. What we're really trying to do is provide a nice way for people to express themselves concisely using unit symbols. @mpusz has expounded elsewhere on the downsides of UDLs, and they are indeed significant. Perhaps with constants (#90), there might be a more flexible and ergonomic way to include the units concisely.
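As a rough illustration of what a non-UDL alternative could look like, the sketch below multiplies a plain number by a unit "symbol" object. Every name in it (`MilliVolts`, `MilliVoltsSymbol`, `mV`) is a hypothetical toy, not Au's API. The point is only that the rep type comes from the left-hand operand instead of being fixed by a literal operator.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical toy quantity type, NOT Au's real Quantity.
template <typename Rep>
struct MilliVolts { Rep value; };

// An empty tag type whose only job is to be multiplied by a number.
struct MilliVoltsSymbol {};
constexpr MilliVoltsSymbol mV{};

// The rep of the result is whatever type the caller supplied.
template <typename Rep>
constexpr MilliVolts<Rep> operator*(Rep x, MilliVoltsSymbol) {
    return MilliVolts<Rep>{x};
}

constexpr auto a = std::int32_t{15} * mV;  // rep chosen explicitly
constexpr auto b = 1.5 * mV;               // rep deduced as double

static_assert(std::is_same<decltype(a), const MilliVolts<std::int32_t>>::value, "rep is int32_t");
static_assert(std::is_same<decltype(b), const MilliVolts<double>>::value, "rep is double");
```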

avrahamshukron commented 12 months ago

Hi @chiphogg! I'm really looking forward to this feature.

Not sure if this is the correct place to discuss this, but I was trying to implement a UDL for Au-based unit types, and I'm struggling to enforce safety at compile time.

Let's assume I want _mV to return millivolts represented as an int32_t (because the code targets an embedded device), i.e.,

using Voltage = au::QuantityI32<au::Milli<au::Volts>>;

The trouble is that UDLs for integral types are forced to accept unsigned long long as an argument:

constexpr Voltage operator""_mV(unsigned long long literal)
{
    if (literal > static_cast<unsigned long long>(std::numeric_limits<Voltage::Rep>::max()))
    {
        // Fail compilation somehow?
    }
    return au::milli(au::volts)(static_cast<Voltage::Rep>(literal));
}

But I can't find a way to enforce bound checking at compile-time. Do you have any idea how to do this?

chiphogg commented 12 months ago

Sure, I think I can help!

First: as the first post implies, this is the same approach we're taking inside Aurora. We're using UDLs, because that's what we had been used to from other units libraries (nholthaus and ATG-internal), and because @mpusz had not yet articulated their disadvantages. They're a fine stopgap solution for end users, though, as long as they can live with the downsides. I think the main ones that are relevant for end users (as opposed to units library authors) are:

Poor ability to select the "rep" type
Labor-intensive to define the implementations
Poor composability with prefixes and compound units

So if a project is willing to do the work to define them, they can be pretty handy for simple units.

I think there's an alternative way to define user-defined literals that inspects each character individually. I read about it in this blog post. I had never tried my hand at it before, but I was able to whip something up that seemed to work OK. See if this helps:

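#include <cstdint>
#include <limits>

// Base case: no digits left, so the accumulated value is the result.
template <typename T, T Value>
constexpr T literal_value() {
    return Value;
}

// Recursive case: fold in one digit, checking for overflow at compile time.
template <typename T, T Value, char FirstDigit, char... OtherDigits>
constexpr T literal_value() {
    static_assert(FirstDigit >= '0' && FirstDigit <= '9', "Must supply only digits");
    constexpr T DIGIT_VALUE = FirstDigit - '0';

    // Largest accumulated value for which `Value * 10 + DIGIT_VALUE` still fits in T.
    constexpr T MAX_OK_VALUE = (std::numeric_limits<T>::max() - DIGIT_VALUE) / 10;
    static_assert(Value <= MAX_OK_VALUE, "Overflowed literal");

    return literal_value<T, Value * 10 + DIGIT_VALUE, OtherDigits...>();
}

using Voltage = au::QuantityI32<au::Milli<au::Volts>>;

// Literal operator template: each character of the literal arrives as a template argument.
template <char... Digits>
constexpr Voltage operator""_mV() {
    return au::milli(au::volts)(literal_value<int32_t, 0, Digits...>());
}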

TEST(MilliVoltsLiteral, BehavesCorrectly) {
    EXPECT_THAT(-2147483647_mV, au::SameTypeAndValue(au::milli(au::volts)(int32_t{-2147483647})));
    EXPECT_THAT(-200_mV, au::SameTypeAndValue(au::milli(au::volts)(int32_t{-200})));
    EXPECT_THAT(15_mV, au::SameTypeAndValue(au::milli(au::volts)(int32_t{15})));
    EXPECT_THAT(2147483647_mV, au::SameTypeAndValue(au::milli(au::volts)(int32_t{2147483647})));

    // This will be a compile time error:
    // 2147483648_mV;

    // This should work, but it doesn't.  (The `-` sign is not part of the literal.)
    // EXPECT_THAT(
    //     -2147483648_mV, au::SameTypeAndValue(au::milli(au::volts)(int32_t{-2147483648})));
}

The only downside is that you won't be able to represent the most-negative value of your integral type, because only the digits are part of the literal, and there isn't a corresponding positive value that fits in the type.
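The limitation can be reproduced without Au or a test framework. The sketch below swaps in a toy `MilliVolts` struct (hypothetical, standing in for the Au quantity) and reuses the same digit-folding idea. It also shows one possible workaround, assuming the quantity type supports subtraction: build the most-negative value from two literals that each fit.

```cpp
#include <cstdint>
#include <limits>

// Toy stand-in for the Au quantity; NOT Au's real type.
struct MilliVolts { std::int32_t value; };
constexpr MilliVolts operator-(MilliVolts v) { return {-v.value}; }
constexpr MilliVolts operator-(MilliVolts a, MilliVolts b) { return {a.value - b.value}; }

// Base case: all digits consumed.
template <typename T, T Value>
constexpr T literal_value() { return Value; }

// Recursive case: fold in one digit with a compile-time overflow check.
template <typename T, T Value, char FirstDigit, char... OtherDigits>
constexpr T literal_value() {
    static_assert(FirstDigit >= '0' && FirstDigit <= '9', "Must supply only digits");
    constexpr T DIGIT_VALUE = FirstDigit - '0';
    constexpr T MAX_OK_VALUE = (std::numeric_limits<T>::max() - DIGIT_VALUE) / 10;
    static_assert(Value <= MAX_OK_VALUE, "Overflowed literal");
    return literal_value<T, Value * 10 + DIGIT_VALUE, OtherDigits...>();
}

template <char... Digits>
constexpr MilliVolts operator""_mV() {
    return MilliVolts{literal_value<std::int32_t, 0, Digits...>()};
}

// `-200_mV` parses as `-(200_mV)`: the operator only ever sees the digits "200".
static_assert((-200_mV).value == -200, "unary minus is applied after the literal");

// `-2147483648_mV` would first need 2147483648 to fit in int32_t, which it does not.
// Workaround: combine two literals that each fit.
constexpr MilliVolts kMostNegative = -2147483647_mV - 1_mV;
static_assert(kMostNegative.value == std::numeric_limits<std::int32_t>::min(), "reaches INT32_MIN");
```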

Fair warning: I haven't checked the compile time impact! :grin: But this should give you both the rigor and the usability you're looking for.
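For comparison, the `unsigned long long` form from the question can also reject out-of-range values, with one caveat: the check happens at compile time only when the literal is used in a constant expression. The sketch below uses a toy `MilliVolts` struct (hypothetical, in place of the Au type) and relies on the rule that a `throw` reached during constant evaluation is a compile error. In a non-constant context the same literal compiles and throws at run time, so this is weaker than the `static_assert` version above.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Toy stand-in for the Au quantity; NOT Au's real type.
struct MilliVolts { std::int32_t value; };

constexpr MilliVolts operator""_mV(unsigned long long literal) {
    if (literal > static_cast<unsigned long long>(std::numeric_limits<std::int32_t>::max())) {
        // Reaching this line during constant evaluation makes the program ill-formed.
        throw std::out_of_range("literal does not fit in int32_t");
    }
    return MilliVolts{static_cast<std::int32_t>(literal)};
}

// `constexpr` forces constant evaluation, so the bound is checked by the compiler.
constexpr MilliVolts kOk = 2147483647_mV;

// This line would fail to compile:
// constexpr MilliVolts kTooBig = 2147483648_mV;

// This line would compile, and throw when executed:
// MilliVolts too_big_at_runtime = 2147483648_mV;
```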

chiphogg commented 9 months ago

FYI @avrahamshukron, this is now done; here are the usage docs. It'll probably be a week or two before we cut the next release, but if you're OK working from main, then you should be all set!

avrahamshukron commented 9 months ago

Thank you, @chiphogg! I'll go read the docs. BTW, I implemented your suggestion above and it worked quite nicely. I couldn't measure a meaningful difference in compile time when compiling the entire project, so it works well for us now.