itanium-cxx-abi / cxx-abi

C++ ABI Summary

Add mangling for _FloatNx and bfloat16_t #147

Closed jicama closed 1 year ago

jicama commented 2 years ago

The current mangling is missing entries for C23 _FloatNx and C++23 std::bfloat16_t; I propose adding another character after DF to indicate these variants.

There's also the related question of how we want to handle __float128 and __float80, and also __int64 and __int128, given the new model where the explicitly sized types are extended types rather than aliases for standard types of the same size.

My inclination is to change __float128 to be another name for std::float128_t, and perhaps mangle std::float128_t as 'g' for compatibility with any existing code. And change __float80 to be another name for _Float64x, mangled as DFx64.

Similarly, I'd treat __int64 as _BitInt(64), mangled as DB64_, and treat __int128 as _BitInt(128), still mangled as 'n'.

Thoughts?

jakubjelinek commented 2 years ago

For _FloatNx, perhaps DF<N>x rather than DFx<N>_, i.e. terminating the number with x instead of the underscore?

Note, __float128 is already mangled differently on different targets: g most of the time, but e.g. on powerpc u9__ieee128, because g was misused for IBM extended.

jicama commented 2 years ago

Sure, terminating the number with x instead of _ would also work, I don't have much of a preference.

On ppc we probably want to transition to DF128_ with mangling aliases for u9__ieee128. Is there a C23 type that names IBM double-double?

jwakely commented 2 years ago

Is there a C23 type that names IBM double-double?

No, the _FloatN and _FloatNx types have to use IEC 60559 interchange formats, and double-double isn't one of those. There is no other extended floating type for double-double either.

jakubjelinek commented 2 years ago

As for changing the __float128 or __ibm128 mangling, wouldn't that cause major problems for libstdc++, which relies on those to represent long double in the other ABI on powerpc? Or perhaps __ieee128 (and likewise __ibm128) could keep mangling as before, and __float128, which is currently a macro defined to __ieee128, could be changed to mean _Float128, with __float128 and _Float128 being the same type on other architectures. Still, it would be quite a large incompatible change, because code with implicit conversions from __float128 to float/double would become invalid where it was accepted before.

rjmccall commented 2 years ago

What is __int64? I thought that was just an MSVC feature; is it a standard type somewhere else?

Do I understand correctly that std::bfloat16_t is a new type which is not storage-only? So basically doing to __bf16 what _Float16 did to ARM __fp16. In that case, I agree that it needs a new mangling. My impression is that the bfloat format has a fair amount of momentum behind it, and the variant-format-at-same-width situation is unlikely to repeat at any other widths; I would suggest just burning DH for this type.

I like the idea of DF <number> x for the _FloatNx types.

jakubjelinek commented 2 years ago

__int64 is a Microsoft alias for long long (see the MSVC __int8/__int16/__int32/__int64 documentation). But the ABI says it mangles the same as long long (x), so it can't be a distinct type.

Right now __bf16, at least in GCC for ARM, AArch64 and i?86/x86_64, is a storage-only type with the bfloat16 format. My understanding is that if std::bfloat16_t is supported in C++ (i.e. when __STDCPP_BFLOAT16_T__ is predefined to 1), then a storage-only type (which disallows conversions from/to it and arithmetic operations on it, so one can use only intrinsics on it) is not good enough: P1467R9 says "If the implementation supports an extended floating-point type with the properties, as specified by ISO/IEC/IEEE 60559" and then specifies what properties the format has. So the mangling for std::bfloat16_t to agree on is the mangling for an actually usable floating-point type with those properties.

rjmccall commented 2 years ago

a storage only type (which disallows conversions from/to it or arithmetic operations on it and one needs to use only intrinsics on it)

For what it's worth, this is not what it means to be storage-only, at least for __bf16. Under the ARM spec, __bf16 pr-values are immediately converted to float (or potentially to wider types) as a usual unary conversion. The type therefore does have conversions and even (in a way) supports arithmetic, albeit through promotion. Anyway, I do agree that that type behavior doesn't seem to be allowed under the C++ specification and so std::bfloat16_t must be a different type from __bf16.

Presumably implementors will need to use promotion/truncation to emulate bfloat16 arithmetic, possibly using excess-precision evaluation; I don't think native bfloat16 arithmetic is widely supported in hardware.

thiagomacieira commented 2 years ago

I asked @dkolsen-pgi tonight and I got an ambiguous answer. On one hand, the paper does seem to be clear that the standard std::bfloat16_t should be used only if the compiler implements arithmetic using bfloat16_t (or "as if", probably meaning promotion to float then truncation after every operation). On the other hand, that does not seem to be the intention, as it's meant to match what hardware does. It's also a grey area, as a compiler could perform truncation in some configurations but not in others (think -ffast-math).

David, would you comment?

dkolsen-pgi commented 2 years ago

My intention as one of the authors of P1467 is that compilers can implement std::bfloat16_t by doing all arithmetic in 32-bit float. What's important is that code using std::bfloat16_t compiles and gets correct answers (for a reasonable definition of correct).

A type with the bfloat16 format that doesn't support any arithmetic and requires explicit conversions to/from other types can't be named std::bfloat16_t and should be mangled differently than std::bfloat16_t.

Existing __float128 types do not follow the rules for extended floating-point types described in P1467. Compilers should leave the mangling for that type unchanged. _Float128 and its alias std::float128_t should be a new type distinct from __float128.

jakubjelinek commented 2 years ago

One question for std::bfloat16_t is whether, when signaling NaNs are normally honored for other types, they need to be honored for std::bfloat16_t too, e.g. on conversions from that type to float/double/long double/std::float{32,64,128}_t or vice versa. I think they should be, but e.g. the Intel AVX512-BF16 ISA instructions don't raise any exceptions. Of course, with -ffast-math-like options, exceptions need not be honored, and the std::bfloat16_t to std::float32_t conversion can be just a shift of the bits by 16 (or a vector permutation).

rjmccall commented 2 years ago

IIRC the standard has always been vague about exceptional behavior and especially signaling NaNs, so std::bfloat16_t having different behavior from other FP types on the same target is potentially fine.

I seem to have misread the docs about __bf16 — I had thought that it was specified to be promoted to float as an arithmetic conversion, but it actually looks like it doesn't allow arithmetic at all. The promotion to float that I was thinking of seems to just be a behavior of ARM __fp16 (but not _Float16). So we still can't just make std::bfloat16_t a typedef of __bf16, but it's for a different reason than I thought.

rjmccall commented 2 years ago

@dkolsen-pgi If you're interested, I've gotten some feedback on the design of adding std::bfloat16_t as an arithmetic type from one of our numerics experts at Apple, Steve Canon. It's somewhat off-topic for the Itanium ABI, since on the ABI level we can of course implement whatever the language decision here is, but I'll go ahead and post it here on the grounds that splitting the conversation doesn't really serve anyone.

Steve is concerned that adding this type as an arithmetic type might serve to be an attractive nuisance. Because the precision of bfloat16 is so limited, controlling when truncation back to bfloat16 occurs is of paramount practical importance to bfloat16 users. The normal semantics of an arithmetic type in C and C++ encourage the independent evaluation of operations, which would require an implicit truncation back to bfloat16 on every intermediate result. That would have catastrophic effects on both the precision and the performance of typical bfloat16 code. For example, on the performance side, typical hardware support is built around complex fused operations (e.g. float32 += bfloat16 * bfloat16 + bfloat16 * bfloat16, with all intermediate results computed in float32) that it would not be correct to pattern-match from independent operations.

Now, C and C++ do allow excess precision evaluation (C 6.5p8; C++ [expr.pre]p6), and Steve and I think that that might fix this problem. But we'd really need to force excess precision evaluation in order to get acceptable results; otherwise, allowing arithmetic is really just encouraging people to write code that is effectively incorrect. And even then there's definitely risk that someone might e.g. accumulate the intermediate results of a loop in std::bfloat16_t instead of in float.

jakubjelinek commented 2 years ago

Another possible std::bfloat16_t mangling is `DF16b` instead of `DFb16`.

rjmccall commented 2 years ago

There have been reports that GCC is now mangling __bf16 with the new std::bfloat16_t mangling. Per David's comment above, that is not correct, because __bf16 is a storage-only type. Unless GCC also changed the semantics of __bf16 to match the required semantics of std::bfloat16_t, __bf16 needs to continue to use the old mangling.

jakubjelinek commented 2 years ago

In GCC on i?86/x86_64, __bf16 is no longer a storage-only type; it is a fully usable extended floating-point type as per P1467R9, in latest GCC trunk including all the library support (so, on i?86/x86_64 Linux with glibc 2.26 or later, all the P1467R9 types are supported except from_chars/to_chars on std::float128_t, which has a patch posted but not yet applied). GCC now also implements -fexcess-precision=standard even for C++ rather than just for C as in GCC 12 and earlier, so unless -fexcess-precision=16 is requested, both std::float16_t and std::bfloat16_t arithmetic will use IEEE single as the excess precision. On aarch64 and arm, __bf16 is still a storage-only type, so it doesn't mangle there using the above proposed mangling. I'll ping the ARM maintainers about it.

rjmccall commented 2 years ago

Okay, thank you for the clarification. That seems to be in line with the discussion here, assuming that GCC is comfortable with the semantics break for __bf16 with prior releases on x86 targets.

jakubjelinek commented 2 years ago

https://godbolt.org/z/5x1ErMYnz as an example

jakubjelinek commented 2 years ago

And https://godbolt.org/z/cfjdhMEGa, which also shows ostream support.

jakubjelinek commented 2 years ago

Note, while the library hardcodes that std::float{16,32,64,128}_t are _Float{16,32,64,128}, for std::bfloat16_t it actually uses decltype(0.0bf16), so whether that is the same as __bf16 or some other type is something that can be decided on each arch separately.

tschuett commented 2 years ago

IIRC the standard has always been vague about exceptional behavior and especially signaling NaNs, so std::bfloat16_t having different behavior from other FP types on the same target is potentially fine.

I seem to have misread the docs about __bf16 — I had thought that it was specified to be promoted to float as an arithmetic conversion, but it actually looks like it doesn't allow arithmetic at all. The promotion to float that I was thinking of seems to just be a behavior of ARM __fp16 (but not _Float16). So we still can't just make std::bfloat16_t a typedef of __bf16, but it's for a different reason than I thought.

Exactly: https://github.com/llvm/llvm-project/issues/58465

thiagomacieira commented 1 year ago

Are _Float128 / std::float128_t really meant to be different types from __float128?

Both Clang and GCC mangle __float128 as g. Clang trunk doesn't support _Float128 yet, but GCC 13 does and mangles that as DF128_. It's also a completely separate type.

Is this meant to be?

dkolsen-pgi commented 1 year ago

Are _Float128 / std::float128_t really meant to be different types from __float128?

Yes. _Float128 / std::float128_t must follow the rules of an extended floating-point type. __float128 is not an extended floating-point type, since it predates the invention of extended floating-point types. __float128 can follow whatever rules the implementation wants, since it is a compiler extension, but I think it mostly mimics the standard floating-point types. The biggest difference between the two is in implicit conversions: conversions from __float128 to double are implicit, but conversions from _Float128 to double must be explicit.

A compiler could choose to make __float128 and _Float128 different names for the same type. But that would require a change in behavior for __float128, which might break existing code that uses __float128. I would not recommend that approach.

asb commented 1 year ago

I apologise that I'm not particularly familiar with the development process for this ABI.

The RISC-V psABI currently specifies its mangling rules just by referencing this document (i.e. we have no custom mangling). I'd like to add coverage for __bf16 to the psABI document and would like to keep up that practice, which would require a resolution to this PR.

Might it make sense to split out std::bfloat16_t so that it can be resolved separately from _FloatNx? It also seems that (depending on the resolution of this discussion) it may make sense to reference __bf16 explicitly in the text, just as this PR also references _FloatN?

codemzs commented 1 year ago

@jicama @jakubjelinek It seems DF16b is proposed as the mangling for std::bfloat16_t, but I was curious to know where we are in the process of finalizing it? Context: https://discourse.llvm.org/t/rfc-c-23-p1467r9-extended-floating-point-types-and-standard-names/70033/14?u=codemzs

joshua-arch1 commented 1 year ago

How is the mangling for std::bfloat16_t going now? Since adding ABI information for bf16, mentioned in https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/367, is blocked by the mangling rule, I think we need to finalize this patch ASAP, at least the bf16 part.

stuij commented 1 year ago

Adding another piece of coal to the fire: at Arm we are also waiting for this patch to be finalised, or at least the bfloat part of it. We eventually agreed internally that at Arm we won't be changing our current __bf16 mangling, and are happy for it to stop being storage-only and grow support for operations to match std::bfloat16_t.

I'd like to note this deviation in the ABI, but currently there's nothing to reference.

codemzs commented 1 year ago

@rjmccall @jicama Even our team at Microsoft is eagerly awaiting the completion of this patch, specifically the std::bfloat16_t mangling. We would greatly appreciate it if you could kindly update us on the current status of this process and any estimated timeline for its completion.

rjmccall commented 1 year ago

There's consensus to accept this; sorry for the delay.