[API Proposal]: BFloat16

iamcarbon commented 9 months ago

Background and motivation

The bfloat16 type provides the same number range as the 32-bit IEEE 754 single-precision floating point type, but with reduced precision (24 bits -> 8 bits). This is useful for machine learning to improve memory utilization, and can be used to accelerate AI workloads via AVX-512 BF16 and ARMv8.6-A instructions.

Adding this type would allow us to implement these new instruction sets, and provide a common base type for various machine learning libraries.
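To make the "same range, reduced precision" point concrete, here is a minimal sketch of the bit layout: 1 sign bit, 8 exponent bits (the same width and bias as float), and 7 explicit mantissa bits. The class and method names are hypothetical and exist only for illustration; they are not part of the proposed API.

```csharp
using System;

public static class BFloat16Layout
{
    // Splits a raw 16-bit bfloat16 pattern into its three fields.
    public static (int Sign, int Exponent, int Mantissa) Decompose(ushort bits)
    {
        int sign = (bits >> 15) & 0x1;      // 1 bit
        int exponent = (bits >> 7) & 0xFF;  // 8 bits, bias 127, same as float
        int mantissa = bits & 0x7F;         // 7 explicit bits (8 with the implicit leading 1)
        return (sign, exponent, mantissa);
    }

    // Returns the upper 16 bits of a float's bit pattern, which share the
    // sign and exponent fields with bfloat16.
    public static ushort TopHalfOfSingle(float value)
        => (ushort)((uint)BitConverter.SingleToInt32Bits(value) >> 16);
}
```

For example, `1.0f` has the bit pattern `0x3F800000`, so its upper half `0x3F80` decomposes to sign 0, exponent 127, mantissa 0.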

API Proposal

namespace System.Numerics
{
    public readonly struct BFloat16
      : IComparable,
        IComparable<BFloat16>,
        IEquatable<BFloat16>
    {
        public static BFloat16 Epsilon { get; }
        public static BFloat16 MinValue { get; }
        public static BFloat16 MaxValue { get; }

        // Casting
        public static explicit operator BFloat16(float value);
        public static explicit operator BFloat16(double value);
        public static explicit operator float(BFloat16 value);
        public static explicit operator double(BFloat16 value);

        // Comparison
        public int CompareTo(object value);
        public int CompareTo(BFloat16 value);
        public static bool operator ==(BFloat16 left, BFloat16 right);
        public static bool operator !=(BFloat16 left, BFloat16 right);
        public static bool operator <(BFloat16 left, BFloat16 right);
        public static bool operator >(BFloat16 left, BFloat16 right);
        public static bool operator <=(BFloat16 left, BFloat16 right);
        public static bool operator >=(BFloat16 left, BFloat16 right);

        // Equality
        public bool Equals(BFloat16 obj);
        public override bool Equals(object? obj);
        public override int GetHashCode();

        // ToString override
        public override string ToString();
    }
}

API Usage

BFloat16 bf16 = (BFloat16)1.0f;

Alternative Designs

No response

Risks

No response

ghost commented 9 months ago

Tagging subscribers to this area: @dotnet/area-system-numerics See info in area-owners.md if you want to be subscribed.

Issue Details

### Background and motivation

The bfloat16 type provides the same number range as the 32-bit IEEE 754 single-precision floating point type, but with reduced precision (24 bits -> 8 bits). This is useful for machine learning to improve memory utilization, and can be used to accelerate AI workloads via AVX-512 BF16 and ARMv8.6-A instructions.

Adding this type would allow us to implement these new instruction sets, and provide a common base type for various machine learning libraries.

### API Proposal

```csharp
namespace System
{
    public readonly struct BFloat16
      : IComparable,
        IFormattable,
        IComparable<BFloat16>,
        IEquatable<BFloat16>,
        IConvertible,
        ISpanFormattable,
        IUtf8SpanFormattable
    {
        public static readonly BFloat16 MinValue;
        public static readonly BFloat16 MaxValue;

        public static bool IsNegative(BFloat16 h);

        public static BFloat16 Parse(string s);
        public static BFloat16 Parse(string s, NumberStyles style);
        public static BFloat16 Parse(string s, NumberStyles style, IFormatProvider provider);
        public static BFloat16 Parse(string s, IFormatProvider provider);
        public static BFloat16 Parse(ReadOnlySpan<char> s);
        public static BFloat16 Parse(ReadOnlySpan<char> s, NumberStyles style);
        public static BFloat16 Parse(ReadOnlySpan<char> s, IFormatProvider provider);
        public static BFloat16 Parse(ReadOnlySpan<char> s, NumberStyles style, IFormatProvider provider);

        public bool TryFormat(Span<char> destination, out int charsWritten, ReadOnlySpan<char> format, IFormatProvider provider);

        public static bool TryParse(string s, out BFloat16 result);
        public static bool TryParse(string s, NumberStyles style, IFormatProvider provider, out BFloat16 result);
        public static bool TryParse(ReadOnlySpan<char> s, out BFloat16 result);
        public static bool TryParse(ReadOnlySpan<char> s, NumberStyles style, IFormatProvider provider, out BFloat16 result);

        public int CompareTo(object value);
        public int CompareTo(BFloat16 value);
        public bool Equals(BFloat16 obj);
        public override bool Equals(object obj);
        public override int GetHashCode();
        public TypeCode GetTypeCode();

        public string ToString(IFormatProvider provider);
        public string ToString(string format);
        public string ToString(string format, IFormatProvider provider);
        public override string ToString();

        public static explicit operator BFloat16(float value);
        public static explicit operator float(BFloat16 value);

        public static bool operator ==(BFloat16 left, BFloat16 right);
        public static bool operator !=(BFloat16 left, BFloat16 right);
        public static bool operator <(BFloat16 left, BFloat16 right);
        public static bool operator >(BFloat16 left, BFloat16 right);
        public static bool operator <=(BFloat16 left, BFloat16 right);
        public static bool operator >=(BFloat16 left, BFloat16 right);
    }
}
```

### API Usage

```csharp
BFloat16 bf16 = 1.0f;
```

### Alternative Designs

_No response_

### Risks

_No response_

Author:	iamcarbon
Assignees:	-
Labels:	`api-suggestion`, `area-System.Numerics`, `untriaged`
Milestone:	-

MichalPetryka commented 9 months ago

This should probably expose the whole API surface that Half has, including all the operators like addition and such even if they're not accelerated by most hardware.

iamcarbon commented 9 months ago

@MichalPetryka Updated to implement the IFloatingPoint interface, along with its operators. These can likely also forward to MathF / float, like Half by default.

MichalPetryka commented 9 months ago

Updated to implement the IFloatingPoint interface, along with its operators

You've missed IMinMaxValue<BFloat16>, which Half has.

iamcarbon commented 9 months ago

The proposal has been updated to include the IMinMaxValue interface. Note: the API is limited to public members. There are various INumber and IFloatingPoint members that are not listed, but will need explicit implementations to participate in the generic math system. @MichalPetryka Let me know if you spot any other missing public members.

MichalPetryka commented 9 months ago

The proposal has been updated to include the IMinMaxValue interface. Note: the API is limited to public members. There are various INumber and IFloatingPoint members that are not listed, but will need explicit implementations to participate in the generic math system. @MichalPetryka Let me know if you spot any other missing public members.

The Unary Negation Operators section seems to include the unary plus.

iamcarbon commented 9 months ago

Unary Negation Operators seems to have the unary plus.

Fixed.

huoyaoyuan commented 9 months ago

This should probably expose the whole API surface that Half has, including all the operators like addition and such even if they're not accelerated by most hardware.

I don't think mathematical functions should be implemented. They are likely not supported by hardware, nor required by any specification. The first version of Half in .NET 5 is only a transport type, with no IEEE 754 functions implemented.

I'd expect it to implement only conversion operators, and basic arithmetic operators only:

// comparable, equatable, parsing and formatting omitted
IMinMaxValue
IBinaryNumber
IFloatingPoint

iamcarbon commented 9 months ago

I believe there's still value implementing the Trigonometric & Hyperbolic functions as this type maintains the full Float32 range.

Converting a BFloat16 to a Single can also be done in a few shift operations. This operation is much slower on the Half type.

public unsafe static float BFloat16ToSingle(ushort bfloat16)
{
    int f32Value  = 
        (bfloat16 & 0x8000) << 16 |                      // sign bit
        ((bfloat16 & 0x7FFF) + 0x1C000) << 13; // exponent and mantissa

    return *(float*)&f32Value;
}

ARM also provides the accelerated BFCVT instruction to convert a Single back to a BFloat16.

However, I agree they are non-essential.

MichalPetryka commented 9 months ago

I don't think mathematical functions should be implemented. They are likely not supported by hardware, nor required by any specification. The first version of Half in .NET 5 is only a transport type, with no IEEE 754 functions implemented.

I think it's worth noting that the proposed API surface isn't necessarily the one that's initially implemented, as was noted in #81376. As such, I think that unless the decision is to never add the full set of operations (which seems unlikely since hardware is already starting to expose them), API review should see the final surface during review, even if its implementation would be partial initially.

Let me know if you spot any other missing public members.

Diffing with Half seems to still show some missing members.

colejohnson66 commented 9 months ago

public unsafe static float BFloat16ToSingle(ushort bfloat16)
{
    int f32Value  = 
        (bfloat16 & 0x8000) << 16 |                      // sign bit
        ((bfloat16 & 0x7FFF) + 0x1C000) << 13; // exponent and mantissa

    return *(float*)&f32Value;
}

This seems needlessly complicated to read, and generates worse codegen than needed. A bfloat16 is just a truncated binary32:

public static float BFloat16ToBinary32(ushort value)
{
    uint temp = (uint)value << 16;
    return Unsafe.As<uint, float>(ref temp);
}

tannergooding commented 9 months ago

API review should see the final surface during review

This isn't important to API review. The potential for operators to be added later is generally not a major consideration in the exposure of a type. We almost never know the "full" surface area, and while it might be relevant to consider whether additional APIs are planned, they really only limit the ability to cleanly implement/expose the initial surface.

This type is not really a core/common type and isn't even strictly "well spec'd" in the same way the IEEE 754 types are. It likely should exist in the System.Numerics namespace (much as the new Decimal32/64/128 types will be).

It should initially only cover itself as a minimal interchange type with the relevant conversion APIs. That is going to be the 99% use case and is the only case that will be hardware accelerated for the near future. I'm fine with separately considering the expansion of this to support the full set of IBinaryFloatingPointIeee754<T> members, but that should be split out and separate from the mainline consideration. Such members would only be convenience APIs that upcast to float, do the operation, and downcast back to bfloat16, and in many cases they would be the less efficient way to operate on the data (typical usage in AI/ML/GPU is to upcast a vector's worth of these values, operate on them as float end to end, and then downcast when storing back to memory/disk).
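The "upcast, operate as float end to end, downcast once" pattern described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it works on raw `ushort` bit patterns because the `BFloat16` type does not exist yet, all names are hypothetical, and the final downcast uses plain truncation.

```csharp
using System;

public static class BFloat16Bulk
{
    // Hypothetical helper: widen a raw bfloat16 pattern to float (lossless).
    static float ToSingle(ushort bits) => BitConverter.Int32BitsToSingle(bits << 16);

    // Hypothetical helper: narrow a float to a raw bfloat16 pattern by truncation.
    static ushort FromSingleTruncating(float value)
        => (ushort)((uint)BitConverter.SingleToInt32Bits(value) >> 16);

    // Dot product: every intermediate value stays in float, and only the
    // final result is narrowed back to 16 bits.
    public static ushort Dot(ReadOnlySpan<ushort> x, ReadOnlySpan<ushort> y)
    {
        if (x.Length != y.Length)
            throw new ArgumentException("Spans must have the same length.");

        float sum = 0f;
        for (int i = 0; i < x.Length; i++)
        {
            sum += ToSingle(x[i]) * ToSingle(y[i]);
        }
        return FromSingleTruncating(sum);
    }
}
```

A per-element bfloat16 `operator *` and `operator +` would instead round after every step, which is both slower and less accurate than rounding once at the end.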

A bfloat16 is just a truncated binary32

Notably this is not universally true. It was initially introduced using truncation, but there are a number of different hardware implementations nowadays and some use ties to even (IEEE 754 default, which Google TPU uses) or round to odd (ARM), etc.

We should likely default to truncation, but it's possible we need additional APIs to support other rounding modes.
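To illustrate how the rounding modes differ, here is a minimal sketch comparing truncation with round-to-nearest, ties-to-even when narrowing a float to a raw bfloat16 bit pattern. The names are hypothetical, and the bias-add trick shown is one commonly used technique, not necessarily what a .NET implementation would use.

```csharp
using System;

public static class BFloat16Rounding
{
    // Truncation: drop the low 16 bits of the float.
    public static ushort Truncate(float value)
    {
        uint bits = (uint)BitConverter.SingleToInt32Bits(value);
        return (ushort)(bits >> 16);
    }

    // Round to nearest, ties to even (the IEEE 754 default rounding mode).
    public static ushort RoundToNearestEven(float value)
    {
        uint bits = (uint)BitConverter.SingleToInt32Bits(value);
        if (float.IsNaN(value))
        {
            // Force a mantissa bit so a NaN payload living only in the low
            // 16 bits cannot collapse into infinity.
            return (ushort)((bits >> 16) | 0x0040);
        }

        // Add 0x7FFF plus the lowest kept bit: exact ties round to the
        // neighbor whose lowest kept bit is 0.
        uint roundingBias = 0x7FFFu + ((bits >> 16) & 1u);
        return (ushort)((bits + roundingBias) >> 16);
    }
}
```

For the float with bit pattern `0x3F818000` (exactly halfway between two bfloat16 values), truncation yields `0x3F81` while ties-to-even yields `0x3F82`.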

iamcarbon commented 9 months ago

@tannergooding Thanks for the comments! I updated the proposal to use the System.Numerics namespace and scaled back the surface area to be used as a minimal interchange type.

tannergooding commented 9 months ago

These should notably be properties, since it's a trivial constant over a value type and can avoid the static initializer:

public static BFloat16 Epsilon { get; }
public static BFloat16 MinValue { get; }
public static BFloat16 MaxValue { get; }

We also need the conversion from double for parity:

public static explicit operator BFloat16(double value);

colejohnson66 commented 9 months ago

Does it make sense to require explicit upcasting to float and double as all bfloat16s are perfectly representable as binary32 and binary64?

tannergooding commented 9 months ago

Implicit casts can introduce potential versioning concerns and so it depends a bit. It will likely be a discussion point in the API review.

bartonjs commented 7 months ago

Video

Looks good as proposed. Also with whatever level of generic math (and public visibility thereof) is appropriate. (IFloatingPointIeee754<BFloat16>, most probably)

namespace System.Numerics
{
    public readonly struct BFloat16
      : IComparable,
        IComparable<BFloat16>,
        IEquatable<BFloat16>
    {
        public static BFloat16 Epsilon { get; }
        public static BFloat16 MinValue { get; }
        public static BFloat16 MaxValue { get; }

        // Casting
        public static explicit operator BFloat16(float value);
        public static explicit operator BFloat16(double value);
        public static explicit operator float(BFloat16 value);
        public static explicit operator double(BFloat16 value);

        // Comparison
        public int CompareTo(object value);
        public int CompareTo(BFloat16 value);
        public static bool operator ==(BFloat16 left, BFloat16 right);
        public static bool operator !=(BFloat16 left, BFloat16 right);
        public static bool operator <(BFloat16 left, BFloat16 right);
        public static bool operator >(BFloat16 left, BFloat16 right);
        public static bool operator <=(BFloat16 left, BFloat16 right);
        public static bool operator >=(BFloat16 left, BFloat16 right);

        // Equality
        public bool Equals(BFloat16 obj);
        public override bool Equals(object? obj);
        public override int GetHashCode();

        // ToString override
        public override string ToString();
    }
}

huoyaoyuan commented 7 months ago

Which assembly should it belong to? Should it be in S.R.Numerics like Complex?

Since there is hardware acceleration for it, it should likely be in CoreLib.

Neme12 commented 6 months ago

Shouldn't it be called BHalf, since there's Half, Single & Double as opposed to Float16, Float32 and Float64?

Neme12 commented 6 months ago

        // Casting
        public static explicit operator BFloat16(float value);
        public static explicit operator BFloat16(double value);
        public static explicit operator float(BFloat16 value);
        public static explicit operator double(BFloat16 value);

Correct me if I'm wrong, but isn't it the case that every BFloat16 can be losslessly converted to a float and double? If so, why aren't those conversions implicit? Also, aren't conversions to and from Half needed as well?

Neme12 commented 6 months ago

Also, for those conversions that are not lossless, shouldn't there be checked and unchecked versions?

Neme12 commented 6 months ago

Implicit casts can introduce potential versioning concerns and so it depends a bit. It will likely be a discussion point in the API review.

What are those versioning concerns? It's a little unfortunate to have those be explicit not only because you have to add a cast, but because the conversion being explicit makes me (and I assume others as well) think that it cannot be safely converted, when in fact it can be. It's really counterintuitive for them to be explicit.

tannergooding commented 6 months ago

Shouldn't it be called BHalf, since there's Half, Single & Double as opposed to Float16, Float32 and Float64?

No, the industry standard names for the types are BFloat16, Half, Single, and Double. The "spec" names are brain float16, binary16, binary32, and binary64.

Also, for those conversions that are not lossless, shouldn't there be checked and unchecked versions?

Checked vs unchecked normally only exist where a conversion can throw. Floating-point conversions never throw and have 1 strictly defined behavior, which is round to nearest representable.

You theoretically could expose the optional IEEE 754 support for raising an "inexact exception", but that throws for almost every operation you can imagine, even 1 / 10 or 0.1 + 0.2 results in an inexact result (even when accounting for the actual underlying values represented not being 0.1 and 0.2).

What are those versioning concerns?

Language primitive types get special handling and precedence for conversions. There are many cases where this can negatively impact overload resolution either by new ambiguities caused by new implicit conversions or by the wrong overload being silently selected.

A simple example is if you have double M(double x) and call double x = M(5) it will call the only overload. However, if you then expose float M(float x), the call will now call the overload that takes float and silently upcast the result back to double, so not only do you have a change in precision (which for large int is potentially lossy when cast to float), but it is a silent change in precision due to the upcast of the float result back to double.

Similar issues exist when introducing new APIs around Half or BFloat16 where they have implicit conversions to float (or other primitive types) and especially if they have any implicit conversions from other primitive types. For that reason, we explicitly made the casts on Half explicit and made a similar decision for BFloat16, as it avoids an entire class of issues and helps make the operation that much more explicit.
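The M(5) example above can be reproduced with a small sketch. The nested classes V1 and V2 are hypothetical stand-ins for two versions of the same library, before and after the float overload is added; the call site is identical in both cases.

```csharp
using System;

public static class OverloadDemo
{
    // "Version 1" of a library: only the double overload exists.
    public static class V1
    {
        public static double M(double x) => x;
    }

    // "Version 2": a float overload has been added.
    public static class V2
    {
        public static double M(double x) => x;
        public static float M(float x) => x;
    }

    // 16777217 (2^24 + 1) is exactly representable as double but not as float.
    public static double CallV1() => V1.M(16777217);

    // Overload resolution now prefers M(float) for an int argument, and the
    // float result is silently widened back to double.
    public static double CallV2() => V2.M(16777217);
}
```

`CallV1()` returns 16777217, while `CallV2()` returns 16777216: the source at the call site did not change, but the value did.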

Neme12 commented 6 months ago

Checked vs unchecked normally only exist where a conversion can throw. Floating-point conversions never throw and have 1 strictly defined behavior, which is round to nearest representable.

Wait, uh? 😟 I assumed until now that in a checked context, if I cast a numeric type and the value can't fit into the new type, it throws. Now I could have bugs in my code I guess :/ But thanks for letting me know.

A simple example is if you have double M(double x) and call double x = M(5) it will call the only overload. However, if you then expose float M(float x), the call will now call the overload that takes float and silently upcast the result back to double, so not only do you have a change in precision (which for large int is potentially lossy when cast to float), but it is a silent change in precision due to the upcast of the float result back to double.

This seems like an argument against all implicit conversions altogether. But the language has them and people are used to them. So it seems weird that some numeric types would have them and others would not, for a reason that applies to all of them.

If they were really so bad, why would they exist in the language? For one reason or another, the language designers made the call that they exist and that numeric types have them, so I feel like we should follow that to be consistent. I get the argument about being explicit about things, but it's still weird for them to be explicit, as it makes me think: wait, this is dangerous and needs extra scrutiny, because an explicit cast can mean either an exception or a loss of precision. When in fact there can be neither, and it's completely safe.

I wish there was a special syntax for conversions that made you be explicit about them, just like explicit conversions, but would only allow conversions that are "implicit"/safe. But there isn't :(

For better or worse, we have what we have in the language, and people (including me) have gotten used to it, so I still feel like there should be consistency instead of banishing certain language features that we don't like from new code. They're used all over the place in existing code and always will be: there'll always be implicit conversions for the built-in types and other existing types that have them, and there'll always be this weird inconsistency that makes people stop and wonder why it's there. I just associate explicit conversions with conversions that aren't safe, because if they were safe, they would be implicit. That's the way it has always been (apart from that one mistake of int and float).

Neme12 commented 6 months ago

If this is really the decision for all conversions to be explicit going forward regardless of whether they're safe or not, please, at least add doc comments and documentation pages for those conversions saying whether they are actually safe or not.

tannergooding commented 6 months ago

Wait, uh? 😟 I assumed until now that in a checked context, if I cast a numeric type and the value can't fit into the new type, it throws. Now I could have bugs in my code I guess :/ But thanks for letting me know.

Checked has always really pertained to overflow/underflow and not necessarily towards "representable". The simplest example is that checked(5 / 2) does not throw, it simply returns 2 even though the actual answer of 2.5 is not representable.

Likewise checked((float)double.MaxValue) does not throw because the specification requires it take the value as given, perform the operation as if to infinite precision and unbounded range, and then round to the nearest representable result. For float, this happens to be PositiveInfinity which is a representable value and therefore it does not throw.

Floating-point to integer conversions do throw for checked if the value can't be represented, as that would overflow. Integer to floating-point conversions do not, even though many inputs will result in a loss of precision.
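The checked behaviors described above can be sketched as follows. The class and method names are hypothetical; the conversions take non-constant arguments so that they are evaluated at run time rather than folded by the compiler.

```csharp
using System;

public static class CheckedDemo
{
    // Integer division is exact by definition: no overflow, so no exception.
    public static int IntegerDivision(int left, int right) => checked(left / right);

    // double -> float rounds to the nearest representable value, which may
    // be an infinity; checked does not change that.
    public static float NarrowDouble(double value) => checked((float)value);

    // int -> float may lose precision but never throws.
    public static float WidenInt(int value) => checked((float)value);

    // double -> int throws OverflowException in a checked context when the
    // value is out of range.
    public static bool DoubleToIntThrows(double value)
    {
        try
        {
            _ = checked((int)value);
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }
}
```

Here `NarrowDouble(double.MaxValue)` yields `float.PositiveInfinity`, `WidenInt(16777217)` yields 16777216, and only `DoubleToIntThrows(1e10)` observes an exception.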

This seems like an argument against all implicit conversions altogether.

In some ways, yes. There are many languages that explicitly do not provide implicit conversions because of these issues.

But the language has them and people are used to them. So it seems weird that some numeric types would have them and others would not, for a reason that applies to all of them.

Yes, and so our decision on whether to use implicit conversions or not is based around the likelihood people will run into issues/pits of failure.

There are many cases where implicit conversions are good and where we would expose them for new types; this just doesn't happen to be one of them due to it being a more esoteric user-defined type that needs to interplay with multiple built-in types (which have special conversion precedence rules) and being used in scenarios where a new overload causing a silent loss of precision could be both easily missed and have a large negative impact were it to make it to production.

That is to say, we don't only make the decision to expose implicit conversions based on whether or not something is lossless. We have to also account for how that is likely to be used or impact other existing overloads, especially for more common types, and how likely it is to be exposed as an overload for those other types. This case has both of those as fairly likely, especially in domains where the combination of perf and precision are often competing with each other.

We can always expose the implicit conversions later given enough feedback, but we can't take them away once they are exposed. So defaulting to explicit here is the better/safer option and won't be overly negative, particularly given the primary domains are going to involve using vectors and require explicit conversions anyways.

Neme12 commented 6 months ago

Checked has always really pertained to overflow/underflow and not necessarily towards "representable". The simplest example is that checked(5 / 2) does not throw, it simply returns 2 even though the actual answer of 2.5 is not representable.

Right, but I would consider converting an int that's outside of the range of short, to short, to be an overflow. Isn't it?

The simplest example is that checked(5 / 2) does not throw

I guess I wouldn't consider that to be an overflow, I wouldn't expect that to throw as that's what integer division is defined as. But I would consider casting a double to a float that's too large for a float to be a kind of overflow.

But thanks for letting me know about the semantics (or lack thereof) of checked and floating point numbers. I guess I have to be careful and write my own utilities for floating point conversions that are really actually checked.

EDIT: Oh, float.CreateChecked isn't checked either 😲 damn.

tannergooding commented 6 months ago

I guess I wouldn't consider that to be an overflow, I wouldn't expect that to throw as that's what integer division is defined as. But I would consider casting a double to a float that's too large for a float to be a kind of overflow.

Of sorts, but that's the intent of PositiveInfinity and NegativeInfinity. They exist to represent a value that overflowed past the finite range. It's overall more performant and avoids needing to check every single operation, while still propagating the relevant information such that checking once at the end of the algorithm is typically sufficient. And most importantly, it allows float/double to represent values and arithmetic operations that are critical for scientific applications, games, machine learning, and in general higher level mathematics. NaN and Negative Zero exist for much the same reason: to represent values that escape the "real" number domain or which round towards zero but may have actually been less than Epsilon.

It really just falls out that there is no value that can overflow, because it's always representable as infinity, which is unlike integers which can only represent finite values.
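The "check once at the end" idea can be sketched like this. The names are hypothetical, and the example only assumes standard IEEE 754 behavior for float arithmetic: an overflowing intermediate becomes infinity and keeps propagating through later operations.

```csharp
using System;

public static class OverflowPropagation
{
    // No per-operation checks: an overflowing term turns the running sum
    // into +Infinity, and that state survives every later addition.
    public static float SumOfSquares(ReadOnlySpan<float> values)
    {
        float sum = 0f;
        foreach (float v in values)
        {
            sum += v * v;
        }
        return sum;
    }

    // A single check after the whole computation is enough to detect it.
    public static bool TrySumOfSquares(ReadOnlySpan<float> values, out float result)
    {
        result = SumOfSquares(values);
        return float.IsFinite(result);
    }
}
```

For instance, `TrySumOfSquares(new[] { 1e30f, 1f }, out _)` returns false because `1e30f * 1e30f` exceeds the finite float range, while `new[] { 3f, 4f }` returns true with a result of 25.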
