BFLOAT16 support

daurnimator commented 5 years ago

BFLOAT16 is a new floating-point format. It is a 16-bit format with an 8-bit exponent and a 7-bit mantissa (versus the 5-bit exponent and 10-bit mantissa of an IEEE half-precision float, which is currently f16), designed for deep learning.
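To make the layout concrete: because bfloat16 keeps the sign bit, all 8 exponent bits, and the top 7 mantissa bits of an IEEE 754 single-precision float, a conversion can be sketched as taking the upper 16 bits of an `f32`. The following is a minimal Rust sketch, not anything from the Zig standard library; the function names are hypothetical, and round-to-nearest-even is an assumed rounding mode (plain truncation is also seen in practice).

```rust
/// Convert an f32 to bfloat16 bits (1 sign, 8 exponent, 7 mantissa bits)
/// by keeping the upper half of the f32 encoding.
/// Assumption: round to nearest, ties to even.
pub fn f32_to_bf16_bits(x: f32) -> u16 {
    let bits = x.to_bits();
    if x.is_nan() {
        // Set the top mantissa bit so a NaN whose payload lives only in
        // the discarded low bits cannot collapse into infinity.
        return ((bits >> 16) as u16) | 0x0040;
    }
    // Add 0x7FFF plus the lowest kept bit, then drop the low 16 bits.
    // This rounds to nearest and breaks exact ties towards an even result.
    let lsb = (bits >> 16) & 1;
    let rounded = bits.wrapping_add(0x7FFF + lsb);
    (rounded >> 16) as u16
}

/// Widen bfloat16 bits back to f32. This direction is exact, because
/// every bfloat16 value is also an f32 value with zeroed low mantissa bits.
pub fn bf16_bits_to_f32(b: u16) -> f32 {
    f32::from_bits((b as u32) << 16)
}
```

For example, `f32_to_bf16_bits(1.0)` yields `0x3F80`, and widening `0x4049` gives `3.140625`, which shows how few significant digits survive with a 7-bit mantissa.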

The bfloat16 format is used in upcoming Intel AI processors such as the Nervana NNP-L1000, in Xeon processors and Intel FPGAs, in Google Cloud TPUs, and in TensorFlow. Arm Neon and SVE also support the bfloat16 format.

Selected excerpts:

The Rust proposal is to call the type f16b.
The type should always have size 2 and alignment 2 on all platforms.
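The size and alignment requirement from the excerpt can be illustrated with a wrapper type whose layout is checked at compile time. This is only a Rust sketch of the idea, not the Rust proposal's actual definition and not Zig code; the name `Bf16` is hypothetical, and `repr(C, align(2))` is one assumed way to pin the alignment even on targets where `u16` is less strictly aligned.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical bfloat16 storage type: raw bits only, no arithmetic.
/// `align(2)` requests at least 2-byte alignment regardless of target.
#[repr(C, align(2))]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Bf16(pub u16);

// Compile-time layout checks: the build fails if either one is violated.
const _: () = assert!(size_of::<Bf16>() == 2);
const _: () = assert!(align_of::<Bf16>() == 2);
```

With these checks in place, an array such as `[Bf16; 4]` occupies exactly 8 bytes with no padding between elements.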

As a more general issue: how should we add new numeric types going forward (e.g. Unum)? Since Zig does not support operator overloading, such types would have to be provided by the core language to be ergonomic to use.

msingle commented 4 years ago

Also, .NET 5 will have a Half type.

marnix commented 4 years ago

As a type naming proposal, perhaps f16_7, i.e. use the number of mantissa/fraction bits as the suffix? Rationale: less precision -> lower number.

| Short name | Long name | Description |
|---|---|---|
| f16 | f16_10 | IEEE 754 half-precision 16-bit float / .NET Half type |
| f32 | f32_23 | IEEE 754 single-precision 32-bit float |
| f64 | f64_52 | IEEE 754 double-precision 64-bit float |
| (none?) | f16_7 | bfloat16 |
| ? | f19_10 | NVIDIA's TensorFloat |
| ? | f24_16 | AMD's fp24 format |
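The naming scheme in the table follows mechanically from the bit counts: total width is 1 sign bit plus exponent plus mantissa bits, and the suffix is the mantissa width. A small Rust sketch of that arithmetic is below. The `FloatLayout` type and its methods are hypothetical, the exponent widths for TensorFloat (8) and fp24 (7) are inferred from the total and mantissa widths in the table, and `max_finite` assumes an IEEE-style encoding in which the all-ones exponent is reserved for infinity and NaN, which may not hold for every vendor format.

```rust
/// A binary float layout: 1 sign bit plus explicit exponent and
/// mantissa (fraction) bit counts.
#[derive(Clone, Copy, Debug)]
pub struct FloatLayout {
    pub exponent_bits: u32,
    pub mantissa_bits: u32,
}

impl FloatLayout {
    /// Total storage width in bits, including the sign bit.
    pub const fn total_bits(self) -> u32 {
        1 + self.exponent_bits + self.mantissa_bits
    }

    /// Long name in the proposed scheme, e.g. "f16_7" for bfloat16.
    pub fn long_name(self) -> String {
        format!("f{}_{}", self.total_bits(), self.mantissa_bits)
    }

    /// Gap between 1.0 and the next representable value: 2^-mantissa_bits.
    pub fn epsilon(self) -> f64 {
        2f64.powi(-(self.mantissa_bits as i32))
    }

    /// Largest finite value, assuming an IEEE-style exponent encoding.
    pub fn max_finite(self) -> f64 {
        let bias = (1i32 << (self.exponent_bits - 1)) - 1;
        (2.0 - self.epsilon()) * 2f64.powi(bias)
    }
}

pub const F16: FloatLayout = FloatLayout { exponent_bits: 5, mantissa_bits: 10 };
pub const BF16: FloatLayout = FloatLayout { exponent_bits: 8, mantissa_bits: 7 };
pub const TF32: FloatLayout = FloatLayout { exponent_bits: 8, mantissa_bits: 10 };
pub const FP24: FloatLayout = FloatLayout { exponent_bits: 7, mantissa_bits: 16 };
```

Under these assumptions, f16 tops out at 65504 while bfloat16 reaches roughly the same range as f32, at the cost of a much coarser epsilon (2^-7 versus 2^-10), which matches the "less precision -> lower number" rationale.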

tgschultz commented 4 years ago

We could do what we do with integer types and allow the creation of arbitrary exponent/mantissa bitcount float types on demand.
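As a rough illustration of what parameterizing a float by exponent and mantissa widths could mean, here is a Rust sketch that decodes raw bits for any such pair using const generics. This is not the proposed Zig feature, only an analogy: the `decode` function is hypothetical, it assumes IEEE-style semantics (biased exponent, implicit leading 1, subnormals, all-ones exponent for infinity/NaN), and it assumes the whole format fits in 32 bits with at least one exponent bit.

```rust
/// Decode the raw bits of a float with E exponent bits and M explicit
/// mantissa bits into an f64, assuming IEEE-style encoding rules.
/// Assumes E >= 1 and 1 + E + M <= 32.
pub fn decode<const E: u32, const M: u32>(bits: u32) -> f64 {
    let sign = if (bits >> (E + M)) & 1 == 1 { -1.0 } else { 1.0 };
    let exp = (bits >> M) & ((1u32 << E) - 1);
    let frac = bits & ((1u32 << M) - 1);
    let bias = (1i32 << (E - 1)) - 1;
    // Fraction field scaled into [0, 1).
    let frac_f = frac as f64 / (1u64 << M) as f64;

    if exp == (1u32 << E) - 1 {
        // All-ones exponent: infinity if the fraction is zero, else NaN.
        if frac == 0 { sign * f64::INFINITY } else { f64::NAN }
    } else if exp == 0 {
        // Zero or subnormal: no implicit leading 1, fixed exponent 1 - bias.
        sign * frac_f * 2f64.powi(1 - bias)
    } else {
        // Normal number: implicit leading 1.
        sign * (1.0 + frac_f) * 2f64.powi(exp as i32 - bias)
    }
}
```

With this, `decode::<5, 10>` reads an f16 bit pattern, `decode::<8, 7>` reads bfloat16, and `decode::<8, 10>` reads a 19-bit TensorFloat pattern, all from the same code.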

daurnimator commented 4 years ago

Apparently Arm Neoverse V1 will be getting BFLOAT16 support: https://fuse.wikichip.org/news/4564/arm-updates-its-neoverse-roadmap-new-bfloat16-sve-support/

Mouvedia commented 4 years ago

If you do, also add BFLOAT19, also known as TF32. If we are following the Rust naming convention, that would be f19b.

zigazeljko commented 4 years ago

LLVM 11 added support for bfloat16: https://llvm.org/docs/LangRef.html#floating-point-types

ziglang / zig

BFLOAT16 support #3148