Open rainyl opened 2 months ago
Summary: This issue proposes adding Float16 support to dart:ffi to enable direct interaction with native libraries that use Float16, improving efficiency and convenience for developers working with AI-related computations.
It's also technically possible to support a Float16List in dart:typed_data, but if there are no operations to convert single 16-bit floats to double (a very quick check suggests that to be the case at least for Intel/AMD CPUs), reading and writing would likely not be as efficient as expected. (A "clever" implementation may convert a number of values at a time and cache the results, so consecutive reads can be optimized. Writing is harder.)
What about just making Float16List an alias of Uint16List, so that a Float16List is actually stored as a Uint16List, but converts to a Dart double when getting values and back to a Uint16 when setting values?
but if there are no operations to convert single 16-bit floats to double (a very quick check suggests that to be the case at least for Intel/AMD CPUs), reading and writing would likely not be as efficient as expected.
I did find this:
I'm not sure if using machine instructions from float16->float32->float64 is slower or faster than going from float16->float64 manually.
If users need this and are going to manually use slow conversions to doubles anyway, then we might as well make their lives easier and add it in dart:ffi and dart:typed_data.
@rainyl do you want to use the float16's as doubles in Dart? Or are your use cases only about efficiently shuffling bytes around?
What about just making Float16List an alias of Uint16List, so that a Float16List is actually stored as a Uint16List, but converts to a Dart double when getting values and back to a Uint16 when setting values?
Any XXXList is stored as bytes and only converted when reading/writing values! 😄
If you just want to shuffle bytes around, you don't want to do it via something that requires conversions when reading and writing. So you'd want to use setRange on a TypedData with another TypedData that has the same element type, so that it can be a memcpy.
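A minimal sketch of that byte-shuffling case, assuming the float16 payload is kept as raw 16-bit words in a Uint16List. The function name copyHalfPrecisionPayload is hypothetical, and whether the copy actually lowers to a memcpy is up to the implementation:

```dart
import 'dart:typed_data';

/// Copies raw 16-bit float payloads without decoding them.
/// Source and destination share the element type (Uint16), so no
/// per-element conversion is needed.
Uint16List copyHalfPrecisionPayload(Uint16List source) {
  final Uint16List destination = Uint16List(source.length);
  destination.setRange(0, source.length, source);
  return destination;
}
```

The values stay opaque bit patterns here; nothing is ever turned into a Dart double.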
If we add Half as a valid float NativeType, then we also need to implement it in FFI calls. It looks like this is not well-defined for all ABIs:
So it might be tricky to fully add Half everywhere in dart:ffi. (Though I'd definitely be open to someone trying.)
cc @mraleph @mkustermann @rmacnak-google
do you want to use the float16's as doubles in Dart? or are your use cases only about efficiently shuffling bytes around?
Both, the ideal use cases are very similar to other native types, but the most important for my project now is creating a view of Float16List and providing a proper way to get/set values. Now I can regard Uint16List as Float16List, but it's not elegant if users want to get/set values.
// Currently I interact with native float16 using:
final ffi.Pointer<ffi.Uint16> ptr = ...;
final Uint16List view = ptr.asTypedList(length);
// Without Float16List, users have to set/get values via:
final double val = fp16_int_to_double(view[0]);
view[0] = fp16_double_to_int(val);
// `fp16_int_to_double` and `fp16_double_to_int` are implemented referring to https://github.com/opencv/opencv/blob/71d3237a093b60a27601c20e9ee6c3e52154e8b1/modules/core/include/opencv2/core/cvdef.h#L828-L917
// It would be more user-friendly if users could set/get using Dart doubles directly, maybe something like:
final ffi.Pointer<ffi.Float16> ptr = ...;
final Float16List view = ptr.asTypedList(length);
// With Float16List, users can set/get values via:
final double val = view[0];
view[0] = val;
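For reference, a minimal pure-Dart sketch of what such manual conversion helpers might look like. The names float16BitsToDouble and doubleToFloat16Bits are hypothetical, and this is not the OpenCV-derived code linked above; it assumes IEEE 754 binary16 (1 sign bit, 5 exponent bits, 10 mantissa bits) and round-to-nearest-even when narrowing:

```dart
import 'dart:typed_data';

// Scratch buffer used only to read the exponent bits of a double.
final ByteData _scratch = ByteData(8);

/// Decodes IEEE 754 binary16 bits (e.g. one Uint16List element) to a double.
double float16BitsToDouble(int bits) {
  final bool negative = (bits & 0x8000) != 0;
  final int exponent = (bits >> 10) & 0x1F;
  final int mantissa = bits & 0x3FF;
  double magnitude;
  if (exponent == 0) {
    // Zero or subnormal: mantissa * 2^-24.
    magnitude = mantissa / 16777216.0;
  } else if (exponent == 0x1F) {
    magnitude = mantissa == 0 ? double.infinity : double.nan;
  } else {
    // Normal: (1024 + mantissa) * 2^(exponent - 25).
    magnitude = (1024 + mantissa) * (1 << exponent) / 33554432.0;
  }
  return negative ? -magnitude : magnitude;
}

/// Encodes a double as binary16 bits, rounding to nearest even.
int doubleToFloat16Bits(double value) {
  if (value.isNaN) return 0x7E00; // one quiet NaN pattern
  final int sign = value.isNegative ? 0x8000 : 0;
  final double magnitude = value.abs();
  // 65520 is halfway between the largest finite half (65504) and 2^16.
  if (magnitude >= 65520.0) return sign | 0x7C00;
  // Unbiased exponent of the double, clamped to the half subnormal range.
  _scratch.setFloat64(0, magnitude);
  int e = ((_scratch.getUint32(0) >> 20) & 0x7FF) - 1023;
  if (e < -14) e = -14;
  // Scale so one unit equals one half-precision ulp: magnitude * 2^(10 - e).
  final double scaled = magnitude * (1 << (15 - e)) / 32.0;
  int m = scaled.floor();
  final double fraction = scaled - m;
  if (fraction > 0.5 || (fraction == 0.5 && m.isOdd)) m++;
  // For normals m is in [1024, 2048], so adding it also sets (or carries
  // into) the exponent field; for subnormals the exponent field stays 0.
  return sign | (((e + 14) << 10) + m);
}
```

With helpers like these, `float16BitsToDouble(view[0])` and `view[0] = doubleToFloat16Bits(val)` would play the role of the two calls shown above.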
Any XXXList is stored as bytes and only converted when reading/writing values! 😄
Sounds like it would be easy to implement the above operations for Float16List, good news.
If we add Half as a valid float NativeType, then we also need to implement it in FFI calls. It looks like this is not well-defined for all ABIs:
Yes, but maybe the implementation in OpenCV can be a reference? It defines an hfloat type and uses its own implementation if __fp16 is not defined, otherwise it uses __fp16:
https://github.com/opencv/opencv/blob/71d3237a093b60a27601c20e9ee6c3e52154e8b1/modules/core/include/opencv2/core/cvdef.h#L384-L399
final ffi.Pointer<ffi.Float16> ptr = ...;
final Float16List view = ptr.asTypedList(length);
// With Float16List, users can set/get values via:
final double val = view[0];
view[0] = val;
Happy to receive a PR for this!
Should it be Half or Float16? We call the other thing Double. 😄
A PR for only this should add errors on Halfs in FFI calls and callbacks.
Yes, but maybe the implementation in OpenCV can be a reference? It defines an hfloat type and uses its own implementation if __fp16 is not defined, otherwise it uses __fp16
ushort: hehe, so it's a uint16 if it's not available.
Well, Dart is not compiled at the same time as your library that uses open-cv, so we risk compiling with different flags which will lead to segfaults. On the other hand, we also assume SoftFP on Android arm32 and hard fp on arm32 Linux. Technically there can be Androids out there with hardfp and linuxes with softfp, but we've not run into them.
I'd be fine simply assuming the type is defined (except for risc-v).
I'm also open for getting a PR for adding this. This PR will be much more involved, as it includes getting the calling conventions right.
If you want to work on these PRs I can provide pointers for where to start.
I would like to suggest Float16 instead of Half.
Should it be Half or Float16? We call the other thing Double. 😄
Same as @Wdestroier , I like Float16 too.
If you want to work on these PRs I can provide pointers for where to start.
Sure, I am willing to work on this when I have some free time, so could you please provide some instructions? That way other developers can work on this too. 😄
@sigmundch @mkustermann can dart:typed_data Float16List be properly supported on dart2js and dart2wasm? (We can of course always fall back to an implementation that does the bit-shuffling in Dart, but that might not be desirable for performance reasons.)
For adding Float16List:
CLASS_LIST_TYPED_DATA
external double _getFloat32(int offsetInBytes); and the setter.
TypedList_GetFloat32
FlowGraphBuilder::BuildTypedListGet
EmitNativeCode.
Instead of an external function in the patch file, just load 16 bits via a Uint16List and do the conversion in Dart. This is probably an easier implementation to start with, though it will perform considerably worse. But it's maybe worth doing that as the first PR before diving into generating machine code.
For adding support for Pointer<Float16>, Array<Float16> and Float16s in structs/unions; and error messages on using Float16 in FFI calls and callbacks:
For Float16 inside structs, add a float16 type to tests/ffi/generator/c_types.dart and some tests with structs to tests/ffi/generator/structs_by_value_tests_configuration.dart -> try to especially cover cases where alignment could be unexpected. For example a struct with first a uint8_t and then a float16.
For asTypedList, you can add a new test in tests/ffi/.
If the rest of the Dart team is in favor of adding this, my suggestion would be to split this work up in multiple PRs:
Float16List
Float16List get and set with recognized methods that target assembly instructions for float16 conversions
Pointer<Float16> and Float16 inside structs (but rejecting Float16 as FFI call/callback arguments and return value)
Float16 as FFI call/callback arguments and return value. (I can provide pointers on how to do that later.)
I don't think a Float16List can be efficient in JavaScript, and probably not in Wasm either if it's not a built-in type.
And even on native, the smallest x64 operation performs four parallel conversions, not just a single one.
That means bit-shuffling in JS and Wasm, possibly on native too.
Reading is fairly simple, it's one sign bit, 5 bit exponent, 10 bit mantissa. A 64 entry lookup table for the exponent + sign will probably work.
Writing worries me more. Bit-fiddling on doubles requires first getting the bits of the double, which Dart doesn't support directly. Then it needs some rounding rules. The input is bigger than for reading, so a table isn't useful.
Native will almost certainly use the SIMD operation for each value. Everybody else will have to do something more expensive.
I'm not sure bad support is better than no support.
(We'll probably also want a Float16x8 type and list of those.)
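The table-based read described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the names _scaleTable, _buildScaleTable and decodeFloat16WithTable are hypothetical, the table is indexed by the top 6 bits (sign plus 5-bit exponent), and NaN still needs a branch:

```dart
import 'dart:typed_data';

/// 64 entries, indexed by sign bit + 5-bit exponent. Each entry holds the
/// signed power of two that the 11-bit significand is multiplied by.
final Float64List _scaleTable = _buildScaleTable();

Float64List _buildScaleTable() {
  final Float64List table = Float64List(64);
  for (int index = 0; index < 64; index++) {
    final int exponent = index & 0x1F;
    final double sign = index < 32 ? 1.0 : -1.0;
    double scale;
    if (exponent == 0x1F) {
      scale = double.infinity;
    } else {
      // Subnormals (exponent 0) share the scale of exponent 1.
      final int e = exponent == 0 ? 1 : exponent;
      scale = (1 << e) / 33554432.0; // 2^(e - 25)
    }
    table[index] = sign * scale;
  }
  return table;
}

/// Decodes binary16 bits with one table lookup and one multiply.
double decodeFloat16WithTable(int bits) {
  final int index = (bits >> 10) & 0x3F;
  final int exponent = index & 0x1F;
  final int mantissa = bits & 0x3FF;
  if (exponent == 0x1F && mantissa != 0) return double.nan;
  // Normal numbers have an implicit leading 1 bit; subnormals do not.
  final int significand = exponent == 0 ? mantissa : (mantissa | 0x400);
  return _scaleTable[index] * significand;
}
```

Writing has no equally compact table, which matches the concern above: the encoder has to inspect the double's exponent and apply a rounding rule per value.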
(We'll probably also want a Float16x8 type and list of those.)
➕ I was thinking about that too.
AI/ML models can use different 16-bit floating point number formats, most commonly IEEE float16 and bfloat16 (which has more exponent bits).
So if the reason is AI/ML it would make sense to extend the discussion to be
=> dart:ffi: Pointer<Float16> & Pointer<BFloat16>
=> dart:typed_data: Float16List & BFloat16List
Those two are somewhat separate and can be discussed separately (e.g. we support Pointer<Bool> in dart:ffi without having an equivalent list type in dart:typed_data).
For dart:typed_data it may be tricky as JavaScript doesn't have equivalent typed arrays and dart2js would dynamically need to keep track of the type (which may be problematic, see e.g. recent deprecation & removal of UnmodifiableUint8List/... classes). @rakudrama wdyt?
For dart:ffi
we'd need to think to what extend we want to support it: Allowing it indirectly via Pointer
with appropriate double operator [](int index)
void operator[]=(int index, double value)
is probably the most common use and uncontroversial. Though allowing them as Struct
members or primtivies is more tricky as we'd need to have ABI support and it's not part of standard C and it seems some ABIs may not support it.
=> The only real use may? be via Pointer<>
usage, so we could restrict it's usage to that
=> Our compiler would then generate very efficient code for the conversion to/from double
But if the only use is via Pointer, we have to think whether it's actually needed to have this support as part of dart:ffi. Let's say we model this as extension types in a helper package (e.g. in package:ffi/bfloat16.dart):
import 'dart:ffi';

extension type BFloat16P(Pointer<Uint16> pointer) {
  double operator [](int index) {
    // Read the raw 16 bits of the element at `index`.
    final int value = pointer[index];
    // ... code to bfloat16->double ... (XXX)
    return convertedValue;
  }

  void operator []=(int index, double value) {
    // ... code to double->bfloat16 ... (XXX)
    // Store the converted 16 bits back at `index`.
    pointer[index] = convertedValue;
  }

  BFloat16List asTypedList(int length) => BFloat16List(this, length);
}

class BFloat16List implements List<double> {
  final BFloat16P pointer;
  final int length;
  BFloat16List(this.pointer, this.length);

  double operator [](int index) => pointer[index];

  void setRange(...) {
    // Would e.g. delegate to already optimized `pointer.asTypedList().setRange()`
  }
}
And then users can use it via
@Native<Pointer<Uint16> Function()>()
external Pointer<Uint16> getTensor();
main() {
  final BFloat16List tensor = BFloat16P(getTensor()).asTypedList(64);
  for (int i = 0; i < tensor.length; ++i) {
    print(tensor[i]);
  }
}
or if we allow convenience usage of extension types in FFI:
@Native<BFloat16P Function()>()
external BFloat16P getTensor();
main() {
  final BFloat16List tensor = getTensor().asTypedList(64);
  for (int i = 0; i < tensor.length; ++i) {
    print(tensor[i]);
  }
}
We could ensure the conversion code in (XXX) is written in a way that allows our compilers to generate very efficient code for it (possibly even recognizing the specific conversion pattern & optimizing via built-in HW support).
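As an illustration only, the (XXX) conversions might be sketched like this in plain Dart. The names bfloat16BitsToDouble and doubleToBFloat16Bits are hypothetical. The sketch relies on bfloat16 being the upper 16 bits of an IEEE float32, and the encoder rounds the double to float32 first and then to bfloat16 with round-to-nearest-even, so in rare tie cases it may differ from a single direct rounding:

```dart
import 'dart:typed_data';

// Scratch buffer for reinterpreting float32 bits (big-endian by default).
final ByteData _f32Scratch = ByteData(4);

/// bfloat16 -> double: widen to float32 by appending 16 zero bits.
double bfloat16BitsToDouble(int bits) {
  _f32Scratch.setUint32(0, (bits & 0xFFFF) << 16);
  return _f32Scratch.getFloat32(0);
}

/// double -> bfloat16: go through float32, then drop the low 16 bits
/// with round-to-nearest-even.
int doubleToBFloat16Bits(double value) {
  if (value.isNaN) return 0x7FC0; // one quiet NaN pattern
  _f32Scratch.setFloat32(0, value);
  final int f32 = _f32Scratch.getUint32(0);
  // Add just under half an ulp, plus one more if the kept LSB is odd.
  final int rounded = f32 + 0x7FFF + ((f32 >> 16) & 1);
  return (rounded >> 16) & 0xFFFF;
}
```

In the extension type above, operator [] would call the first helper on the loaded Uint16 value and operator []= would store the result of the second.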
@rainyl Would your use case be solved by this?
or if we allow convenience usage of extension types in FFI:
👍 Tracked in:
=> Our compiler would then generate very efficient code for the conversion to/from double
It would be even more efficient with Float16x8. But so far we've been only doing that for Float32x4 via typed_data. So if we wanted to allow that and not add it to typed data we should maybe consider having such Float16x8 in dart:ffi? (But I guess no support for BFloat16x8, I haven't seen any assembly instructions tailored to that yet.)
I'd be cautious about adding Float16Pointer as an extension type in, for example, package:ffi if we would consider adding Float16x8 later in the Dart SDK. Moving types between a package and dart: libs is next to impossible.
Would your use case be solved by this?
Yes, I am working on OpenCV bindings for Dart, so I have to get a view of the pixel values at (x, y) to read and change the values. I believe your method will work.
@dcharkes RISC-V has a ratified extension, Zfh, but the major Linux distributions don't include it in their baseline. AFAIK, Android and Fuchsia haven't chosen their baseline yet, but Zfhmin is part of the RVA22 profile, so I expect they will include it.
Currently, only Float (Float32) and Double (Float64) are available in dart:ffi. However, Float16 is becoming more and more widespread, especially for AI-related computations, and it is very inconvenient to interact with native libraries that support Float16: developers have to access the fp16 pointers or values using Uint16 and write the conversion methods themselves. Even so, some methods like Uint8List.view() are not possible for fp16 if developers want to return a float16 view instead of a copy.
I have read #52250 and #51994, but both of them are about more specific primitive types for the Dart language, whereas this issue is just about dart:ffi.