nod-ai / SHARK-Platform

SHARK Inference Modeling and Serving
Apache License 2.0

(shortfin) No support for taking f16 arrays from device #292

Open renxida opened 5 hours ago

renxida commented 5 hours ago

Problem: shortfin exports arrays to host via Python's `array.array`, which does not support float16 as a data type, so f16 device arrays cannot be read back into Python.

Supported types:

https://github.com/nod-ai/SHARK-Platform/blob/0c2e965c3ffe723db2fe2be9193c6d45fe558dbe/shortfin/python/array_binding.cc#L191-L205
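A quick sketch of the gap and a possible workaround (not shortfin code, just standard-library Python): `array.array` has no half-precision type code, but `struct` does support the IEEE 754 `'e'` format, so raw f16 bits fetched as unsigned 16-bit ints can be reinterpreted element by element.

```python
import array
import struct

# Python's array.array has no float16 type code; struct's 'e' format
# (IEEE 754 half precision) is the closest stdlib support.
assert "e" not in array.typecodes

# Workaround sketch: treat the f16 payload as raw uint16 bit patterns ("H"),
# then reinterpret each element as a half-precision float via struct.
# 0x4248 is the f16 bit pattern for 3.140625 (the nearest f16 to 3.14).
raw = array.array("H", [0x4248, 0x4248])
floats = [struct.unpack("<e", v.to_bytes(2, "little"))[0] for v in raw]
print(floats)  # [3.140625, 3.140625]
```

This is per-element and slow, but it shows that only the transport type code is missing, not the ability to decode the bits on the host.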

Some messy reproducer code:

```python
import shortfin as sf
import shortfin.host
import shortfin.array as sfnp
import shortfin.amdgpu

gpu = False
if gpu:
    sc = sf.amdgpu.SystemBuilder()
else:
    sc = sf.host.CPUSystemBuilder()
lsys = sc.create_system()

def fiber(lsys):
    return lsys.create_fiber()

def device(fiber):
    return fiber.device(0)

fiber = fiber(lsys)
device = device(fiber)

dtype = sfnp.float16
value = 3.14
ary = sfnp.device_array.for_host(fiber.device(0), [2, 4], dtype)
with ary.map(discard=True) as m:
    m.fill(value)
hary = ary.items
print(hary)
```
renxida commented 5 hours ago

I'm currently doing this:

    add_type(DType::float16(), "H", sizeof(unsigned short)); // hack for dumping float16 values

And reinterpret-casting everything to float16 after I fetch it from the device. But as @stellaraccident said, we probably don't want everyone doing this everywhere, and we should have a more canonical way of doing it.
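For reference, the host-side half of that reinterpret-cast can be done in bulk with NumPy (hypothetical sketch; `raw_bits` stands in for whatever `ary.items` returns under the `"H"` hack, which shortfin does not actually guarantee):

```python
import array
import numpy as np

# Stand-in for ary.items under the "H" hack: eight copies of the
# f16 bit pattern 0x4248, which encodes 3.140625.
raw_bits = array.array("H", [0x4248] * 8)

# Reinterpret the same bytes as float16 in one shot, no per-element loop.
f16 = np.frombuffer(raw_bits, dtype=np.float16)
print(f16)  # [3.140625 ... 3.140625], dtype=float16
```

This works because `array.array` exposes the buffer protocol, so `np.frombuffer` views the same memory without copying.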

ScottTodd commented 3 hours ago

How much do you need to actually extract f16 from the program and interact with it from your user code? Is that for debugging? We generally try to keep weird data types internal to the model and baseline support is just for loading from a file into memory and then running kernels with that data type.