Open renxida opened 5 hours ago
I'm currently doing this:
```cpp
add_type(DType::float16(), "H", sizeof(unsigned short)); // hack for dumping float16 values
```
And reinterpret-casting everything as float16 after I fetch it from the device. But as @stellaraccident said, we probably don't want everyone doing this everywhere, and we should have a more canonical way of doing it.
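The host-side reinterpret-cast can be sketched like this (a minimal illustration, assuming numpy is available on the host; the bit patterns below are made-up examples, not shortfin output):

```python
import array

import numpy as np

# The device buffer comes back as uint16 words (type code "H") because
# array.array has no float16 type code. Reinterpret the raw bytes as
# IEEE 754 binary16 on the host.
raw = array.array("H", [0x3C00, 0x4000, 0xC000])  # 1.0, 2.0, -2.0 as half-floats
as_f16 = np.frombuffer(raw, dtype=np.float16)
print(as_f16.tolist())
```

This works because `array.array` exposes the buffer protocol, so `np.frombuffer` can reinterpret the same bytes without a copy, but it is exactly the per-call-site hack the comment above is arguing against centralizing.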
How much do you need to actually extract f16 from the program and interact with it from your user code? Is that for debugging? We generally try to keep weird data types internal to the model and baseline support is just for loading from a file into memory and then running kernels with that data type.
Problem: shortfin exports arrays to the host via Python's `array.array`, which does not support float16 as a data type.
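The gap is easy to demonstrate: `array.array` rejects a half-float type code outright, while the `struct` module has supported the `"e"` format (IEEE 754 binary16) since Python 3.6. A minimal sketch:

```python
import array
import struct

# array.array has no half-float type code, so construction fails:
try:
    array.array("e", [1.0])
except ValueError:
    print("array.array rejects 'e' (no float16 support)")

# struct, by contrast, can pack half-precision values ("e" format):
packed = struct.pack("<e", 1.5)  # two bytes per value, little-endian
print(packed.hex())
```

So any canonical fix either needs a different host-side container (e.g. numpy) or byte-level packing via `struct`/`memoryview` rather than `array.array`.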
Supported types:
https://github.com/nod-ai/SHARK-Platform/blob/0c2e965c3ffe723db2fe2be9193c6d45fe558dbe/shortfin/python/array_binding.cc#L191-L205
Some messy reproducer code: