webmachinelearning / model-loader

🧪 Model Loader API
https://webmachinelearning.github.io/model-loader/

Open question: support for non-IEEE 754 float point types #23

Open wacky6 opened 2 years ago

wacky6 commented 2 years ago

Relates to https://github.com/webmachinelearning/webnn/issues/252

Some accelerators use non-standard floating-point types (e.g. bfloat16 and TF32). They are important for achieving high performance (e.g. by using Nvidia's Tensor Cores) and/or for reducing resource usage (e.g. FP32->FP16 halves memory usage).
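As a rough illustration of the two points above (this is a sketch, not part of any proposal; the array here is a stand-in for real model weights): float16 halves the byte size of a float32 tensor, and a bfloat16 bit pattern is simply the top 16 bits of the corresponding float32, which is why hardware that supports it can convert cheaply.

```python
import numpy as np

# Hypothetical weight tensor; float32 uses 4 bytes per element.
weights_fp32 = np.ones((1024, 1024), dtype=np.float32)

# Downcast to float16 (2 bytes per element): half the memory,
# at the cost of reduced range and precision.
weights_fp16 = weights_fp32.astype(np.float16)
assert weights_fp16.nbytes * 2 == weights_fp32.nbytes

# bfloat16 keeps float32's 8 exponent bits and truncates the mantissa,
# so a round-toward-zero conversion is just dropping the low 16 bits.
# (NumPy has no native bfloat16 dtype, so this stores the raw bit pattern.)
bits = weights_fp32.view(np.uint32)
bf16_bits = (bits >> 16).astype(np.uint16)
assert bf16_bits.nbytes * 2 == weights_fp32.nbytes
```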

How could MLLoader leverage these types? Some ideas:

josephrocca commented 2 years ago

Another factor is download time. IIUC, the current tfjs format (for example) doesn't support float16, so tfjs-converter converts weights to float32. This isn't ideal because it doubles the model size. I think it makes more sense to always serve the model in its "native" floating-point format and to convert at run time based on the device's hardware.
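The serve-native, convert-at-load idea could be sketched like this (all names here are hypothetical, and the boolean capability flag stands in for whatever hardware query a real loader would use): ship fp16 weights over the wire, and upcast once at load time only when the device cannot execute fp16 directly.

```python
import numpy as np

def prepare_weights(raw_fp16: np.ndarray, device_supports_fp16: bool) -> np.ndarray:
    """Hypothetical load-time step: keep the downloaded fp16 weights as-is
    when the device can run them, otherwise upcast once to fp32.
    Either way, the download stays half the size of an fp32 model."""
    if device_supports_fp16:
        return raw_fp16
    return raw_fp16.astype(np.float32)

# Example: the same served weights, prepared for two different devices.
served = np.array([0.5, 1.5, -2.0], dtype=np.float16)
assert prepare_weights(served, device_supports_fp16=True).dtype == np.float16
assert prepare_weights(served, device_supports_fp16=False).dtype == np.float32
```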