hollance / neural-engine

Everything we actually know about the Apple Neural Engine (ANE)
MIT License

About the description of allowLowPrecisionAccumulationOnGPU #8

Closed y-ich closed 3 years ago

y-ich commented 3 years ago

Hi.

In https://github.com/hollance/neural-engine/blob/master/docs/16-bit.md, you wrote

"On the GPU it uses float16 for the weights and the intermediate tensors, but float32 for the calculations. You can turn this off with the option allowLowPrecisionAccumulationOnGPU from MLModelConfiguration, in which case the GPU also uses float16 for the calculations. This is a bit faster but you may lose precision."

Do you have any reference for this description?

In the WWDC 19 session (https://developer.apple.com/videos/play/wwdc2019/704/, at 39:00), they said:

"And the idea here is that if your model is learning on the GPU, instead of doing accumulation in float32, that happens in float60."

So I guess that this option may be effective only on macOS, and that the change is from float60 to float32.

I tried the option on an iOS device without a Neural Engine, but I saw no speed improvement.
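
For reference, this is roughly how I set the option (a minimal sketch; `MyModel` stands in for the Xcode-generated model class):

```swift
import CoreML

// Force the GPU path so the Neural Engine is not used,
// then opt in to float16 accumulation on the GPU.
let config = MLModelConfiguration()
config.computeUnits = .cpuAndGPU
config.allowLowPrecisionAccumulationOnGPU = true

do {
    // `MyModel` is a placeholder for the auto-generated model class.
    let model = try MyModel(configuration: config)
    // ... run predictions with `model` ...
} catch {
    print("Failed to load model: \(error)")
}
```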

Thanks.

y-ich commented 3 years ago

Now I am aware that "float60" is a typo in the transcript ^^;