ARM-software / armnn

Arm NN ML Software. The code here is a read-only mirror of https://review.mlplatform.org/admin/repos/ml/armnn
https://developer.arm.com/products/processors/machine-learning/arm-nn
MIT License

ArmNN INT8 GPU performance isn't significantly better than FP16 #718

Closed srikris-sridhar closed 1 year ago

srikris-sridhar commented 1 year ago

I've tried running a really simple convolution (attached 2 models, one in FP32 and one in INT8). Here is what I see on a Samsung A33 (Mali-G68 GPU). I see a good boost with INT8 on CPU but not on GPU. Is this expected?

models.zip

matthewsloyanARM commented 1 year ago

Hi @srikris-sridhar,

Thank you for getting in touch. It's hard to know what might cause this. The Arm Compute CL backend (GPU) can take longer on the first iteration due to warm-up work it performs. How many iterations are you running? If you set it to 10, for example, do you see a speed improvement on the second and following iterations?

Also, would you be able to supply us with profiling data from your hardware over multiple iterations, so we can take a look at the specific kernels that are being run? If so, how are you running this model, so I can help with this further? Here are some general tips.

Thanks again!

Kind regards,

Matthew

srikris-sridhar commented 1 year ago

@matthewsloyanARM I've attached the models in the issue (see models.zip) so you should be able to reproduce it entirely. It's just a single 3x3 convolution. Are you able to reproduce this on your end?

I do run multiple iterations (~100) and take the min time, so it's likely not related to warm-up.
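The timing methodology described above (discard warm-up runs, then take the minimum over many iterations) can be sketched as follows. This is a minimal illustration, not the reporter's actual harness; `run_inference` is a placeholder for whatever invokes the Arm NN runtime on the model.

```python
import time

def benchmark(run_inference, iterations=100, warmup=1):
    """Time an inference callable, discarding warm-up runs.

    Returns (min_ms, all_ms). Taking the minimum over many
    iterations filters out one-off costs such as the GPU backend's
    first-run kernel compilation, so a slow first iteration alone
    cannot explain a consistently high min time.
    """
    # Warm-up iterations: the CL backend may compile/tune kernels here.
    for _ in range(warmup):
        run_inference()

    timings_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_inference()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return min(timings_ms), timings_ms
```

With ~100 iterations and the min taken this way, warm-up effects are excluded by construction, which supports the point that the INT8/FP16 gap on GPU is a steady-state result.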

TeresaARM commented 1 year ago

Hi @srikris-sridhar,

This issue is more likely related to the Arm Compute Library than to Arm NN. Please try opening an issue on their side at https://github.com/ARM-software/ComputeLibrary/issues; I think they will be able to help you better than we can.

Kind Regards

srikris-sridhar commented 1 year ago

Thanks, I've filed an issue with the Arm Compute Library.