fastmachinelearning / hls4ml

Machine learning on FPGAs using HLS
https://fastmachinelearning.org/hls4ml
Apache License 2.0

AveragePooling2D/GlobalAveragePooling2D Issue? #311

Closed jmduarte closed 2 years ago

jmduarte commented 3 years ago

I've been tracking down bugs related to this TinyML ResNet model. As you can see here (https://github.com/fastmachinelearning/platform_ml_models/blob/master/eembc/CIFAR10_ResNetv1/resnet_v1_eembc.py#L126), it uses an AveragePooling2D layer with the pool_size parameter set to the whole image, so effectively it acts as a GlobalAveragePooling2D layer.

For some reason, I'm unable to get this type of layer to give correct results in hls4ml. See this gist to reproduce the error: https://gist.github.com/jmduarte/2bb7ccb8c3028056ef3bdd7f2579250b

I create a random input tensor with shape [8, 8, 3], which after the (global) average pooling should have output values of:

[17.974936   9.814308   4.8630896]

The hls4ml output, however, is (erroneously) the same for each channel:

[0.875 0.875 0.875]

This is with ap_fixed<10,7,AP_SAT,AP_RND> precision and io_stream, but varying these settings doesn't fix the disagreement. GlobalAveragePooling2D shows exactly the same behavior. Curiously, MaxPooling2D and GlobalMaxPooling2D appear to be correct and don't have this issue.
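For reference, here is a minimal sketch of the kind of Keras-vs-hls4ml comparison the linked gist performs. The toy model, input scaling, output directory, and the exact precision string are illustrative assumptions, not the gist's code:

```python
import numpy as np
import tensorflow as tf
import hls4ml

# Toy model: an 8x8 AveragePooling2D over an 8x8x3 input,
# i.e. effectively a GlobalAveragePooling2D.
inputs = tf.keras.Input(shape=(8, 8, 3))
outputs = tf.keras.layers.AveragePooling2D(pool_size=(8, 8))(inputs)
model = tf.keras.Model(inputs, outputs)

# Convert with io_stream and a narrow fixed-point default precision
# (modes given in the ap_fixed<W, I, Q, O> template order).
config = hls4ml.utils.config_from_keras_model(model, granularity='model')
config['Model']['Precision'] = 'ap_fixed<10,7,AP_RND,AP_SAT>'
hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=config, io_type='io_stream', output_dir='avgpool_test')
hls_model.compile()

# Compare predictions on a random input (the gist uses its own tensor,
# so the numbers will differ from the ones quoted above).
x = (np.random.rand(1, 8, 8, 3) * 32).astype(np.float32)
print('Keras :', model.predict(x).flatten())
print('hls4ml:', hls_model.predict(x).flatten())
```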

jmduarte commented 3 years ago

OK, after discussing with @vloncar and @thesps offline, we narrowed the issue down to the fact that the reduce_pool function uses the same data type for the average-pooling accumulator as for the data itself.

https://github.com/fastmachinelearning/hls4ml/blob/master/hls4ml/templates/vivado/nnet_utils/nnet_pooling_stream.h#L15-L26

This can easily cause an overflow when summing over many data elements, so it's better to use a wider data type for the accumulation in average pooling.
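To make the overflow concrete (a back-of-the-envelope bound, not the exact rule used in the patch): if the data type has $I$ integer bits and the pool covers $N$ elements, the running sum can reach

$$\Bigl|\sum_{i=1}^{N} x_i\Bigr| \;\le\; N \max_i |x_i| \;<\; N \cdot 2^{I-1},$$

so the accumulator needs roughly $I + \lceil \log_2 N \rceil$ integer bits. With ap_fixed<10,7> data and an 8x8 pool ($N = 64$) that is about $7 + 6 = 13$ integer bits, while the saturating data type itself tops out just below 64.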

I have a "patch" solution using the accum_t parameter, so I'll submit a PR and we can discuss further there.

vloncar commented 3 years ago

Ah, you beat me to it :wink: I was also finalizing a PR, with the extra change of setting accum_t to the layer's input precision if it is not set explicitly. Otherwise it may be confusing to use a default precision in fully quantized models.
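For reference, a sketch of how the pooling accumulator could then be overridden through the usual per-layer precision configuration. The 'accum' key for pooling layers and the dictionary form of Precision are assumptions about the PR, not confirmed API:

```python
import tensorflow as tf
import hls4ml

# Same toy model as above: a global-style average pooling over an 8x8x3 input.
inputs = tf.keras.Input(shape=(8, 8, 3))
outputs = tf.keras.layers.AveragePooling2D(pool_size=(8, 8), name='avg_pool')(inputs)
model = tf.keras.Model(inputs, outputs)

# granularity='name' exposes a per-layer Precision entry in the config.
config = hls4ml.utils.config_from_keras_model(model, granularity='name')

# Assumed (not checked against the PR): once pooling layers expose accum_t,
# it can be set like other accumulator precisions. Depending on the hls4ml
# version, Precision may be a plain string here, in which case it would be
# replaced wholesale rather than per key.
config['LayerName']['avg_pool']['Precision'] = {'accum': 'ap_fixed<16,13>',
                                                'result': 'ap_fixed<10,7>'}
```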

jmduarte commented 3 years ago

Ah, OK, great! Feel free to close my PR (#314) if your change is better!

zhuangh commented 3 years ago

> Ah, you beat me to it. I was also finalizing a PR, with the extra change of setting accum_t to the layer's input precision if it is not set explicitly. Otherwise it may be confusing to use a default precision in fully quantized models.

Hi @vloncar, what strategy are you using to derive the accum_t size?

vloncar commented 3 years ago

@zhuangh nothing smart atm, just making sure the default is not counter-intuitive. Do you have some suggestions? Does QTools provide some help with the choice?

zhuangh commented 3 years ago

We want to put a quantization constraint (let the user specify it) on the 1/pool_size factor. Then we could derive the average-pooling accumulator precision, etc.
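One possible way to phrase that derivation (a hypothetical helper, not QTools or anything currently in hls4ml): widen the accumulator by the number of bits the running sum can grow, and report how many fractional bits an exact 1/pool_size factor would need:

```python
import math

def derive_avgpool_types(in_width, in_int, pool_size):
    """Hypothetical helper: suggest an accumulator wide enough that summing
    `pool_size` inputs of ap_fixed<in_width, in_int> cannot overflow, and the
    fractional bits for an exact 1/pool_size (exact only for powers of two)."""
    growth = math.ceil(math.log2(pool_size))          # extra bits the sum can need
    accum = f'ap_fixed<{in_width + growth},{in_int + growth}>'
    recip_frac_bits = growth if (pool_size & (pool_size - 1)) == 0 else None
    return accum, recip_frac_bits

# 8x8 pool over ap_fixed<10,7> data: 6 extra bits for the accumulator.
print(derive_avgpool_types(10, 7, 64))   # ('ap_fixed<16,13>', 6)
```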

jmduarte commented 2 years ago

Closing this issue, although we may want to open a new issue regarding the default choice of accum_t for pooling layers.