Abnormal results from optimized 'elementwise add' operator

Unbinilium commented 1 year ago

Hello, I've noticed that after enabling the acceleration option for esp-nn, the elementwise add operator returns abnormal results in some cases. Based on various test results, I suspect there might be alignment issues related to the stack usage of the elementwise add operation. Since I'm not familiar with the Xtensa architecture or the ESP EE instruction set, I'm raising this issue to ask for help.

Environment:

esp-idf v5.1 cbce221 and esp-idf v4.4.5 ac5d805
esp-nn 1a35708 and 6b3ef8e
esp32s3

Reproduce Details:

The abnormal behavior is generally observed when the acceleration is enabled. After the elementwise add operation, the output tensor's value in one of its dimensions becomes unusual in certain cases (appearing smaller, resembling an additional >> 1 operation).
Modifying external code which affects stack allocation, such as changing the format string (adding or removing a few characters) or defining a variable or constant which is not used in the program, can lead to the mentioned issue.
Once this problem occurs, every inference result will exhibit abnormal behavior until the code is modified and the firmware is rebuilt.
When using the ANSI C version of the elementwise add operator or manually optimized assembly obtained from ANSI C, the results are normal, regardless of whether other operator optimizations are enabled. The abnormal behavior described above does not appear.

Additional Information:

We are fairly certain that the issue is not caused by stack overflow or external code-related problems.
Due to the unpredictable nature of reproduction, we haven't been able to provide a stable minimum reproducible code.
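
One way to probe the suspected alignment sensitivity without relying on stack layout is to run the kernel on buffers at every byte offset and compare against a plain C reference. The following is only a sketch under stated assumptions: the `add_s8_fn` signature is a simplified, hypothetical stand-in (the real esp-nn elementwise add also takes quantization parameters), the 16-byte boundary is an assumption about 128-bit vector loads, and `add_s8_fake_bug` is a made-up kernel that merely mimics the reported "extra >> 1" symptom so the harness has something to catch.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define MAX_N 256
#define ALIGN 16 /* assumed vector width in bytes */

/* Hypothetical simplified kernel signature, not the esp-nn API. */
typedef void (*add_s8_fn)(const int8_t *a, const int8_t *b, int8_t *out, size_t n);

/* Plain C reference: saturating int8 addition. */
static void add_s8_ref(const int8_t *a, const int8_t *b, int8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int32_t s = (int32_t)a[i] + (int32_t)b[i];
        if (s > INT8_MAX) s = INT8_MAX;
        if (s < INT8_MIN) s = INT8_MIN;
        out[i] = (int8_t)s;
    }
}

/* Made-up stand-in for a misbehaving kernel: halves the result whenever
 * any pointer is not ALIGN-byte aligned. For demonstration only. */
static void add_s8_fake_bug(const int8_t *a, const int8_t *b, int8_t *out, size_t n)
{
    add_s8_ref(a, b, out, n);
    if ((((uintptr_t)a | (uintptr_t)b | (uintptr_t)out) & (ALIGN - 1)) != 0) {
        for (size_t i = 0; i < n; i++) out[i] = (int8_t)(out[i] / 2);
    }
}

/* Returns the first byte offset (0..ALIGN-1) at which `candidate` disagrees
 * with the reference, -1 if every offset matches, -2 if n is too large. */
static int first_bad_offset(add_s8_fn candidate, size_t n)
{
    static _Alignas(ALIGN) int8_t a[MAX_N + ALIGN];
    static _Alignas(ALIGN) int8_t b[MAX_N + ALIGN];
    static _Alignas(ALIGN) int8_t got[MAX_N + ALIGN];
    static int8_t want[MAX_N];

    if (n > MAX_N) return -2;
    for (int off = 0; off < ALIGN; off++) {
        for (size_t i = 0; i < n; i++) {
            a[off + i] = (int8_t)((int)(i * 7 % 256) - 128);
            b[off + i] = (int8_t)((int)(i * 3 % 256) - 128);
        }
        add_s8_ref(a + off, b + off, want, n);
        memset(got, 0, sizeof got);
        candidate(a + off, b + off, got + off, n);
        if (memcmp(want, got + off, n) != 0) return off;
    }
    return -1;
}
```

Passing the optimized kernel (wrapped to this signature) as `candidate` would show whether the mismatch tracks buffer alignment rather than unrelated code changes.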

If you need any other information that might be helpful, I'll try to provide it. Thanks.

vikramdattu commented 1 year ago

Hello @Unbinilium, thanks for reporting the issue. Can you please confirm for which particular input size(s) to the depthwise operator the issue occurs? Is it possible to share the model (or a smaller/untrained model if that's not possible) along with the code changes that can be used to reproduce the issue?

Thanks in advance.

Unbinilium commented 1 year ago

@vikramdattu Hi, sorry, the name of the add operator I mentioned earlier should be corrected to elementwise add; it is located in the basic_math folder of esp-nn.

Our model is a single-class YOLO object detection model (model.zip), quantized to int8 in a TFLite flatbuffer. It takes an input of [192, 192, 3] and produces an output of [1, 2268, 6]. The issue we observe most directly is that in the [1, 2268, :1] channel, some values seem to be smaller than normal, while values in the other channels appear to be normal.

However, the actual situation might be more complex since there are other add operations before obtaining this value. What I can confirm is that everything is normal when using the non-accelerated ANSI C version elementwise add.

Hope the above information will help you find the issue.

vikramdattu commented 1 year ago

Hi @Unbinilium, thank you again for the detailed input. Please find the patch attached, which skips optimizations for add and mul when inputs are unaligned. In the future, we will consider handling unaligned cases for these two as well. For now, since these two ops are used less often and take up only a small percentage of the total model, we have taken the path of selectively enabling vector optimization.

0001-Bugfix-elementwise-add-mul-mismatch-when-unaligned-i.patch

Do share your observations after applying the patch.
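
The approach described above (use the vector path only when inputs are aligned, otherwise fall back to plain C) can be sketched roughly as follows. This is not the contents of the attached patch: the `add_s8_fn` signature and the function names are hypothetical and simplified (no quantization parameters), and the 16-byte boundary is an assumption about 128-bit vector loads.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical simplified kernel signature, not the esp-nn API. */
typedef void (*add_s8_fn)(const int8_t *a, const int8_t *b, int8_t *out, size_t n);

/* True when the pointer sits on a 16-byte boundary (assumed vector width). */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Plain C reference: saturating int8 addition. */
static void add_s8_ref(const int8_t *a, const int8_t *b, int8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int32_t s = (int32_t)a[i] + (int32_t)b[i];
        if (s > INT8_MAX) s = INT8_MAX;
        if (s < INT8_MIN) s = INT8_MIN;
        out[i] = (int8_t)s;
    }
}

/* Uses the vector kernel only when all three buffers are aligned;
 * otherwise falls back to the reference loop.
 * Returns 1 if the vector path ran, 0 if the fallback ran. */
static int add_s8_dispatch(add_s8_fn simd, const int8_t *a, const int8_t *b,
                           int8_t *out, size_t n)
{
    if (simd != NULL && is_aligned16(a) && is_aligned16(b) && is_aligned16(out)) {
        simd(a, b, out, n);
        return 1;
    }
    add_s8_ref(a, b, out, n);
    return 0;
}
```

The trade-off is the one stated in the comment above: unaligned calls lose the speedup but stay correct, which is acceptable when the op is a small share of total inference time.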

Unbinilium commented 1 year ago

Thanks for your timely reply. I will test whether the patch solves the problem as soon as possible.

Unbinilium commented 1 year ago

After applying the patch, we tested several cases and found that the issue we previously encountered no longer appears. Thanks again for your prompt patch.

espressif / esp-nn

Abnormal results from optimized 'elementwise add' operator #6