Open w2016561536 opened 5 months ago
Hi @w2016561536, thank you for reporting the issue. Is there some way to reproduce it with a simple test on NuttX mainline, without using all these IMU files from PX4? If you can isolate the issue, it will help us find the root cause. Just for awareness, ping @tmedicci
Well, it seems to be difficult to reproduce, but I may have found the problem. In https://github.com/apache/nuttx/blob/b09b429308b991ba455cad57b53e0abaa423bf53/arch/xtensa/src/common/xtensa_user_handler.S#L363C1-L363C23, we can see that this function is not correctly implemented. In FreeRTOS, the implementation is https://github.com/espressif/esp-idf/blob/cadf80e8751caffaf25207a12bb65e5b188683ae/components/freertos/FreeRTOS-Kernel/portable/xtensa/xtensa_vectors.S#L990. That function also has a related issue, https://github.com/espressif/esp-idf/issues/11690 , which is very similar to this one.
@tmedicci Do you think this problem is caused by the FPU?
Hi @w2016561536, I am not aware of it. Maybe, you could try to implement the workaround and I can evaluate using our internal CI.
@w2016561536 did you try to save the FP registers?
If it fixes the issue, we should include it in mainline. Maybe wrapped by #ifdef CONFIG_ARCH_FPU
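To illustrate the suggestion above, here is a conceptual C model of a context save/restore where the floating-point registers are only included when `CONFIG_ARCH_FPU` is defined. This is only a sketch: the real handler is Xtensa assembly, and `struct saved_context`, `struct cpu_model`, `model_save`, `model_restore` and the register counts are hypothetical names and values chosen for illustration, not NuttX code.

```c
#include <string.h>

/* Assumption for this sketch: in a real build this symbol would come
 * from the NuttX configuration, not from a #define in the source. */
#define CONFIG_ARCH_FPU 1

/* Illustrative register counts, not taken from the NuttX sources. */
#define NUM_INT_REGS 16
#define NUM_FP_REGS  16

/* Hypothetical model of the live CPU register state. */
struct cpu_model
{
  unsigned int a[NUM_INT_REGS];
  float        f[NUM_FP_REGS];
};

/* Hypothetical saved context; the FP area exists only with the FPU enabled. */
struct saved_context
{
  unsigned int a[NUM_INT_REGS];
#ifdef CONFIG_ARCH_FPU
  float        f[NUM_FP_REGS];
#endif
};

/* Save the interrupted state. Without the guarded part, FP registers
 * clobbered by a handler would never be put back. */
void model_save(struct saved_context *ctx, const struct cpu_model *cpu)
{
  memcpy(ctx->a, cpu->a, sizeof(ctx->a));
#ifdef CONFIG_ARCH_FPU
  memcpy(ctx->f, cpu->f, sizeof(ctx->f));
#endif
}

/* Restore the interrupted state before returning from the handler. */
void model_restore(struct cpu_model *cpu, const struct saved_context *ctx)
{
  memcpy(cpu->a, ctx->a, sizeof(cpu->a));
#ifdef CONFIG_ARCH_FPU
  memcpy(cpu->f, ctx->f, sizeof(cpu->f));
#endif
}
```

In this model, a handler that overwrites `cpu.f[]` between `model_save` and `model_restore` leaves the interrupted code's floats intact only when the guarded copies are compiled in.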
Hey guys, I am the reporter of the original problem in ESP-IDF FreeRTOS. Yes, this is a silent data corruption and the current ESP-IDF's interrupt vector assembly file has a fix. Regardless of whether this specific issue is caused by the same bug (very likely), you should update the vectors to match upstream :)
@ProfFan could you point out the patch? we can apply the change, thanks.
I guess he meant this one: https://github.com/espressif/esp-idf/issues/11690
But the change looks FreeRTOS-specific: https://github.com/espressif/esp-idf/commit/b03c8912c73fa59061d97a2f5fd5acddcc3fa356#diff-db429b5abb80b87b6da1abb1ecd103c81fc2d982780bd8a3f1a23494b1749155R1152
Perhaps this bug needs Espressif staff to work on.
@fdcavalcanti @eren-terzioglu @tmedicci please take a look ^
Whether the FPU is involved can be checked by disabling it and seeing whether the issue still reproduces with the software-emulated floating-point math libraries.
I'm sorry, @xiaoxiang781216 , the issue ID that https://github.com/apache/nuttx/pull/14481 solves is different. I already fixed it. I'm sorry.
@w2016561536 maybe we can work together to get PX4 working on ESP32, ESP32-S2 and ESP32-S3. @henrykotze is working on PX4 for ESP32 and I want to run NuttX on ESP32-S2 to run on this device:
Good idea! But I think an FPU is necessary for such a complex task, and the ESP32-S2 doesn't have one. I have tried porting PX4 to the ESP32-S3 and uploaded it to https://github.com/w2016561536/PX4-Autopilot/tree/px4_esp32s3 Also, Guanglun has finished a PX4 port for the ESP32: https://github.com/guanglun/PX4-Autopilot/tree/single_core_esp32
Could this be what leads to the FPU problem? https://github.com/apache/nuttx/blob/0c5381a0a15f992d8d0cdca9e9c6ac6682176f42/arch/xtensa/src/esp32s3/esp32s3_i2c.c#L1379 Too much is being done on the IRQ stack. I tried using I2C polling mode instead, and the problem seems to disappear.
Hi, I was running PX4 based on NuttX on an ESP32-S3 and found an error: a float value becomes NaN after a simple multiplication. And the console output: This problem appears several minutes after booting. Moreover, some function calls cause the same thing, turning the float variable into NaN.
defconfig:
Hardware: ESP32S3-WROOM-1 M0N16R8
NuttX version: 12.4 , commit : 0f169f50c4b234abde12a6a0b028a8fe8f62f5aa
Full source code: https://1drv.ms/u/c/008ed313fdaa343c/EaXGLgJs_3VLpahnyyVtaL4BgF3pUIa_6f1XHX_ZxOb-Ow?e=1iSk8w
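The symptom in the report (a float turning into NaN after a simple multiplication) could be probed with a small standalone check like the one below. It is only a sketch of a possible starting point for an isolated test, not a confirmed reproducer: `first_nan_iteration` is a hypothetical helper, and on the affected target it would presumably need to run for several minutes while interrupts (for example from the I2C driver) are active.

```c
#include <math.h>

/* Hypothetical helper: multiply two known-good floats `iterations` times
 * and return the index of the first iteration whose product is NaN,
 * or -1 if every product was a valid number. */
long first_nan_iteration(float a, float b, long iterations)
{
  /* volatile forces a real load and multiply on every pass instead of
   * letting the compiler fold the expression at build time. */
  volatile float x = a;
  volatile float y = b;

  for (long i = 0; i < iterations; i++)
    {
      float product = x * y;

      if (isnan(product))
        {
          return i;
        }
    }

  return -1;
}
```

With finite inputs such as `first_nan_iteration(1.5f, 2.0f, 1000000)`, a result other than -1 would indicate that the operands or the product were corrupted somewhere between the load and the check.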