EFeru / hoverboard-firmware-hack-FOC

With Field Oriented Control (FOC)
GNU General Public License v3.0

Board does not boot / beep anymore, flash works and verify also #389

Closed wlamers closed 1 year ago

wlamers commented 1 year ago

Variant

USART

Control type

FOC

Control mode

Speed

Description

Hi All,

My board stopped working after I changed the max current and battery level settings. Flashing and verification still work. I flashed back the original settings, but still no luck. There is also no debug output (debug is enabled on USART3). If I press the power button I hear the faint whining noise of the power supply, but it seems that the MCU hangs in reset or something like that. I tried two different batteries, again with no difference. Has anyone else experienced this before? What can I do or test?

wlamers commented 1 year ago

For reference:

make flash
st-flash --reset write build/hover.bin 0x8000000
st-flash 1.7.0
[!] send_recv read reply failed: LIBUSB_ERROR_TIMEOUT
[!] send_recv STLINK_GET_VERSION
2023-03-17T21:32:49 INFO common.c: F1xx High-density: 64 KiB SRAM, 256 KiB flash in at least 2 KiB pages.
file build/hover.bin md5 checksum: 80fc378b6afeb8f87b18e293e949dfb, stlink checksum: 0x003e724e
2023-03-17T21:32:49 INFO common.c: Attempting to write 41480 (0xa208) bytes to stm32 address: 134217728 (0x8000000)
2023-03-17T21:32:49 INFO common.c: Flash page at addr: 0x08000000 erased
2023-03-17T21:32:49 INFO common.c: Flash page at addr: 0x08000800 erased
2023-03-17T21:32:49 INFO common.c: Flash page at addr: 0x08001000 erased
2023-03-17T21:32:49 INFO common.c: Flash page at addr: 0x08001800 erased
2023-03-17T21:32:49 INFO common.c: Flash page at addr: 0x08002000 erased
2023-03-17T21:32:49 INFO common.c: Flash page at addr: 0x08002800 erased
2023-03-17T21:32:49 INFO common.c: Flash page at addr: 0x08003000 erased
2023-03-17T21:32:49 INFO common.c: Flash page at addr: 0x08003800 erased
2023-03-17T21:32:50 INFO common.c: Flash page at addr: 0x08004000 erased
2023-03-17T21:32:50 INFO common.c: Flash page at addr: 0x08004800 erased
2023-03-17T21:32:50 INFO common.c: Flash page at addr: 0x08005000 erased
2023-03-17T21:32:50 INFO common.c: Flash page at addr: 0x08005800 erased
2023-03-17T21:32:50 INFO common.c: Flash page at addr: 0x08006000 erased
2023-03-17T21:32:50 INFO common.c: Flash page at addr: 0x08006800 erased
2023-03-17T21:32:50 INFO common.c: Flash page at addr: 0x08007000 erased
2023-03-17T21:32:50 INFO common.c: Flash page at addr: 0x08007800 erased
2023-03-17T21:32:50 INFO common.c: Flash page at addr: 0x08008000 erased
2023-03-17T21:32:50 INFO common.c: Flash page at addr: 0x08008800 erased
2023-03-17T21:32:50 INFO common.c: Flash page at addr: 0x08009000 erased
2023-03-17T21:32:50 INFO common.c: Flash page at addr: 0x08009800 erased
2023-03-17T21:32:50 INFO common.c: Flash page at addr: 0x0800a000 erased
2023-03-17T21:32:50 INFO common.c: Finished erasing 21 pages of 2048 (0x800) bytes
2023-03-17T21:32:50 INFO common.c: Starting Flash write for VL/F0/F3/F1_XL
2023-03-17T21:32:50 INFO flash_loader.c: Successfully loaded flash loader in sram
2023-03-17T21:32:50 INFO flash_loader.c: Clear DFSR
2023-03-17T21:32:50 INFO flash_loader.c: Clear HFSR
21/ 21 pages written
2023-03-17T21:32:52 INFO common.c: Starting verification of write complete
2023-03-17T21:32:52 INFO common.c: Flash written and verified! jolly good!
2023-03-17T21:32:52 WARN common.c: NRST is not connected

Candas1 commented 1 year ago

https://github.com/EFeru/hoverboard-firmware-hack-FOC/wiki/Troubleshooting

wlamers commented 1 year ago

Thank you Candas1. I have indeed seen the troubleshooting section. The power supplies (15 V, referring to the TIP127s, 5 V and 3.3 V) are all OK. I can even flash the MCU using the onboard power supplies. The problem is that the MCU does not boot after reset. It does get power (3.3 V), but the firmware does not start. The boot1 pin (red LED) also does not turn on.

Any clue why the MCU does not boot?

Candas1 commented 1 year ago

Then it could be a hard fault caused by something wrong in your code changes, but not wrong enough to make the compilation fail. Use the latest code from the repository without changing anything but the variant.

RoboDurden commented 1 year ago

Try the original Niklas Fauth (now lucysrausch) firmware that only spins backwards and forwards for testing. @Candas1, is it possible to debug these V1 boards with PlatformIO?

Candas1 commented 1 year ago

> Try the original Niklas Fauth (now lucysrausch) firmware that only spins backwards and forwards for testing. @Candas1, is it possible to debug these V1 boards with PlatformIO?

Yes, I have done it often.

RoboDurden commented 1 year ago

Then @wlamers should be able to step into main(). If not even that happens I fear his board is broken :-/

wlamers commented 1 year ago

Ok, some updates...

After a long time debugging I have found the issue. I hope it helps someone else in the future.

I thought the MCU was faulty. Flashing and verifying worked, but the MCU did not boot (I tried many different code bases, including the original). So I swapped in the MCU from another board using a heat gun. I also replaced 6 MOSFETs that had burned because of the firmware issue.

Now the funny part. When I build the same code on Ubuntu 20 it works!

I'm on Gentoo and boiled it down to the root cause: the latest newlib (specifically version 4.3.0.20230120). I downgraded to 4.2.0.20211231 (both built with 'nano'), and now it finally works!

For future reference, this combination of packages works:

The reason I suspected a HW issue is that it did work before. I had upgraded my Gentoo system a few weeks earlier, which updated newlib, and I did not remember that. So when I flashed the previously compiled version without recompiling (and hence built with the older newlib), my board just worked. Then I changed some minor parameters in the code, compiled (hence with the updated newlib) and flashed it. From that point onwards the misery started and I suspected a HW issue...

It remains a mystery what exactly in the new newlib caused the issue.
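For anyone who wants to catch this kind of silent toolchain change at build time, here is a minimal sketch of a compile-time guard. It relies on the `__NEWLIB__` and `__NEWLIB_MINOR__` macros that newlib exposes through its headers. The 4.3 threshold is an assumption based only on the two versions named above (4.2.0 worked, 4.3.0 did not), and `newlib_older_than_4_3` is a hypothetical helper that is not part of the firmware.

```c
#include <stdio.h>  /* under newlib, libc headers pull in the version macros */

/* Hypothetical helper (not in the firmware): returns 1 if major.minor is
   older than 4.3, the first newlib release reported above as producing a
   non-booting image; returns 0 otherwise. */
int newlib_older_than_4_3(int major, int minor)
{
    return (major < 4) || (major == 4 && minor < 3);
}

/* Build-time check: only active when compiling against newlib.
   The threshold is an assumption taken from this thread. */
#if defined(__NEWLIB__) && defined(__NEWLIB_MINOR__)
#if (__NEWLIB__ > 4) || (__NEWLIB__ == 4 && __NEWLIB_MINOR__ >= 3)
#warning "newlib >= 4.3 detected; issue #389 reports a non-booting image with 4.3.0"
#endif
#endif
```

A warning rather than an error keeps the build usable if a later newlib release turns out to be fine.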

Thanks all for your input. And of course especially for this great project!

Candas1 commented 1 year ago

Thanks for sharing this. I thought about forcing fixed versions of those tools in PlatformIO to prevent such regressions, but it wouldn't have helped in your case.

One idea is also to use devcontainers

Candas1 commented 1 year ago

Or this

wlamers commented 1 year ago

Thanks Candas. Ah yes, that is way better than being dependent on packages installed on the local machine. Would have saved me a lot of trouble ;)

Candas1 commented 1 year ago

If it's not too late, would you be able to try the problematic version of newlib with this piece of code added here and here, in the while loop, like this:

while(1) {
  for(uint32_t x = 0; x < 6400000; x++) asm("");  // busy-wait delay; the empty asm keeps the loop from being optimized away
  HAL_GPIO_TogglePin(LED_PORT, LED_PIN);          // toggle the LED after each delay
}

The LED would blink if the MCU enters those interrupts.
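To try the blink logic on a PC before touching the board, a bounded host-side version of the same loop can be sketched as follows. Everything here is a stand-in: `fake_led_toggle`, `blink_in_handler`, `led_state` and `toggle_count` are hypothetical names, and on the real board the toggle would be `HAL_GPIO_TogglePin(LED_PORT, LED_PIN)` inside an endless `while(1)`.

```c
#include <stdint.h>

/* Host-side stand-ins for the LED pin (hypothetical, not firmware code). */
int led_state = 0;          /* 0 = off, 1 = on */
unsigned toggle_count = 0;  /* how many times the LED was toggled */

void fake_led_toggle(void)
{
    led_state = !led_state;
    toggle_count++;
}

/* Same shape as the diagnostic loop, but bounded so it terminates on a PC.
   In the real handler the outer loop would be while(1). */
void blink_in_handler(uint32_t blinks, uint32_t delay_iterations)
{
    for (uint32_t n = 0; n < blinks; n++) {
        /* volatile counter so the delay loop is not optimized away */
        for (volatile uint32_t x = 0; x < delay_iterations; x++) { }
        fake_led_toggle();
    }
}
```

On the board, a visibly blinking LED then means execution reached the handler that contains the loop.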