kanflo / opendps

Give your DPS5005 the upgrade it deserves
MIT License

Not working with DPS3005 and OpenOCD 0.10.0 (20190423) #144

Closed X-Ryl669 closed 5 years ago

X-Ryl669 commented 5 years ago

I have a DPS3005 and macOS, and did the following:

  1. Everything in the upgrade guide (except that I used the latest OpenOCD rather than the 2016 version you referred to).

  2. Modified the opendps/opendps/Makefile model line to read: MODEL := DPS3005

  3. Built everything without issue (I have an ARM GCC toolchain based on 7.2.1 from 20170904).

  4. Tried to launch openocd, but it failed because the scripts were outdated for the latest version.

  5. Ran openocd from /usr/local/gnu-mcu-eclipse/openocd/0.10.0-12-20190422-2015/scripts like this: $ openocd -f interface/stlink-v2.cfg -f target/stm32f1x.cfg

  6. Got this output:

    GNU MCU Eclipse OpenOCD, 64-bitOpen On-Chip Debugger 0.10.0+dev-00593-g23ad80df4 (2019-04-23-00:01)
    Licensed under GNU GPL v2
    For bug reports, read
    http://openocd.org/doc/doxygen/bugs.html
    Info : auto-selecting first available session transport "hla_swd". To override use 'transport select <transport>'.
    Info : The selected transport took over low-level target control. The results might differ compared to plain JTAG/SWD
    adapter speed: 1000 kHz
    adapter_nsrst_delay: 100
    none separate
    Info : Listening on port 6666 for tcl connections
    Info : Listening on port 4444 for telnet connections
    Info : clock speed 1000 kHz
    Info : STLINK V2J29S7 (API v2) VID:PID 0483:3748
    Info : Target voltage: 3.221551
    Info : stm32f1x.cpu: hardware has 6 breakpoints, 4 watchpoints
    Info : Listening on port 3333 for gdb connections
  7. Saved the 5V on/off and 3.3V on/off files (worked great). I'm not pasting them here because they are huge; let me know if you need them.

  8. Tried to unlock the flash, but this failed:

    
    telnet localhost 4444
    Trying ::1...
    Connection failed: Connection refused
    Trying 127.0.0.1...
    Connected to localhost.
    Escape character is '^]'.
    Open On-Chip Debugger
    > reset halt
    target halted due to debug-request, current mode: Thread
    xPSR: 0x01000000 pc: 0x080001d0 msp: 0x20000a08
    > flash erase_address unlock 0x08000000 0x10000
    device id = 0x10016420
    STM32 flash size failed, probe inaccurate - assuming 128k flash
    flash size = 128kbytes
    stm32x device protected
    failed erasing sectors 0 to 63

shutdown command invoked

Then, after a reboot, the DPS3005 did not show anything on the screen, and I got exactly the same message. Even after an explicit unlock, the erase failed:

    > stm32f1x unlock 0
    stm32x unlocked.
    INFO: a reset or power cycle is required for the new settings to take effect.
    > reset halt
    target halted due to debug-request, current mode: Thread
    xPSR: 0x01000000 pc: 0x00010100 msp: 0x464c457c
    > flash erase_address unlock 0x08000000 0x10000
    stm32x device protected
    failed erasing sectors 0 to 63

9. So, I started my ST-UTIL tool on Windows, and removed all write protection for all sectors (via "Options register" menu). 
10. Then I checked the erase could be done via OpenOCD:

    > reset halt
    target halted due to debug-request, current mode: Thread
    xPSR: 0x01000000 pc: 0x00010100 msp: 0x464c457c
    > flash erase_address unlock 0x08000000 0x10000
    device id = 0x10016420
    flash size = 64kbytes
    erased address 0x08000000 (length 65536) in 0.084647s (756.081 KiB/s)

so it appears to have worked.
11. Then I tried to flash the opendps firmware, but got this:

    make -C opendps flash
    FLASH opendps.elf
    (echo "halt; program ./opendps/opendps/opendps.elf verify reset" | nc -4 localhost 4444 2>/dev/null) || \
    openocd -f interface/stlink-v2.cfg \
    -f target/stm32f1x.cfg \
    -c "program opendps.elf verify reset exit" \
    2>/dev/null
    Open On-Chip Debugger
    halt; program ./opendps/opendps/opendps.elf verify reset
    target halted due to debug-request, current mode: Thread
    xPSR: 0x61000000 pc: 0x0800091c msp: 0x20001f48
    target halted due to debug-request, current mode: Thread
    xPSR: 0x01000000 pc: 0x080010d0 msp: 0x20001ff0
    Programming Started
    auto erase enabled
    timeout waiting for algorithm, a target reset is recommended
    flash write failed at address 0x8000000
    error writing to flash at address 0x08000000 at offset 0x00000000
    embedded:startup.tcl:479: Error: Programming Failed
    in procedure 'program'
    in procedure 'program_error' called at file "embedded:startup.tcl", line 538
    at file "embedded:startup.tcl", line 479

Yet, the dpsboot flashed correctly:

    make -C dpsboot flash
    FLASH dpsboot.elf
    (echo "halt; program ./opendps/dpsboot/dpsboot.elf verify reset" | nc -4 localhost 4444 2>/dev/null) || \
    openocd -f interface/stlink-v2.cfg \
    -f target/stm32f1x.cfg \
    -c "program dpsboot.elf verify reset exit" \
    2>/dev/null
    Open On-Chip Debugger
    halt; program ./opendps/dpsboot/dpsboot.elf verify reset
    target halted due to debug-request, current mode: Handler HardFault
    xPSR: 0x00000003 pc: 00000000 msp: 0x464c455c
    target halted due to debug-request, current mode: Thread
    xPSR: 0x01000000 pc: 0x00010100 msp: 0x464c457c
    Programming Started
    auto erase enabled
    wrote 5120 bytes from file /Users/cyril/tmp/opendps/dpsboot/dpsboot.elf in 0.338600s (14.767 KiB/s)
    Programming Finished
    Verify Started
    verified 4720 bytes in 0.115515s (39.903 KiB/s)
    Verified OK
    Resetting Target


12. Since OpenOCD had trouble flashing the elf file, I converted opendps.elf to opendps.bin via `arm-none-eabi-objcopy -O binary -S opendps.elf opendps.bin`, and used ST-UTIL on Windows to flash it successfully at address 0x08000000.
13. Now everything is flashed, but the DPS does not display anything and I don't know whether it's starting.
14. I also tried flashing dpsboot again (so application **then** bootloader), but it's still not starting. I've dumped the flash and it's bit-for-bit identical to the bootloader or the app (either one).

So, this leads me to the following questions:

  1. What is the expected flash address for each program? Judging by the error message above, openocd flashes the elf file at address 0x8000000 by default, so the app and the bootloader are both flashed at the same place. That would explain why you need to flash the bootloader after the app, but not why the app would work at all without its first ~5000 bytes.
  2. I've diffed the scripts of both openocd versions and the differences are minor. I don't know why openocd refuses to unlock my STM32's flash, or why it fails to program the app while happily flashing the bootloader.

X-Ryl669 commented 5 years ago

I've read the code and it seems that the app needs to be flashed at 0x8001400, and it worked... once?
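To make the layout concrete, here is a minimal sketch of the arithmetic in C. It assumes 1 KB flash pages (an assumption at this point, though consistent with "sectors 0 to 63" covering 0x10000 bytes in the log above) and uses the two addresses from this thread. The constant and function names are made up for illustration and are not taken from the OpenDPS sources.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants; the names are hypothetical, the values come from this thread. */
#define FLASH_BASE_ADDR  0x08000000u  /* where dpsboot is flashed */
#define APP_START_ADDR   0x08001400u  /* where the app is flashed */
#define FLASH_PAGE_SIZE  1024u        /* assumed page size (1 KB) */

/* Bytes in front of the app, i.e. the space left for the bootloader. */
uint32_t bootloader_region_size(void)
{
    return APP_START_ADDR - FLASH_BASE_ADDR;
}

/* Index of the flash page that contains addr. */
uint32_t flash_page_index(uint32_t addr)
{
    assert(addr >= FLASH_BASE_ADDR);
    return (addr - FLASH_BASE_ADDR) / FLASH_PAGE_SIZE;
}
```

With these values, bootloader_region_size() is 0x1400 = 5120 bytes, the same figure OpenOCD printed when it wrote dpsboot.elf, and flash_page_index(0x08001400) is 5, so the app starts on the sixth 1 KB page.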

kanflo commented 5 years ago

Did it boot to the application once? Hmmm... What happens if you start the DPS and halt it using OpenOCD? What is the value of PC?

X-Ryl669 commented 5 years ago

Made some progress. Via gdb, I see that it's crashing in the for loop of ili9163c_fill_screen. It resets there, and since that's one of the first functions called (to clear the screen), I get a boot loop. I tried cutting the fill short (via gdb, I set the i variable to 3000 to let it exit the loop); the screen then displayed something, but it crashed later on.

X-Ryl669 commented 5 years ago

When the screen is displayed, I do see the wifi icon blinking. My main problem is that I'm not sure whether openocd's program command writes more than the .text section to flash. If I manage to debug openocd I might make progress, but openocd is behaving weirdly for me. When I issue this:

flash info 0
device id = 0x10016420
flash size = 64kbytes
#0 : stm32f1x at 0x08000000, size 0x00010000, buswidth 0, chipwidth 0
    #  0: 0x00000000 (0x1000 4kB) not protected
    #  1: 0x00001000 (0x1000 4kB) not protected
    #  2: 0x00002000 (0x1000 4kB) not protected
    #  3: 0x00003000 (0x1000 4kB) not protected
    #  4: 0x00004000 (0x1000 4kB) not protected
    #  5: 0x00005000 (0x1000 4kB) not protected
    #  6: 0x00006000 (0x1000 4kB) not protected
    #  7: 0x00007000 (0x1000 4kB) not protected
    #  8: 0x00008000 (0x1000 4kB) not protected
    #  9: 0x00009000 (0x1000 4kB) not protected
    # 10: 0x0000a000 (0x1000 4kB) not protected
    # 11: 0x0000b000 (0x1000 4kB) not protected
    # 12: 0x0000c000 (0x1000 4kB) not protected
    # 13: 0x0000d000 (0x1000 4kB) not protected
    # 14: 0x0000e000 (0x1000 4kB) not protected
    # 15: 0x0000f000 (0x1000 4kB) not protected
STM32F100 (Low/Medium Density) - Rev: Z

I would have expected to find 64 pages of 1024 bytes (that's what ST-LINK shows on Windows). Here it's reporting 16 pages of 4096 bytes. If that is the granularity openocd really works with, programming the main program at address 0x8001400 is bound to fail, since only 0x8001000 and 0x8002000 are possible block starts.
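The granularity argument can be sketched in C as follows. This only illustrates the alignment arithmetic; whether openocd actually erases or programs in 4 KB units here is not established in this thread, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FLASH_START 0x08000000u

/* True if addr sits on a block boundary for the given granularity (in bytes). */
bool is_block_aligned(uint32_t addr, uint32_t granularity)
{
    assert(granularity != 0u);
    return ((addr - FLASH_START) % granularity) == 0u;
}

/* Start address of the block that contains addr. */
uint32_t block_start(uint32_t addr, uint32_t granularity)
{
    assert(granularity != 0u);
    return addr - ((addr - FLASH_START) % granularity);
}
```

With 1024-byte pages, 0x08001400 is aligned. With 4096-byte blocks it is not, and block_start(0x08001400, 4096) is 0x08001000, a block that would also contain the last kilobyte of the bootloader region.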

kanflo commented 5 years ago

That is indeed strange. The device ID matches mine and the STM32F100 has, as you say, 64 pages of 1 KB. Could you try changing the start address of the app (stm32f100_app.ld) to 0x08002000 (remember to change this address in stm32f100_boot.ld too) to see if that helps? You should probably remove all flash access in past.c.
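For anyone comparing device IDs, here is a small C sketch that splits the value printed by openocd into its fields. It assumes the usual STM32 DBGMCU_IDCODE layout (DEV_ID in bits 11:0, REV_ID in bits 31:16); the struct and function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of the ID code, assuming the usual STM32 DBGMCU_IDCODE layout. */
struct stm32_idcode {
    uint16_t dev_id;  /* bits 11:0, identifies the device family */
    uint16_t rev_id;  /* bits 31:16, identifies the silicon revision */
};

/* Split a raw ID code, as printed by openocd, into its two fields. */
void decode_idcode(uint32_t idcode, struct stm32_idcode *out)
{
    assert(out != NULL);
    out->dev_id = (uint16_t)(idcode & 0x0FFFu);
    out->rev_id = (uint16_t)(idcode >> 16);
}
```

For 0x10016420 this gives DEV_ID 0x420 and REV_ID 0x1001, which openocd reports above as "STM32F100 (Low/Medium Density) - Rev: Z".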

X-Ryl669 commented 5 years ago

No, in the end it does not change anything. I even tried installing the same openocd version as yours and it does not work either (even using your openocd/scripts). If I skip the spi_dma_transceive call, it gets further in the boot and "blinks" because of the reboot loop, but at least it displays the main screen each time.

void ili9163c_fill_screen(uint16_t color)
{
    uint32_t i;
    uint8_t hi = color >> 8;
    uint8_t lo = color & 0xff;
    uint8_t fill[] = {hi, lo, hi, lo, hi, lo, hi, lo, hi, lo, hi, lo, hi, lo, hi, lo};
    uint8_t dummy[sizeof(fill)];
    gpio_clear(TFT_A0_PORT, TFT_A0_PIN);
//    ili9163c_set_window(0, 0, _GRAMWIDTH+2, _GRAMHEIGH); // Note! For some reason filling WxH is results in two vertical lines to the far right...
    ili9163c_set_window(0, 0, _GRAMWIDTH, 128);//_GRAMHEIGH);
    gpio_set(TFT_A0_PORT, TFT_A0_PIN);

//    for (i = 0;i < 128*128/*(_GRAMWIDTH * _GRAMHEIGH)*//(sizeof(fill)/2); i++) {
//        (void) spi_dma_transceive((uint8_t*) fill, sizeof(fill), (uint8_t*) dummy, sizeof(dummy));
//    }

}

BTW, my screen PCB is neither "red" nor "black" (I tried both options, neither works) but green.

X-Ryl669 commented 5 years ago

Ok, I've made some progress here:

  1. I've set breakpoints on every handler in the vector table (one after another). None of them is hit during the "reboot" loop, so I deduced that it's not rebooting.
  2. I've restored ili9163c_fill_screen to the original version and added a dbg_printf("i: %u\n", i) after the for loop. The message is printed, yet execution never gets any further.
  3. I've disassembled the code here and got this:
    [...]
    76:   f7ff fffe       bl      0 <spi_dma_transceive>
    7a:   3c01            subs    r4, #1
    7c:   d1f6            bne.n   6c <ili9163c_fill_screen+0x6c>
    7e:   f640 2128       movw    r1, #2600       ; 0xa28
    82:   4803            ldr     r0, [pc, #12]   ; (10 <dbg_printf+0x10>)
    84:   f7ff fffe       bl      0 <dbg_printf>
    88:   b008            add     sp, #32
    8a:   bd10            pop     {r4, pc}
    8c:   40010c00        .word   0x40010c00
    90:   00000000        .word   0x00000000

    At address 8a, the pop restores r4 and loads the PC register with the return address that was saved on the stack when the function was entered. I single-stepped up to this point and found the address of main() / delay_ms() on the stack, instead of the address of what comes after tft_clear(). So what I observed was a loop across function boundaries: fill_screen does not return to its caller, because the return address on the stack points before the tft_clear() line rather than after it. To me, either the compiler is misbehaving, or a stack corruption has overwritten the return address.
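As a side note on the `bd10` opcode at address 8a, here is a minimal C sketch that decodes the 16-bit Thumb POP encoding (1011 110P rrrrrrrr, where P means "also pop PC"). The function name is hypothetical; it is only meant to show why this single instruction acts as the function return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 16-bit Thumb POP (encoding 1011 110P rrrrrrrr).
 * On success, *low_regs receives a bitmask of r0..r7 and *loads_pc tells
 * whether PC is popped as well (the P bit). Returns false for other opcodes. */
bool decode_thumb_pop(uint16_t insn, uint8_t *low_regs, bool *loads_pc)
{
    assert(low_regs != NULL && loads_pc != NULL);
    if ((insn & 0xFE00u) != 0xBC00u) {
        return false;  /* not a 16-bit POP */
    }
    *low_regs = (uint8_t)(insn & 0x00FFu);
    *loads_pc = (insn & 0x0100u) != 0u;
    return true;
}
```

For 0xbd10 the register mask is 0x10 (only r4) and the P bit is set, so the word that the function prologue pushed from LR is loaded straight into PC. Whatever value sits in that stack slot is where execution continues.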

Which compiler version are you using? I have arm-none-eabi-gcc (GNU Tools for Arm Embedded Processors 7-2017-q4-major) 7.2.1 20170904 (release) [ARM/embedded-7-branch revision 255204]

I'm digging further to check whether the stack is corrupted (by looking at what it contains when entering the function).

X-Ryl669 commented 5 years ago

My mistake. The stack is not corrupted; fill_screen is also called in tft_init, so having delay_ms as the return point is actually correct. It's the later fill_screen call in tft_clear that does not return.

X-Ryl669 commented 5 years ago

I'm a bit stuck here. Is there a watchdog on this system?

X-Ryl669 commented 5 years ago

Ok, I've spent a lot of time on this and finally found the solution. I've added this function in the bootloader:

#define RCC_CSR_RESET_FLAGS (RCC_CSR_LPWRRSTF | RCC_CSR_WWDGRSTF | \
                             RCC_CSR_IWDGRSTF | RCC_CSR_SFTRSTF | \
                             RCC_CSR_PORRSTF | RCC_CSR_PINRSTF)

const char * get_system_reset_cause(void)
{
    const char * reset_cause = "TBD";
    uint32_t rccFlag = RCC_CSR & RCC_CSR_RESET_FLAGS;

    if (rccFlag & RCC_CSR_LPWRRSTF)
    {
        reset_cause = "LOW_POWER_RESET";
    }
    else if (rccFlag & RCC_CSR_WWDGRSTF)
    {
        reset_cause = "WINDOW_WATCHDOG_RESET";
    }
    else if (rccFlag & RCC_CSR_IWDGRSTF)
    {
        reset_cause = "INDEPENDENT_WATCHDOG_RESET";
    }
    else if (rccFlag & RCC_CSR_SFTRSTF)
    {
        reset_cause = "SOFTWARE_RESET"; // This reset is induced by calling the ARM CMSIS `NVIC_SystemReset()` function!
    }
    else if (rccFlag & RCC_CSR_PORRSTF)
    {
        reset_cause = "POWER-ON_RESET (POR) / POWER-DOWN_RESET (PDR)";
    }
    else if (rccFlag & RCC_CSR_PINRSTF)
    {
        reset_cause = "EXTERNAL_RESET_PIN_RESET";
    }

    // Clear all the reset flags, or else they will remain set across future resets until system power is fully removed.
    // The flag bits themselves are read-only; hardware clears them when RMVF is set.
    RCC_CSR |= RCC_CSR_RMVF;

    return reset_cause;
}

And figured out that the reset was caused by INDEPENDENT_WATCHDOG_RESET. I checked your code and found no place that deals with this IWDG. Then I wondered why nobody else had hit the same issue, and remembered the flash protection issue, where every bank was protected. So I double-checked the flash's option bytes. And bingo, here's what I had:

[image: screenshot of the option bytes]

The help text for the WDG_SW bit is "Unchecked: HW independant watchdog, Checked: Software watchdog", so once I checked it, the IWDG was no longer enabled by hardware upon reset and the system worked! It seems the manufacturer set those bits, in addition to the flash protection, on my unit.
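For reference, here is a small C sketch of how the USER option byte can be interpreted. It assumes the bit layout documented for STM32F10x parts (bit 0 WDG_SW, bit 1 nRST_STOP, bit 2 nRST_STDBY); the struct and function names are hypothetical and nothing here comes from the OpenDPS sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded USER option byte, assuming the STM32F10x bit layout. */
struct user_option_bits {
    bool wdg_sw;      /* bit 0: true = software watchdog, false = hardware watchdog */
    bool nrst_stop;   /* bit 1: true = no reset generated when entering Stop mode */
    bool nrst_stdby;  /* bit 2: true = no reset generated when entering Standby mode */
};

/* Split the raw USER option byte into its three documented bits. */
void decode_user_option_byte(uint8_t user, struct user_option_bits *out)
{
    assert(out != NULL);
    out->wdg_sw     = (user & 0x01u) != 0u;
    out->nrst_stop  = (user & 0x02u) != 0u;
    out->nrst_stdby = (user & 0x04u) != 0u;
}

/* True if the IWDG starts on its own after every reset (WDG_SW cleared). */
bool iwdg_started_by_hardware(uint8_t user)
{
    return (user & 0x01u) == 0u;
}
```

With WDG_SW cleared, the watchdog runs from reset whether or not the firmware knows about it, so code that never reloads the IWDG gets reset periodically. That matches the INDEPENDENT_WATCHDOG_RESET cause reported above.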

Maybe it's worth summing up all the changes I had to make to get this to work:

  1. openocd's unlocking does not work on my system, so I had to use the official ST-LINK utility to remove the protection from all flash pages. This is in the Target / Options byte page of the tool.
  2. On the same page, check WDG_SW so that the independent watchdog is no longer started by hardware on reset.
  3. Because openocd failed to program the elf file, I had to do that in ST-LINK as well. But ST-LINK does not accept elf files, so you must convert them to .bin files with this command (and likewise for opendps.elf):
    arm-none-eabi-objcopy -O binary -S dpsboot.elf dpsboot.bin
  4. Remember to flash dpsboot.bin at address 0x8000000 and opendps.bin at address 0x8001400 (select the bin file first, because doing so resets the address to 0x8000000; then change the address).

Hope this helps!

kanflo commented 5 years ago

Glad you sorted it out, and thanks for the detailed writeup; it will help anyone who needs to unlock using ST-LINK. On a side note, the IWDG should be used to make sure we never miss an OCP interrupt.