zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0

Add ESP32-* OpenOCD / GDB Zephyr Thread Awareness #62791

Open EricNRS opened 11 months ago

EricNRS commented 11 months ago

Reporting this on ESP32-S3 since that was what I had tested, but this likely applies to all ESP32-* Xtensa and RISC-V cores.

OpenOCD v0.12.0 added Zephyr OS awareness and also supports the ESP32-S3 chip directly. However, setting CONFIG_DEBUG_THREAD_INFO=y results in a debug error:

Open On-Chip Debugger 0.12.0+dev-01324-gfb52ba4fa (2023-09-19-00:01)
Licensed under GNU GPL v2
For bug reports, read
    http://openocd.org/doc/doxygen/bugs.html
Info : only one transport option; autoselecting 'jtag'
Info : esp_usb_jtag: VID set to 0x303a and PID to 0x1001
Info : esp_usb_jtag: capabilities descriptor set to 0x2000
force hard breakpoints
can't read "_TARGETNAME": no such variable

Adding set _TARGETNAME esp32s3.cpu1 to the openocd.cfg file to bypass probing then produces this OpenOCD error:

Info : Zephyr: looking for target: esp32s3
Error: Could not find target in Zephyr compatibility list

The v0.12 OpenOCD source shows that its Zephyr RTOS support covers only the Arm and ARCv2 architectures:

static struct zephyr_params zephyr_params_list[] = {
    {
        .target_name = "cortex_m",
        .pointer_width = 4,
        .callee_saved_stacking = &arm_callee_saved_stacking,
        .cpu_saved_nofp_stacking = &arm_cpu_saved_nofp_stacking,
        .cpu_saved_fp_stacking = &arm_cpu_saved_fp_stacking,
        .get_cpu_state = &zephyr_get_arm_state,
    },
    {
        .target_name = "cortex_r4",
        .pointer_width = 4,
        .callee_saved_stacking = &arm_callee_saved_stacking,
        .cpu_saved_nofp_stacking = &arm_cpu_saved_nofp_stacking,
        .cpu_saved_fp_stacking = &arm_cpu_saved_fp_stacking,
        .get_cpu_state = &zephyr_get_arm_state,
    },
    {
        .target_name = "hla_target",
        .pointer_width = 4,
        .callee_saved_stacking = &arm_callee_saved_stacking,
        .cpu_saved_nofp_stacking = &arm_cpu_saved_nofp_stacking,
        .cpu_saved_fp_stacking = &arm_cpu_saved_fp_stacking,
        .get_cpu_state = &zephyr_get_arm_state,

    },
    {
        .target_name = "arcv2",
        .pointer_width = 4,
        .callee_saved_stacking = &arc_callee_saved_stacking,
        .cpu_saved_nofp_stacking = &arc_cpu_saved_stacking,
        .get_cpu_state = &zephyr_get_arc_state,
    },
    {
        .target_name = NULL
    }
};
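To illustrate why the lookup fails, here is a minimal Python model of a name-based search over the table above. It is only a sketch: the real lookup is C code in OpenOCD's src/rtos/zephyr.c, and the names ZephyrParams and find_zephyr_params are hypothetical. The assumption is that OpenOCD compares the target type name ("esp32s3" in the log) against each .target_name entry.

```python
from typing import NamedTuple, Optional


class ZephyrParams(NamedTuple):
    """Simplified stand-in for one entry of zephyr_params_list."""
    target_name: str
    pointer_width: int


# Mirrors only the target names and pointer widths quoted above.
ZEPHYR_PARAMS_LIST = [
    ZephyrParams("cortex_m", 4),
    ZephyrParams("cortex_r4", 4),
    ZephyrParams("hla_target", 4),
    ZephyrParams("arcv2", 4),
]


def find_zephyr_params(target_type: str) -> Optional[ZephyrParams]:
    """Return the matching entry, or None when the target is not listed."""
    for params in ZEPHYR_PARAMS_LIST:
        if params.target_name == target_type:
            return params
    return None


if __name__ == "__main__":
    for name in ("cortex_m", "esp32s3"):
        status = "target known" if find_zephyr_params(name) else "not in compatibility list"
        print(f"{name}: {status}")
```

Under this model, supporting a new chip means adding a table entry plus the stacking and CPU-state helpers that the entry points to.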

See the following links for the code

Describe the solution you'd like

Add Zephyr RTOS thread awareness to OpenOCD / GDB to enable multi-threaded debugging when running west debug for ESP32-S3 targets (this would likely work for ESP32 and ESP32-S2 targets as well).

Describe alternatives you've considered

No known alternatives to this for Zephyr RTOS other than using the tracing and logging frameworks for debugging.

rftafas commented 11 months ago

Just to have it registered here: we are not currently working on Zephyr OpenOCD and will do that after v0.12 is integrated (I can't promise when, though). For now, users need to move up to the latest OpenOCD or use Espressif's.

For now, this tutorial (original - google translator) may be of help.

EricNRS commented 11 months ago

If anyone familiar with what this work entails could comment, it would be appreciated.

Since OpenOCD v0.12 already supports Zephyr and ESP32-*, it looks like the work may be limited to just implementing the following functions:

https://github.com/openocd-org/openocd/blob/master/src/rtos/zephyr.c#L789-L797

const struct rtos_type zephyr_rtos = {
        .name = "Zephyr",

        .detect_rtos = zephyr_detect_rtos,
        .create = zephyr_create,
        .update_threads = zephyr_update_threads,
        .get_thread_reg_list = zephyr_get_thread_reg_list,
        .get_symbol_list_to_lookup = zephyr_get_symbol_list_to_lookup,
};

erhankur commented 10 months ago

It's a bit complicated.

First, to make things work on dual-core Xtensa targets, we need to meet SMP requirements. We've already done a lot for FreeRTOS.

Dealing with stacking on Xtensa can be tricky, and while I haven't checked how Zephyr handles it, we've implemented some special functions for FreeRTOS. NuttX might be a better reference.

Using Xtensa chips with one core can be the first step. SMP support can be added as a 2nd step.

I'm currently working on RISC-V thread awareness, so I'll look into Xtensa later. However, I can't say when exactly because it's not a priority for us right now.

If you want to have a look: in addition to struct rtos_type, you need to fill in the rtos_register_stacking structs and implement the stack read functions. The rest should work as is. Check how it is done for NuttX and FreeRTOS in the Espressif fork.
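As a rough illustration of what a register stacking description does, the Python sketch below maps register names to byte offsets inside a saved stack frame and decodes them. The layout in EXAMPLE_STACKING is invented for illustration and is not the real Xtensa or Zephyr frame; StackRegisterOffset and read_stacked_registers are hypothetical names, loosely modelled on the offset and width fields of OpenOCD's stack_register_offset entries.

```python
import struct
from typing import Dict, List, NamedTuple


class StackRegisterOffset(NamedTuple):
    name: str
    offset: int       # byte offset from the saved stack pointer
    width_bits: int   # register width


# Hypothetical frame layout, for illustration only.
EXAMPLE_STACKING: List[StackRegisterOffset] = [
    StackRegisterOffset("pc", 0x00, 32),
    StackRegisterOffset("ps", 0x04, 32),
    StackRegisterOffset("a0", 0x08, 32),
    StackRegisterOffset("a1", 0x0C, 32),
]


def read_stacked_registers(frame: bytes,
                           stacking: List[StackRegisterOffset]) -> Dict[str, int]:
    """Decode little-endian 32-bit registers from a saved stack frame."""
    registers = {}
    for reg in stacking:
        if reg.width_bits != 32:
            raise ValueError(f"unsupported width for {reg.name}")
        (value,) = struct.unpack_from("<I", frame, reg.offset)
        registers[reg.name] = value
    return registers


if __name__ == "__main__":
    frame = struct.pack("<4I", 0x40380000, 0x00060020, 0x4037C9AC, 0x3FCAD998)
    for name, value in read_stacked_registers(frame, EXAMPLE_STACKING).items():
        print(f"{name} = {value:#010x}")
```

The hard part on Xtensa is that the frame layout depends on the register window state, which is why a fixed table like this is likely only a starting point.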

EricNRS commented 10 months ago

@erhankur Thank you for your reply. Single core is a good first step as SMP is not supported yet and AMP support was only recently added (at least for the ESP32-S3 target). I will follow your progress on the RISC-V thread awareness in case I get time to work on the Xtensa side.

erhankur commented 10 months ago

@EricNRS Initial work is here.

https://github.com/erhankur/openocd-esp32/tree/zephyr_riscv_thread_awareness

erhankur commented 9 months ago

@EricNRS I added esp32 support. Please check it out from the above repo. esp32s3 will be coming soon.

I had to make a change in the thread_info.c. I don't know if it is the correct way to get the stack pointer, though.

#elif defined(CONFIG_XTENSA)
    [THREAD_INFO_OFFSET_T_STACK_PTR] = offsetof(struct k_thread,
                        switch_handle),
#else

EricNRS commented 9 months ago

Awesome! Thank you for your work on this. On a tight schedule this week, but will give it a try when I have a chance.

erhankur commented 9 months ago

This is the example command to run OpenOCD with a single core config.

$ openocd -c 'set ESP_FLASH_SIZE 0; set ESP_RTOS Zephyr; set ESP32_ONLYCPU 1' -s tcl -f board/esp32-ethernet-kit-3.3v.cfg

brunocosta22 commented 8 months ago

In my case, adding the command "set ESP_FLASH_SIZE 0" as the first line of my board's OpenOCD config file "openocd.cfg" made it work!
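For scripted use, the same command line can be assembled from a dictionary of settings. This is only a convenience sketch: build_openocd_command is a hypothetical helper, and the variable names and board file are the ones quoted in this thread.

```python
import shlex
from typing import Dict, List, Union


def build_openocd_command(board_cfg: str,
                          settings: Dict[str, Union[str, int]],
                          search_dir: str = "tcl") -> List[str]:
    """Build an argv list that sets Tcl variables before loading the board file."""
    setup = "; ".join(f"set {name} {value}" for name, value in settings.items())
    return ["openocd", "-c", setup, "-s", search_dir, "-f", board_cfg]


if __name__ == "__main__":
    cmd = build_openocd_command(
        "board/esp32-ethernet-kit-3.3v.cfg",
        {"ESP_FLASH_SIZE": 0, "ESP_RTOS": "Zephyr", "ESP32_ONLYCPU": 1},
    )
    # Print a shell-quoted form; pass `cmd` itself to subprocess.run() to launch it.
    print(shlex.join(cmd))
```

Passing the list form to subprocess avoids shell quoting problems with the semicolons in the -c argument.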

EricNRS commented 8 months ago

@erhankur - did you have a chance to add ESP32-S3 support, yet?

As a quick try, I copied the esp32 target entry to esp32s3.

> diff --git a/src/rtos/zephyr.c b/src/rtos/zephyr.c
> index 12c3fb65b..eb2fa375e 100644
> --- a/src/rtos/zephyr.c
> +++ b/src/rtos/zephyr.c
> @@ -575,6 +575,12 @@ static struct zephyr_params zephyr_params_list[] = {
>                 .callee_saved_stacking = &esp32_callee_saved_stacking,
>                 .get_cpu_state = &zephyr_get_esp32_state,
>         },
> +       {
> +               .target_name = "esp32s3",
> +               .pointer_width = 4,
> +               .callee_saved_stacking = &esp32_callee_saved_stacking,
> +               .get_cpu_state = &zephyr_get_esp32_state,
> +       },
>         {
>                 .target_name = "esp32c2",
>                 .pointer_width = 4,

But I'm getting an odd error message during the openocd connection process if I add set ESP_RTOS Zephyr. If I leave it at the default set ESP_RTOS none, openocd works correctly (just without Zephyr awareness).

Info : only one transport option; autoselecting 'jtag'
Info : esp_usb_jtag: VID set to 0x303a and PID to 0x1001
Info : esp_usb_jtag: capabilities descriptor set to 0x2000

# Found the Zephyr target, so that's good
Info : Zephyr: looking for target: esp32s3
Info : Zephyr: target known, params at 0x559d84328e08
Info : Zephyr: looking for target: esp32s3
Info : Zephyr: target known, params at 0x559d84328e08

# This error is very strange as esp_common.cfg:9 is the line "catch {[source [find target/esp_version.cfg]]}"
/home/work/openocd-v.0.12.0/install/share/openocd/scripts/target/esp_common.cfg:9: Error: 
at file "/home/work/openocd-v.0.12.0/install/share/openocd/scripts/target/esp_common.cfg", line 9
GNU gdb (Zephyr SDK 0.16.1) 12.1

erhankur commented 8 months ago

I have updated here

However, my initial tests show that stacking does not work well. I believe the addition below is not correct.

#elif defined(CONFIG_XTENSA)
    [THREAD_INFO_OFFSET_T_STACK_PTR] = offsetof(struct k_thread,
                        switch_handle),
#else

I have switched to another task for now and will come back to this later. If you have time to add a solid way to get the stack_ptr on the Zephyr side, that would be perfect. The OpenOCD implementation will stay more or less the same: we will adjust the zephyr_get_xtensa_state function according to the Zephyr symbols and window settings, and maybe the stack_register_offset tables. The same applies to ESP32, by the way.
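To make the role of THREAD_INFO_OFFSET_T_STACK_PTR concrete, the sketch below models the two-step lookup a debugger would perform: take the field offset from an exported offset table, then read the stack pointer at that offset inside the thread structure. All addresses and offsets here are made up, and the dictionary and function names are hypothetical; the real offsets come from the target's symbols at debug time.

```python
import struct
from typing import Dict

# Hypothetical offset table, standing in for the one exported by thread_info.c.
THREAD_INFO_OFFSETS: Dict[str, int] = {
    "T_STACK_PTR": 0x30,  # made-up offset of the saved stack pointer field
}


def read_thread_stack_ptr(memory: bytes, thread_addr: int,
                          offsets: Dict[str, int]) -> int:
    """Read a little-endian 32-bit stack pointer from a thread structure."""
    field_addr = thread_addr + offsets["T_STACK_PTR"]
    (stack_ptr,) = struct.unpack_from("<I", memory, field_addr)
    return stack_ptr


if __name__ == "__main__":
    memory = bytearray(256)       # fake target memory image
    thread_addr = 0x40            # fake address of a struct k_thread
    struct.pack_into("<I", memory, thread_addr + 0x30, 0x3FCAD998)
    print(hex(read_thread_stack_ptr(bytes(memory), thread_addr, THREAD_INFO_OFFSETS)))
```

If the offset points at a field that does not actually hold the saved stack pointer, every register decoded from that address will be wrong, which would be consistent with the stacking problems described above.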

erhankur commented 8 months ago

One more issue I have seen: the debug symbol address in OpenOCD looks different from the one in the actual application. I haven't been able to investigate it in depth.

EricNRS commented 8 months ago

@erhankur - thanks for the update. I had to do the following changes to get OpenOCD and GDB to load correctly.

PROJECT/prj.conf

CONFIG_DEBUG_THREAD_INFO=y

boards/xtensa/BOARD/support/openocd.cfg

set ESP_RTOS Zephyr
set ESP_FLASH_SIZE 0
set ESP32_S3_ONLYCPU 1

boards/xtensa/BOARD/board.cmake (to fix can't read "_TARGETNAME": no such variable; see https://github.com/zephyrproject-rtos/zephyr/issues/45778 for details)

board_runner_args(openocd --target-handle=_CHIPNAME.cpu0)

scripts/west_commands/runners/openocd.py

--- a/scripts/west_commands/runners/openocd.py
+++ b/scripts/west_commands/runners/openocd.py
@@ -199,6 +199,11 @@ class OpenOcdBinaryRunner(ZephyrBinaryRunner):
         # Zephyr rtos was introduced after 0.11.0
         version_str = self.read_version().split(' ')[3]
         version = version_str.split('.')
+
+        # Parsing fails for "v0.12.0-esp32-20230921-87-ge45cd609" because of the 'v' prefix
+        print("Version: ", version)
+        if len(version[0]) > 1 and version[0][0] == 'v':
+            version[0] = version[0][1:]
         (major, minor, rev) = [self.to_num(i) for i in version]
         return (major, minor, rev) > (0, 11, 0)
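
The version check can also be written so that it tolerates both the 'v' prefix and trailing build suffixes. The following is a standalone sketch, not the runner's actual code: parse_openocd_version and supports_zephyr_rtos are hypothetical helpers, and only the version strings quoted in this thread were considered.

```python
import re
from typing import Tuple

# Optional leading 'v', then major.minor.rev; anything after that is ignored.
_VERSION_RE = re.compile(r"v?(\d+)\.(\d+)\.(\d+)")


def parse_openocd_version(version_str: str) -> Tuple[int, int, int]:
    """Extract (major, minor, rev) from strings such as 'v0.12.0-esp32-...'."""
    match = _VERSION_RE.match(version_str)
    if match is None:
        raise ValueError(f"unrecognized OpenOCD version: {version_str!r}")
    major, minor, rev = (int(part) for part in match.groups())
    return (major, minor, rev)


def supports_zephyr_rtos(version_str: str) -> bool:
    """Zephyr RTOS awareness was introduced after 0.11.0."""
    return parse_openocd_version(version_str) > (0, 11, 0)


if __name__ == "__main__":
    for text in ("v0.12.0-esp32-20230921-87-ge45cd609",
                 "0.12.0+dev-01324-gfb52ba4fa",
                 "0.11.0"):
        print(text, "->", parse_openocd_version(text), supports_zephyr_rtos(text))
```

Matching with a regular expression avoids depending on how the suffix after the revision number happens to be formatted.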

EricNRS commented 7 months ago

@erhankur I have been using GDB extensively for the past week and the ability to finally step through the code without jumping between threads has been extremely useful. Thank you! Hopefully you will get a chance to finish the changes and get them merged.

Are there any plans to support TRAX for Zephyr? I do often see cases where the system stops in arch_system_halt() or in _DoubleExceptionVector and there is no backtrace available. Being able to enable TRAX for instruction tracing and then dumping the results would be a tremendous help.

For example, simply executing the code __ASSERT_NO_MSG(false); results in the following backtrace:

(gdb) bt
#0  arch_system_halt (reason=4) at /work/zephyrproject/zephyr/kernel/fatal.c:32
#1  0x4037c9ac in k_sys_fatal_error_handler (reason=4, esf=0x3fcad998 <shell_uart_stack+6584>)
    at /work/zephyrproject/zephyr/kernel/fatal.c:46
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

In the case of an __ASSERT(), I can dump the stacks of the threads and analyse them to determine the stack trace root cause, but that is a slow process and not always successful. In other more severe cases of a _DoubleExceptionVector the final details of the failure are often not recoverable and it is impossible to root-cause the issue. TRAX instruction handling could take care of both cases.

gerekon commented 7 months ago

> Are there any plans to support TRAX for Zephyr? I do often see cases where the system stops in arch_system_halt() or in _DoubleExceptionVector and there is no backtrace available. Being able to enable TRAX for instruction tracing and then dumping the results would be a tremendous help.

TRAX is supported by OpenOCD via the xtensa tracestart|tracestop|tracedump commands. In IDF you can enable it via menuconfig. In Zephyr, I suppose you have to add some code to enable it, as is done in IDF: https://github.com/espressif/esp-idf/blob/b3f7e2c8a4d354df8ef8558ea7caddc07283a57b/components/esp_system/port/cpu_start.c#L668-L675

OpenOCD uses TRAX registers via JTAG, controls tracing and dumps data collected in trace memory.
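
A trace capture could be scripted through OpenOCD's Tcl RPC server, which by default listens on port 6666 and terminates each command and reply with the byte 0x1a. The sketch below assumes the command names mentioned above; passing an output file name to tracedump and the default host and port are assumptions, and capture_trace has not been run against real hardware.

```python
import socket

TCL_RPC_TERMINATOR = b"\x1a"


def frame_command(command: str) -> bytes:
    """Encode one command for the OpenOCD Tcl RPC port."""
    return command.encode("ascii") + TCL_RPC_TERMINATOR


def send_command(sock: socket.socket, command: str) -> str:
    """Send one command and return the reply without its terminator."""
    sock.sendall(frame_command(command))
    reply = bytearray()
    while not reply.endswith(TCL_RPC_TERMINATOR):
        chunk = sock.recv(4096)
        if not chunk:  # connection closed before a full reply arrived
            break
        reply.extend(chunk)
    return bytes(reply).rstrip(TCL_RPC_TERMINATOR).decode("ascii", errors="replace")


def capture_trace(host: str = "localhost", port: int = 6666,
                  outfile: str = "trace.bin") -> None:
    """Start tracing, wait for the user, then stop and dump the trace memory."""
    with socket.create_connection((host, port)) as sock:
        print(send_command(sock, "xtensa tracestart"))
        input("Reproduce the failure, then press Enter to stop tracing...")
        print(send_command(sock, "xtensa tracestop"))
        # Assumption: tracedump accepts the output file name as its argument.
        print(send_command(sock, f"xtensa tracedump {outfile}"))
```

The same commands can be typed directly into the OpenOCD telnet console or issued from GDB with the monitor prefix; the script form is mainly useful for repeated captures.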

gerekon commented 7 months ago

GDB does not support TRAX traces, so you have to parse them manually or use the script from IDF: https://github.com/espressif/esp-idf/blob/b3f7e2c8a4d354df8ef8558ea7caddc07283a57b/components/xtensa/trax/traceparse.py. It can be used in GDB to dump the execution path based on the collected data.

EricNRS commented 7 months ago

@gerekon Thanks Alexey for the trace script -- I hadn't run across that before. The IDF calls will need to get linked into Zephyr which is sometimes easy and sometimes a nightmare. Might be easier once the IDF 5.1 port is done, though.

EricNRS commented 3 months ago

Hi Erhan (@erhankur) - have you done any more work on this recently? The changes work reasonably well and are infinitely better than no support, so it would be good to get them merged.

erhankur commented 3 months ago

Hi @EricNRS, I'm still a bit busy with other tasks, so there has been no more work so far. But OK, we can consider adding some sort of support as is for the next release.