Marus / cortex-debug

Visual Studio Code extension for enhancing debug capabilities for Cortex-M Microcontrollers
MIT License

Debugger exits if breakpoint is hit in FreeRTOS task #329

Closed sidprice closed 4 years ago

sidprice commented 4 years ago

I am debugging an Ambiq Apollo3 target (Cortex-M) that uses FreeRTOS; the debug interface is a JLink. If I run the code without breakpoints, it appears to work. However, if I set a breakpoint in a FreeRTOS task, the debugger exits when that breakpoint is hit. The following is displayed in the "Debug Output" window:

Breakpoint 2, LedTask (pvParameters=<optimized out>) at ./src/led_task.c:273
273         if (bitSet != 0)
/tmp/jenkins/jenkins-GCC-7-build_toolchain_docker-775_20180622_1529687456/src/gdb/gdb/inline-frame.c:167: internal-error: void inline_frame_this_id(frame_info*, void**, frame_id*): Assertion `frame_id_p (*this_id)' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? 
(y or n) [answered Y; input not from terminal]

The "Output" window shows:

Setting breakpoint @ address 0x0000C55A, Size = 2, BPHandle = 0x0003
Starting target CPU...
...Breakpoint reached @ address 0x0000C55A
Reading all registers
Removing breakpoint @ address 0x0000C55A, Size = 2
Read 4 bytes @ address 0x0000C55A (Data = 0x28004604)
Reading 64 bytes @ address 0x10003700
Read 4 bytes @ address 0x0000C970 (Data = 0x681B4B0A)
Reading register (MSP = 0x100022E0)
Reading register (PSP = 0x10003700)
Reading register (PRIMASK = 0x       0)
Reading register (BASEPRI = 0x       0)
Reading register (FAULTMASK = 0x       0)
Reading register (CONTROL = 0x       2)
Reading register (FPSCR = 0x 2000000)
Reading register (s0 = 0x       0)
Reading register (s1 = 0x       0)
Reading register (s2 = 0x       0)
Reading register (s3 = 0x       0)
Reading register (s4 = 0x       0)
Reading register (s5 = 0x       0)
Reading register (s6 = 0x       0)
Reading register (s7 = 0x       0)
Reading register (s8 = 0x       0)
Reading register (s9 = 0x       0)
Reading register (s10 = 0x       0)
Reading register (s11 = 0x       0)
Reading register (s12 = 0x       0)
Reading register (s13 = 0x       0)
Reading register (s14 = 0x       0)
Reading register (s15 = 0x       0)
Reading register (s16 = 0x       0)
Reading register (s17 = 0x       0)
Reading register (s18 = 0x       0)
Reading register (s19 = 0x       0)
Reading register (s20 = 0x       0)
Reading register (s21 = 0x       0)
Reading register (s22 = 0x       0)
Reading register (s23 = 0x       0)
Reading register (s24 = 0x       0)
Reading register (s25 = 0x       0)
Reading register (s26 = 0x       0)
Reading register (s27 = 0x       0)
Reading register (s28 = 0x       0)
Reading register (s29 = 0x       0)
Reading register (s30 = 0x       0)
Reading register (s31 = 0x       0)
Reading register (d0 = 0x       0)
Reading register (d1 = 0x       0)
Reading register (d2 = 0x       0)
Reading register (d3 = 0x       0)
Reading register (d4 = 0x       0)
Reading register (d5 = 0x       0)
Reading register (d6 = 0x       0)
Reading register (d7 = 0x       0)
Reading register (d8 = 0x       0)
Reading register (d9 = 0x       0)
Reading register (d10 = 0x       0)
Reading register (d11 = 0x       0)
Reading register (d12 = 0x       0)
Reading register (d13 = 0x       0)
Reading register (d14 = 0x       0)
Reading register (d15 = 0x       0)
Read 4 bytes @ address 0x0000C970 (Data = 0x681B4B0A)
GDB closed TCP/IP connection (Socket 856)

Could you help me understand what is happening and, perhaps, how I can fix the problem?

sidprice commented 4 years ago

FYI: I updated to the latest ARM toolchain but still have the same issue:

GNU gdb (GNU Arm Embedded Toolchain 9-2020-q2-update) 8.3.1.20191211-git Copyright (C) 2019 Free Software Foundation, Inc.

haneefdm commented 4 years ago

@sidprice Please contact the vendor. It could be an issue with SEGGER or even gdb. You may want to duplicate the problem with gdb/jlink command-line interface first.

I have seen such problems on the internet and it could even be a bad/mismatched elf file. Just google for "A problem internal to GDB has been detected"

If gdb is crashing, there is not much we can do.

sidprice commented 4 years ago

@haneefdm Why did you close this without allowing me to comment?

I HAVE tested using GDB with the JLink server outside of VSCode and Cortex-Debug and it works just fine!

haneefdm commented 4 years ago

Sorry, you can still comment and we can still respond and reopen. If gdb is crashing, generally not much we can do. It appears to be crashing because it cannot decode the stack frame. We are not involved in that.

I would like to see your launch.json, Debug Console, and Adapter Output logs with `"showDevDebugOutput": true` in your launch.json. Please also include the commands from your manual GDB session. Are we doing the same thing?

With the info you already provided I am not sure how I can help.

sidprice commented 4 years ago

Thanks for the pointers on how to gather more information.

It appears that, even with GDB directly, the command "-stack-info-depth --thread 1 10000" is the one that crashes GDB. I have raised a ticket on the Launchpad site.

Is there anything I can do to work around this situation?
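
For anyone trying to reproduce this outside VSCode, the MI command above and the reply GDB normally gives to it can be sketched in Python roughly as follows. This is only an illustration, not Cortex-Debug's actual code: the function names are made up, and the reply format assumed here is the documented `^done,depth="N"` result record, optionally preceded by a numeric token.

```python
import re
from typing import Optional


def stack_info_depth_command(thread_id: int, max_depth: Optional[int] = None) -> str:
    """Build the GDB/MI command that asks for the stack depth of one thread.

    Hypothetical helper; the name is not taken from Cortex-Debug.
    """
    command = f"-stack-info-depth --thread {thread_id}"
    if max_depth is not None:
        # The optional argument caps how many frames GDB will count.
        command += f" {max_depth}"
    return command


# A result record may be prefixed by the numeric token of the request.
_DEPTH_RE = re.compile(r'^\d*\^done,depth="(\d+)"')


def parse_stack_depth(reply: str) -> Optional[int]:
    """Return the depth from a '^done,depth="N"' record, or None otherwise."""
    match = _DEPTH_RE.match(reply.strip())
    return int(match.group(1)) if match else None
```

Calling `stack_info_depth_command(1, 10000)` yields the exact string quoted above. In the failing case GDB never sends a `^done` record at all, so a parser like this would have nothing to read; the assertion fires first.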

haneefdm commented 4 years ago

Let me check... It appears that the stack frames in FreeRTOS are corrupted. Generally, to me, that means that there is a bug in JLink (or whatever gdb-server you are using). Probably not in gdb itself.

Can you try with a smaller depth? Can you try OpenOCD or pyocd temporarily instead of JLink?

sidprice commented 4 years ago

I tried a smaller depth, same result.

I don't use OpenOCD or pyocd unfortunately. I will try updating JLink GDBserver.

sidprice commented 4 years ago

Sorry to bother you about this. I don't expect a response to my GDB ticket anytime soon, so I need to figure out a way to keep my project moving forward.

If you could point me to a reference for getting OpenOCD up, assuming it would work with the MCU I have (Ambiq Apollo3), I need to give it a try. I assume it bypasses the JLink GDBServer and uses the JLink probe directly?

Again, thanks for your time and help, it is much appreciated.

haneefdm commented 4 years ago

I would actually contact SEGGER first. The issue may well lie there.

sidprice commented 4 years ago

@haneefdm Thanks for the suggestion, so far no response from Segger.

I have downloaded and installed OpenOCD but I cannot get it to connect to my JLink. I tried changing the driver to WinUSB, but that did not help.

Are you familiar with using JLink with OpenOCD?

Here is the response I get with the Segger driver installed:

PS C:\utils\xpack-openocd-0.10.0-14\bin> .\openocd.exe -f interface\jlink.cfg
xPack OpenOCD, x86_64 Open On-Chip Debugger 0.10.0+dev-00378-ge5be992df (2020-06-26-09:29)
Licensed under GNU GPL v2
For bug reports, read http://openocd.org/doc/doxygen/bugs.html
Info : Listening on port 6666 for tcl connections
Info : Listening on port 4444 for telnet connections
Warn : Failed to open device: LIBUSB_ERROR_NOT_SUPPORTED.
Error: No J-Link device found.

With WinUSB installed it simply says no JLink found.

haneefdm commented 4 years ago

Sorry, @sidprice I have no answers. One last thing I can recommend is to contact your chip vendor. If you cannot use gdb from the command line, the buck stops with them.

sidprice commented 4 years ago

@haneefdm Thanks for the moral support, it is much appreciated. I am able to use GDB directly because I can avoid doing the stack info commands that Cortex-Debug always issues. It would be outstanding if I could optionally disable the part of Cortex-Debug that sends those commands. Using GDB directly is a little bit of a pain.

haneefdm commented 4 years ago

@sidprice We can't disable the stack trace, because every debugger window depends on the stack frames and none of them would update without it. The debugger would become useless and the whole VSCode debugging infrastructure would fail.

If you are using gdb manually, for instance, and you hit a breakpoint or exception, then what you can do is run a backtrace command. That is pretty much what we do on the current thread, except that we use the machine interface.

https://sourceware.org/gdb/onlinedocs/gdb/Backtrace.html

sidprice commented 4 years ago

@haneefdm I tried the backtrace and got the same error Cortex-Debug gets:

(gdb) backtrace
#0  usfsmax_task (pvParameters=<optimized out>) at ./src/usfsmax_task.c:354
#1  0x0000d4a4 in xEventGroupSetBits (
/mnt/workspace/workspace/GCC-9-pipeline/jenkins-GCC-9-pipeline-200_20200521_1590053374/src/gdb/gdb/inline-frame.c:156: internal-error: void inline_frame_this_id(frame_info*, void**, frame_id*): Assertion `frame_id_p (*this_id)' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)

I am trying to get OpenOCD working, but so far have not been able to find a target configuration for the Apollo3. Ambiq technical support is a complete waste of time.

haneefdm commented 4 years ago

I feel bad for you. You have to push your chip vendor on this. Even with OpenOCD, the chip vendor has to provide proper config files. It is not easy to create your own; I would say near impossible.

So, what does Ambiq recommend for a dev. env.? I could not even get to their website. It was down.

sidprice commented 4 years ago

Ambiq recommends Keil, IAR as dev. environments. Their examples also have makefiles for GCC.

I think the problem here may be GDB, because I have read about other people having issues with FreeRTOS and GDB. Unfortunately, I have not found any solution other than not using the stack information commands or backtrace.

Provided I don't use those commands when using GDB directly I can reliably debug my project. However, as I am sure you can imagine, it is not as nice an experience as using Cortex-Debug in VSCode. I find myself switching between GDB and VSCode to browse my code and decide on my next debugging move. I do not find the lack of the backtrace too limiting, at least so far in the project.

Pressing Ambiq is and has been a lost cause. My last interaction with a "support" engineer ended with him saying something like, "we are a small company so producing documentation for the SDK API is something we are not able to do. You should read the source code."

Thanks.

gbingersoll commented 2 years ago

I know this is a couple of years old at this point, but I just ran into a similar problem. Not with this VS Code plugin, but with GDB, Segger, and FreeRTOS. I found that the GDB MI command -stack-info-depth --thread 5 (the 5 is irrelevant) would completely hang GDB, which never returned a response. However, this only seems to happen if I compile FreeRTOS with -O0 optimization. If I compile FreeRTOS with, e.g., -Og, and the rest of my code with -O0, it works fine. I suspect a bug in Segger's FreeRTOS plugin. At least changing the FreeRTOS compile optimization is a possible workaround for others Googling this problem.

Incidentally, I also happen to be using Ambiq Apollo3, but I would be surprised if it is specific to that chip.

haneefdm commented 2 years ago

About compiling... -Og is a good one. Whichever level you use (-O0, -O1, -O2, or -Og), don't forget the -g. The -g is super important because, without it, there is no line or variable information and the code is very difficult to debug. None of the -O options, -Og included, implies -g, so you have to add it yourself.

From the manual below

-Og Optimize debugging experience. -Og should be the optimization level of choice for the standard edit-compile-debug cycle, offering a reasonable level of optimization while maintaining fast compilation and a good debugging experience. It is a better choice than -O0 for producing debuggable code because some compiler passes that collect debug information are disabled at -O0.

Like -O0, -Og completely disables a number of optimization passes so that individual options controlling them have no effect. Otherwise -Og enables all -O1 optimization flags except for those that may interfere with debugging:

-fbranch-count-reg -fdelayed-branch -fdse -fif-conversion -fif-conversion2
-finline-functions-called-once -fmove-loop-invariants -fssa-phiopt -ftree-bit-ccp -ftree-dse -ftree-pta -ftree-sra

gbingersoll commented 2 years ago

You're right that if you use -O then you also have to add -g to get debug info. My point is just that even with -Og, the compiler often over-optimizes, so you cannot see variable values when you step through code. If you keep seeing "optimized out" with -Og, you can go to -O0 to fix this. However, compiling FreeRTOS with -O0 seems to cause problems with getting stack info, which I suspect come from the Segger FreeRTOS plugin. So I compile my application code with -g -O0 to prevent "optimized out", and compile FreeRTOS with -g -Og so the debugger works as expected.
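
If your build is driven by a Python script (SCons or similar), the per-directory split described above might look something like this. It is a minimal sketch under assumptions: the `FreeRTOS` directory name and the `optimization_flags` helper are hypothetical, and the flags are the ones discussed in this thread.

```python
from pathlib import PurePosixPath
from typing import List

# Hypothetical directory name; change it to wherever your FreeRTOS
# kernel sources actually live in the project tree.
FREERTOS_DIR = "FreeRTOS"


def optimization_flags(source_path: str) -> List[str]:
    """Pick GCC debug/optimization flags for one source file.

    FreeRTOS sources get -Og (the workaround from this thread);
    everything else gets -O0 so locals are not "optimized out".
    -g is always added because no -O level turns it on by itself.
    """
    parts = PurePosixPath(source_path).parts
    if FREERTOS_DIR in parts:
        return ["-g", "-Og"]
    return ["-g", "-O0"]
```

With a plain makefile the same idea is a target-specific or pattern-specific CFLAGS override for the FreeRTOS objects; the Python form is used here only to keep the example self-contained.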