Samsung / netcoredbg

NetCoreDbg is a managed code debugger with MI interface for CoreCLR.
MIT License
825 stars 104 forks source link

stackTrace regression from 1.2.0-782 to 1.2.0-825 #65

Closed sebaFlame closed 3 years ago

sebaFlame commented 3 years ago

I'm running Vimspector in Vim. Updated it to run 1.2.0-825. Debugging tests running through attach (VSTEST_HOST_DEBUG=1 dotnet test) on .NET 5. Everything runs pretty much the same until a breakpoint is hit.

Log for 1.2.0-782 - netcoredbg.engine_782.log Log for 1.2.0-825 - netcoredbg.engine_825.log

sebaFlame commented 3 years ago

Related Vimspector issue: puremourning/vimspector#448

sebaFlame commented 3 years ago

Running in WSL (debian 11).

viewizard commented 3 years ago
-> (C) {"command": "stackTrace", "arguments": {"threadId": 549}, "seq": 30, "type": "request"}
<- (R) {"seq":247,"command":"stackTrace","message":"Failed command 'stackTrace' : 0x80131c49","request_seq":30,"success":false,"type":"response"}

0x80131c49 is A call into a ReadVirtual implementation returned failure., probably during query interface. This could be dbgshim related issue, could you try use dbgshim.dll from 782 for test?

Could you please provide debugger's log for 825? See --log and --engineLogging[=<path to log file>] options.

sebaFlame commented 3 years ago

I copied libdbgshim.so from 782. Called the debugger with: "--interpreter=vscode --engineLogging=${workspaceRoot}/netcoredbg.engine.log --log". It did not throw an error this time! Log from 825 - netcoredbg.engine.log

viewizard commented 3 years ago

Hmm... we copy dbgshim.dll from latest MS SDK during packaging.

viewizard commented 3 years ago

Wait, we don't have linux package for 825, where did you get libdbgshim.so? Ups, we have it for amd64 only =)

viewizard commented 3 years ago

I got the idea, that was manual build, and looks like I publish with dbgshim from 3.1 (have it by default)... not with dbgshim from latest SDK.

sebaFlame commented 3 years ago

libdbgshim 825 from netcoredbg-linux-amd64.tar.gz 782 from netcoredbg-linux-bionic-amd64.tar.gz

viewizard commented 3 years ago

Yes, netcoredbg-linux-bionic-amd64.tar.gz was built by CI with latest dbgshim (from 5.0 SDK). Unfortunately, CI are broken and I was forced to built manually.

viewizard commented 3 years ago

@sebaFlame I upload to 825 Release netcoredbg-linux-amd64_fixed.tar.gz file with proper libdbgshim, could you please check?

sebaFlame commented 3 years ago

It returns the same "stackFrames" as with the libdbgshim from 782. It sees everything as [External Code] and only returns these after about 1 minute after the stackTrace command has been sent from Vimspector resulting in a timeout in Vimspector.

sebaFlame commented 3 years ago

You might've missed the edits I made in https://github.com/Samsung/netcoredbg/issues/65#issuecomment-915169428 . It was already late :/ .

viewizard commented 3 years ago

Now netcordbg return [External Code] in case frame don't belong to user code (code, that have debug info) since we can't provide proper trace for some cases, same behavior as MS vsdbg have. In your log I see it found first frame -

{"column":17,"endColumn":60,"endLine":49,"id":0,"line":49,"moduleId":"8773a770-f447-4d90-ac75-f3a02742dd1d","name":"CNFDotnet.Tests.LL1LR0Tests.LL1_And_LR0_Valid_Grammar()","source":{"name":"LL1LR0Tests.cs","path":"/home/cba/development/CNFDotnet/test/CNFDotnet.Tests/LL1LR0Tests.cs"}}

and 99 external frames ("totalFrames":100)... same 100 frames as you have in 782 log. Note, netcoredbg.engine.log looks like console output, not internal log... could you please provide internal log (use --log=filename or debugger create .log file with PID name)? File text should looks like:

...
6893984.915 E/NETCOREDBG(P   7, T   8): ioredirect.cpp: worker(307) > child process stdout/stderr reading error
6893984.915 I/NETCOREDBG(P   7, T   8): ioredirect.cpp: operator(135) > IORedirectHelper::worker: terminated
6893984.929 W/NETCOREDBG(P   7, T  16): waitpid.cpp: waitpid(84) > Process terminated without exiting; can't get exit code. Killed by signal 9. Assuming EXIT_FAILURE.
...
sebaFlame commented 3 years ago

I couldn't figure out how, but eventually found the log file in /tmp with PID name. Called netcoredbg with "--interpreter=vscode --engineLogging=${workspaceRoot}/netcoredbg.engine.log --log"

782.233 I/NETCOREDBG(P 483, T 483): main.cpp: main(380) > Netcoredbg started
782.233 I/NETCOREDBG(P 483, T 483): main.cpp: instantiate_protocol(110) > Creating protocol VSCodeProtocol
782.235 I/NETCOREDBG(P 483, T 483): main.cpp: main(417) > pidDebugee 0
782.235 I/NETCOREDBG(P 483, T 484): ioredirect.cpp: worker(164) > worker started
784.980 E/NETCOREDBG(P 483, T 485): breakpoints_line.cpp: ResolveLineBreakpoint(200) > m_sharedModules->ResolveBreakpoint(modAddress, bp_fullname, fullname_index, bp.linenum, resolvedPoints) : 0x80004005
785.074 E/NETCOREDBG(P 483, T 485): breakpoints_line.cpp: ResolveLineBreakpoint(200) > m_sharedModules->ResolveBreakpoint(modAddress, bp_fullname, fullname_index, bp.linenum, resolvedPoints) : 0x80004005
sebaFlame commented 3 years ago

I guess this is another CI issue?

viewizard commented 3 years ago

I guess this is another CI issue?

I don't think this is CI related. Unfortunately, I can't say what this issue related... log don't have any lines, that could explain anything. In the same time, protocol interaction log you provided also don't explain anything (all looks good - we have command and after that replay...). Could you provide simple example project, that I could use for reproduce this issue with debugger directly with CLI protocol?

sebaFlame commented 3 years ago

I just tried a simple HelloWorld application using the 825_fixed.

Environment (in WSL 1)

$ dotnet --info
.NET SDK (reflecting any global.json):
 Version:   5.0.401
 Commit:    4bef5f3dbf

Runtime Environment:
 OS Name:     debian
 OS Version:  11
 OS Platform: Linux
 RID:         debian.11-x64
 Base Path:   /usr/share/dotnet/sdk/5.0.401/

Host (useful for support):
  Version: 5.0.10
  Commit:  e1825b4928

.NET SDKs installed:
  5.0.401 [/usr/share/dotnet/sdk]

.NET runtimes installed:
  Microsoft.AspNetCore.App 5.0.10 [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 2.2.8 [/usr/share/dotnet/shared/Microsoft.NETCore.App]
  Microsoft.NETCore.App 5.0.10 [/usr/share/dotnet/shared/Microsoft.NETCore.App]

To install additional .NET runtimes or SDKs:
  https://aka.ms/dotnet-download

Program.cs

using System;

namespace HelloWorld
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Hello World!");

            throw new InvalidOperationException();

            Console.ReadKey();
        }
    }
}

Debugger launch options: "--interpreter=vscode --engineLogging=${workspaceRoot}/netcoredbg.engine.log --log" With a breakpoint on the WriteLine.

Using launch It hits the breakpoint AND stops at the exception. Exceptions filters still don't get sent, but that's a Vimspector issue (puremourning/vimspector#448).

Using attach Same result as with launch.

The issue seems to lie within the dotnet test tool? I don't thinks the project matters, it's a library with a test project.

Project directory structure . |-src |-|-Library |-test |-|-Tests

sebaFlame commented 3 years ago

Attached a simple project exhibiting the same behaviour in my current environment (https://github.com/Samsung/netcoredbg/issues/65#issuecomment-923878953). I also attached the log files from attaching to it using "VSTEST_HOST_DEBUG=1 dotnet test".

TestDotnet.zip netcoredbg.engine.log netcoredbg.2381.log

viewizard commented 3 years ago

Hmm... this looks like you attach not to simple test debuggee process, but since debuggee process is xunit test itself, you attach to xunit that execute some of your code. Probably this is why we see 100 external frames. Not sure this is good idea (start code with xunit) for test debugger.

sebaFlame commented 3 years ago

What would you recommend to debug tests? Or is there a way to increase the max amount of stack frames?

viewizard commented 3 years ago

We need investigate this first, looks like debugger trace all stack first but show only 100 first frames. So, in case xunit and some recursive code, debugger could need a lot of time for stack trace... probably, not max amount should be increased (I think even 100 frames is huge amount), but debugger stack trace logic should be changed in order to prevent useless work (don't trace 101+ frames, for example).

For debugger tests we use 2 programs - "control" and "debuggee" (see test-suite folder with test framework and test sources), that wrote in the same sources (debuggee program source have control program code blocks inside). So, we "emulate" IDE work in control program... it start debugger, send commands and check responds.

viewizard commented 3 years ago

Hmm... I just checked debugger code and... we don't have any limits for stack trace. In case stackTrace command don't provide levels argument, all frames will be added to respond. Plus, last debugger version don't look at metadata in case code don't have JMC marker (it will mark such frame as [External Code] now), that mean, stackTrace must work even faster now...

viewizard commented 3 years ago

Fixed in upstream (stackTrace related changes reverted).