Samsung / netcoredbg

NetCoreDbg is a managed code debugger with MI interface for CoreCLR.
MIT License
800 stars 103 forks source link

Breakpoint maps to *nested* function instead of *outer* call #124

Closed noam-sol closed 1 year ago

noam-sol commented 1 year ago

Steps to reproduce

async Task<int> Method()
{
    ParallelOptions parallelOptions = new() { MaxDegreeOfParallelism = 3 };
    var userHandlers = new []{"a", "b","c","d"};

    await Parallel.ForEachAsync(userHandlers, parallelOptions, async (uri, token) =>
    {
        await new HttpClient().GetAsync("https://google.com");
    });

    return 0;
}

int result = await Method();

Add a breakpoint on line 6:await Parallel.ForEachAsync(userHandlers, parallelOptions, async (uri, token) =>

Mitigation

I have found that this patch resolves this, but it drops support from what the comment says here-

diff --git a/src/metadata/modules_sources.cpp b/src/metadata/modules_sources.cpp
index ef4ba4e..a735ceb 100644
--- a/src/metadata/modules_sources.cpp
+++ b/src/metadata/modules_sources.cpp
@@ -177,21 +177,20 @@ namespace
                 break;
             }
             // result was found on previous cycle, check for closest nested method
             // need it in case of breakpoint setuped at lines without code and before nested method, for example:
             // {
             //  <-- breakpoint at line without code (inside method)
             //     void Method() {...}
             // }
             else if (result && lineNum <= (*lower).startLine && (*lower).endLine <= result->endLine)
             {
-                closestNestedToken = (*lower).methodDef;
                 break;
             }
             else
                 break;
         }

         if (result)
         {
             auto find = multiMethodBpData.find(*result);
             if (find != multiMethodBpData.end()) // only constructors segments could be part of multiple methods

Proper solution?

Maybe on managed part? on SymbolReader.cs there's this condition - https://github.com/Samsung/netcoredbg/blob/db69338cf1606d8d327de0eaaf1694d8463af135/src/managed/SymbolReader.cs#L795

I added some prints -

if (nestedToken != 0)
{
    SequencePoint nested_p = FirstSequencePointForSourceLine(ref reader, nestedToken);
    StreamWriter sw = new StreamWriter("/tmp/managed.log");
    sw.WriteLine("current_p - " + current_p.StartLine + ":" + current_p.StartColumn + " until " + current_p.EndLine + ":" + current_p.EndColumn);
    sw.WriteLine("nested_p - " + nested_p.StartLine + ":" + nested_p.StartColumn + " until " + nested_p.EndLine + ":" + nested_p.EndColumn);
    sw.Close();
    if (current_p.EndLine > nested_p.EndLine || (current_p.EndLine == nested_p.EndLine && current_p.EndColumn > nested_p.EndColumn))
    {
        list.Add(new resolved_bp_t(nested_p.StartLine, nested_p.EndLine, nested_p.Offset, nestedToken));
        // (tokenNum > 1) can have only lines, that added to multiple constructors, in this case - we will have same for all Tokens,
        // we need unique tokens only for breakpoints, prevent adding nestedToken multiple times.
        break;
    }
}
nestedToken = 0; // Don't check nested block next cycle (will have same results).

list.Add(new resolved_bp_t(current_p.StartLine, current_p.EndLine, current_p.Offset, methodToken));

and they are -

current_p - 9:5 until 12:8
nested_p - 10:5 until 10:6

I'm not sure if the condition needs to be changed, or it's given some bad input. That's because the nested function is not really between 10:5 to 10:6, and actaully contains multiple lines. So I'm not sure how to really solve this properly.

viewizard commented 1 year ago

I see the point, async (uri, token) => ... is part of argument and should be parsed as part of method call, but not stand alone method body inside another method. This should be carefully investigated first (in order to detect argument-related part), since we have a lot of breakpoint resolve related cases that must care about.

I have found that this patch resolves this, but it drops support from what the comment says here-

This is wrong "fix" for sure, since it broke another case.

Proper solution?

SymbolReader.cs operate with debug info, that could provide "source code line" to "IL offset/method token" mapping. Debug info don't provide any info about code execution "hierarchy". The real fix should use metadata, that is part of debuggee process IMO (debugger should find out, is this method part of argument or not). We don't have such logic now, this should be investigated and implemented.

noam-sol commented 1 year ago

(debugger should find out, is this method part of argument or not)

it sounds a bit heuristic, isn't it? I can try implementing this if you guys think it's the right approach, but I'd like to suggest some other idea-

To check whether both functinos are on the same line, and in that case break on the outer one. does that sound ok? I'm thinking whether it should already work in this clause - https://github.com/Samsung/netcoredbg/blob/294177eb3b7c0822c5b3db25bdae2f54b2c3e03b/src/metadata/modules_sources.cpp#L156 However, I'm getting some bad (?) debug info - nested_p.StartLine is one line below current_p.StartLine although they are on the same source line.

viewizard commented 1 year ago

it sounds a bit heuristic, isn't it? I can try implementing this if you guys think it's the right approach

Unfortunately, I don't really sure in what way we should do this, not even sure my suggestion is correct.

To check whether both functinos are on the same line, and in that case break on the outer one.

I think, you can't implement proper fix from this point, since you don't have all info you need. Probably, we may fix this case by implement all parent method sequence points check, I believe this should looks like this:

void f()
{    <--- parent.StartLine (parent sequence point1 start, parent sequence point1 end)
    await Parallel.ForEachAsync( async (uri, token) =>     <--- (parent sequence point2 start)
    {                                                      <--- nested.StartLine
        await new HttpClient().GetAsync("https://google.com");
    }                                                      <--- nested.endLine  
    );                                                     <--- (parent sequence point2 end)
}    <--- parent.endLine  (parent sequence point2 start, parent sequence point2 end)
void f()
{    <--- parent.StartLine (parent sequence point1 start, parent sequence point1 end)
    void ForEach( async (uri, token)
    {                                      <--- nested.StartLine
        return;
    };                                     <--- nested.endLine  
}    <--- parent.endLine  (parent sequence point3 start, parent sequence point3 end)

breakpoint resolve should care about all methods data in case line have code, but, should skip nested method that is part of some parent method sequence point in case breakpoint set at line without code and should be "corrected" to first line with code. But at this moment debugger's breakpoint-related structures don't store info we need for this check and managed part don't have related code.

Unfortunately, I can't tell you idea you could implement, this should be investigated carefully first, but I don't have time now.

noam-sol commented 1 year ago

and managed part don't have related code.

@viewizard I think it does? https://github.com/Samsung/netcoredbg/blob/8d9744a6bdee044fd2cc73dd142bf6eba1ad34a1/src/managed/SymbolReader.cs#L795

and on that clause the nested breakpoint is actually added, maybe I can check for sequence point parent-child relations around here?

viewizard commented 1 year ago

I think it does?

I believe, implementation should start from https://github.com/Samsung/netcoredbg/blob/8d9744a6bdee044fd2cc73dd142bf6eba1ad34a1/src/metadata/modules_sources.cpp#L709, since in case breakpoint set to line without code, at fast look, logic should not provide as closestNestedToken method tokens that is part of "arguments".

gbalykov commented 1 year ago

This should be fixed in latest release. Feel free to reopen if you see any more related issues.