eclipse-cdt-cloud / cdt-gdb-adapter

CDT GDB Debug Adapter
Eclipse Public License 2.0
27 stars 40 forks source link

MIParser hangs on nested objects #311

Closed jonahgraham closed 8 months ago

jonahgraham commented 8 months ago

This issue came out of #308

The MI parser hangs when some types of nested objects are encountered. e.g. breakpoint messages with script such as this simplified example:

=breakpoint-modified,bkpt={number="3",type="breakpoint",script={"commands here"}}
jonahgraham commented 8 months ago

What is missing is that an MI Tuple can have just a MI Value, in CDT it is handled like this:

            // Try for the DsfMIValue first
            MIValue value = processMIValue(buffer);
            if (value != null) {

While in cdt-gdb-adapter it is assumed that MI Tuple (called Object in this code) assumes that it is only key value pairs:

https://github.com/eclipse-cdt-cloud/cdt-gdb-adapter/blob/edb0d46effb1297e916458fa7f7cfc91dc8ec8e1/src/MIParser.ts#L164-L173

XingMicrochip commented 8 months ago

Thanks @jonahgraham for explaining the issue. I can look at this issue.

jonahgraham commented 8 months ago

I've been writing some tests for the miparser which should make it easier to develop. See #312, but I have a bigger change pending which I expect I'll push today.

XingMicrochip commented 8 months ago

Are you going to push a change that solves this MIParser bug, or is that change to add test cases for this issue? - i.e. Should I work on fixing this bug?

jonahgraham commented 8 months ago

Just the tests for now.

XingMicrochip commented 8 months ago

I see where the issue is but I do not fully understand now why it happens in handleObject(). Say the example looks like below, can you walk through the code to help me understand which part hangs, to help me develop a fix?

{number="1",type="breakpoint",disp="keep",enabled="y",addr="0x8000150c",func="main",file="C:/Microchip/...",fullname="C:\\Microchip\\...",line="2029",thread-groups=["i1"],times="0",script={"p 123","p 456","p 789"},original-location="main"}
jonahgraham commented 8 months ago

I refer you back to the above comment (https://github.com/eclipse-cdt-cloud/cdt-gdb-adapter/issues/311#issuecomment-1783547806) - the problem is handleObject assumes that the first thing inside the { is a key, but it can directly be a value. The Eclipse CDT case (java code quoted in the above comment) shows how it is handled there.

jonahgraham commented 8 months ago

PS if you step through the code you should see that const name = this.handleString(); in handleObject consumes too much stuff and then (much) later on the code hangs because name consumed a closing } and the outer handleObject keeps trying to read until it matches the closing brace.

So as I see it there are two problems. One is the mis-parsing of this specific case. The other is that AFAICT the code will too easily hang on misformed MI.

XingMicrochip commented 8 months ago

I see. I suppose you have an easy setup you used to debug. How to set it up so I can step through the code? I have not debugged the application itself before.

jonahgraham commented 8 months ago

The easiest way is to run the new test I added in #312. e.g. create a new test like this:


    it('bug311', async function () {
        parser.parseLine('+message,object=***Put your structure here***');
        sinon.assert.calledOnceWithExactly(
            gdbBackendMock.emit as sinon.SinonStub,
            'statusAsync',
            'message',
            ** put your parsed object here **
        );
    });

Then you can run in VSCode (or Theia) according to the steps in "Debugging tips"

XingMicrochip commented 8 months ago

I am debugging inside miparser.spec.ts. Can you provide an example of an object that can be handled correctly by handleObject()? I just want to understand better how this function works.

jonahgraham commented 8 months ago

Can you provide an example of an object that can be handled correctly by handleObject()?

All the other examples in the newly added test, such as:

    it('simple status-async-output', async function () {
        parser.parseLine('+message,object={value="1234"}');
        sinon.assert.calledOnceWithExactly(
            gdbBackendMock.emit as sinon.SinonStub,
            'statusAsync',
            'message',
            {
                object: {
                    value: '1234',
                },
            }
        );
    });
XingMicrochip commented 8 months ago

It seems that the parser parses nested objects fine. If I supply for example bkpt={value1={nested_value1="123",nested_value2="789"},value2="hello"}, the parsed object is

{
  bkpt: {
    value1: { nested_value1: '123', nested_value2: '789' },
    value2: 'hello'
  }
}

which is correct. What confuses me is in example bkpt={number="1",type="breakpoint",thread-groups=["i1"],script={"p 123","p 456","p 789"}}, the value of script - {"p 123","p 456","p 789"} - does not look like a JavaScript/TypeScript object. Objects need to be key/value pairs, but in the script case it is essentially a list of commands each of which should not need a "key". So I am not sure if it should be parsed into an object. But what type should it be parsed into? Is there an API from maybe MI/GDB that decides what type should script be parsed into? Do you have any knowledge of this?

jonahgraham commented 8 months ago

You are correct - the MI that gdb produces here does not seem to match the GDB spec. That happens in various places in GDB and there is special handling in, for example, handleAsyncData (see comment in that method).

The java in Eclipse CDT tries to parse the contents of an object as a value before looking for a name=value pair. I suppose the same thing should happen here?

e.g. can xy={"valA", "valB"} be parsed into:

{
  "xy": ["valA", "valB"]
}

i.e. detect this case and treat it as if it were an array (handleArray)

But I am not sure how it should be represented. However, we don't really need to represent it perfectly as we don't need to process it further inside the adapter. We just want to make sure the parser doesn't fall over it. So we could parse my above example into this without loss of information even:

{
  "xy": {
    "0": "valA",
    "1": "valB"
  }
}
colin-grant-work commented 8 months ago

But I am not sure how it should be represented.

Perhaps as an array? That seems like the usual mapping of the concept of 'tuple' and is basically equivalent to an 'object with number keys.'

jonahgraham commented 8 months ago

Fixed with #313