svaarala / duktape

Duktape - embeddable Javascript engine with a focus on portability and compact footprint
MIT License
5.96k stars 516 forks source link

Provide a way to obtain captured scopes variables names in debug protocol #601

Open harold-b opened 8 years ago

harold-b commented 8 years ago

As was briefly discussed in #567.

A way to obtain at least the names of the variables captured from the outer scopes of a function on the call stack, via the debug client.

harold-b commented 8 years ago

I've been stepping through the way duk does function calls in an attempt to see if I can hook up some makeshift way of getting the outer scope var names on my local copy of the repo for the time being.

Am I following the right track by checking duk_js_getvar_activation ?

You mentioned elsewhere that some environments get delayed in getting created. But if I want to get the list of names from any call on the callstack while paused in the debugger, that means that they should all have a guaranteed enviro as long as they are ecmascript funcs, right?

Could you point me to the right direction?

fatcerberus commented 8 years ago

By the way, if you just want the locals from an arbitrary callstack entry, the GetLocals request can accept a callstack index as of 1.4.0. Accessing the variables inherited from outer scopes (i.e. closures) will likely require environment record access though, you're right.

harold-b commented 8 years ago

Yeah, I've got the locals, but I'm so eager on getting my hands on those closure keys!

harold-b commented 8 years ago

@svaarala I'm having a hard time figuring out how to enumerate the hash part of an object... How are the keys stored in the hash part?

harold-b commented 8 years ago

Nevermind, I had read the code incorrectly, got it now.

fatcerberus commented 8 years ago

@harold-b If anything comes of this, let me know, this would be incredibly useful for SSJ. :smile:

harold-b commented 8 years ago

Will do. :) I managed to get something going. I can enumerate the keys of the closures, but the global one has all kinds of keys, so I'm wondering if there's a way to filter out 'system' keys on the global object.

There's also some unimplemented which are cases that aren't being hit yet, but so far so good!

harold-b commented 8 years ago

Here's what I've got so far... I'm getting the external vars but once I get to the global obj, it's a mess :scream:

svaarala commented 8 years ago

Cool :) There's actually not that big of a difference between a global property and a global variable. You could maybe use "enumerable" attribute for filtering:

// Most built-ins are not enumerable:
duk> Duktape.enc('jx', Object.getOwnPropertyDescriptor(this, 'print'), null, 4)
= {
    value: {_func:true},
    writable: true,
    enumerable: false,
    configurable: true
}

// Global variables declared using 'var' are enumerable but not configurable:
duk> var foo = 123;
= undefined
duk> Duktape.enc('jx', Object.getOwnPropertyDescriptor(this, 'foo'), null, 4)
= {
    value: 123,
    writable: true,
    enumerable: true,
    configurable: false
}

// Global variables created by single assignment are enumerable and configurable:
duk> bar = 234;
= 234
duk> Duktape.enc('jx', Object.getOwnPropertyDescriptor(this, 'bar'), null, 4)
= {
    value: 234,
    writable: true,
    enumerable: true,
    configurable: true
}
svaarala commented 8 years ago

To clarify the above a bit, once a global variable or property has been created, Duktape doesn't "remember" whether it's a variable or a property. For example, it's possible to create global object properties using Object.defineProperty() which will be identical to properties established using var x = ....

But user code also lacks this distinction, and it's by design in Ecmascript, so it's a real phenomenon rather than a side effect of how things are currently implemented :)

harold-b commented 8 years ago

That's awesome! thanks a lot. I messed with it and filtered some for now based on some native flag. But, just for testing purposes. While debugging the debugger, I saw that the nodeJS debugger in VS Code displays all globals as I received them from duktape... So I think I'm gonna leave it like that as well. Or rather make it optional. The front end gives asks you if a specific closure is heavy, so it should work out.

It's working great! I just need to ask you some questions on the native side to verify it's being done correctly. Right now I'm not inspecting proxy objects which if I read your comments correctly, must be supported for Object environment records.

svaarala commented 8 years ago

Sounds good, and it's awesome you just went ahead and started hacking :) There are a lot small details to account for when doing internal stuff, I'll be happy to look them over.

For property reads object environment records trigger two things which may need special handling:

Well, there are other small issues too. For example, an object binding is bound to an object which has a looping prototype chain. Any variable lookups should deal with that properly, perhaps either forcing the debug client to walk that chain (as we do for object inspection) or having a sanity limit.

Regarding getters, the current limitation is that you can e.g. read get/set properties using the InspectHeapObject command and they won't be invoked as part of that, but there's no "run this getter" debugger command right now. This is tracked in #586.

svaarala commented 8 years ago

@harold-b So is your debugger functional now? :)

harold-b commented 8 years ago

@harold-b So is your debugger functional now? :)

Yes! check it out:

svaarala commented 8 years ago

Would it be OK to list in http://wiki.duktape.org/DebugClients.html? :)

harold-b commented 8 years ago

I still gotta rework a few things to use the InspectHeapObj way. But it's mostly there.

Would it be OK to list in http://wiki.duktape.org/DebugClients.html? :)

Sure be my guest :)

harold-b commented 8 years ago

I'm reading what you posted about the Proxy objects, I don't know anything about them yet. So thanks for all that info. Since I'm only looking for keys, will querying the keys of a Proxy also end up in a getter?

svaarala commented 8 years ago

Ok, so just a few question about that (just rechecking everything):

Screenshot links :)

svaarala commented 8 years ago

@harold-b Proxy will be invoked if you do e.g. duk_get_prop(). If you're walking through property tables manually proxies won't get involved.

There's currently no primitive, not even an internal one, for "get property without invoking proxy and/or getter". This would be quite useful but haven't had the time to work that in yet :)

harold-b commented 8 years ago

It's closed for now, but I'm most likely gonna open source it and release it as vs code extension after a little dog fooding. ( I just barely started it and put it together this last week )

I'm using the binary protocol only, no JSON.

I am using plain TCP, yes.

As for a description, it's just a VS Code debugger extension for duktape runtimes.

*Oh and I'll have to get back to you on the screenshots, let me finish hooking up the deep object inspection again and I'll take some tomorrow.

svaarala commented 8 years ago

Is there a project / client name I could use for the entry?

harold-b commented 8 years ago

The main purpose for it was to use it to develop games and debug remotely on mobile clients, I'll use it with my own little framework, called Musashi. But it's not a public project, just a personal game framework.

svaarala commented 8 years ago

Hmm, so there's no project link or home page or such. Would it be OK if I listed it as:

It's really up to you, and it's fine if listing it doesn't make sense at this point :) I mainly wanted for people looking at the debug clients to see that there's already a VS Code integration.

BTW, there's another project for VS Code too: https://github.com/vktr/vscode-duk-debug. That's using the JSON proxy as far as I know but it's not yet functional.

harold-b commented 8 years ago

Ha, had no idea someone else had already done some work on one for VS Code.

Feel free to put it up now or hold off, I don't mind either way. I suppose there's not any public info right now. As I mentioned I literally just wrote it over this week.

I could take a preliminary screenshot now if what you want to show is that there's a working VS Code implementation.

harold-b commented 8 years ago

Actually, let me get back to you tomorrow. I'm about to step away for the day in a moment.

svaarala commented 8 years ago

It's quite encouraging you did it in a week by the way :) That was roughly the design goal originally, more specifically:

I'm listing clients partially because the whole design is geared towards mixing and matching debuggers and targets, which is one key reason why the debug protocol is terminated by Duktape. The other major design direction would have been a debug API, which then requires you to design your own protocol, but that would have resulted in a very fragmented debugger landscape so I wanted to avoid that quite a bit.

svaarala commented 8 years ago

Sure, talk to you later :)

harold-b commented 8 years ago

It's quite encouraging you did it in a week by the way :)

Yeah, that's because someone's documentation is thorough and top notch! I think you quite succeeded in your goals. Thanks again for the awesome library, I'm really eager to start using it.

Talk to you soon!

fatcerberus commented 8 years ago

One should be able to implement a debug client "in weeks" rather than months. The binary protocol has been a big obstacle for this which is why there's a debug proxy which I'd recommended initially. Working with the binary protocol outside of C is a bit easier though.

I've found the opposite to be true: It's much easier to work with binary in C because the language is already so low-level. I just threw all the dvalue parsing into one file (note how simple dvalue_send() and dvalue_recv() are): https://github.com/fatcerberus/minisphere/blob/master/src/ssj/dvalue.c

Everything else is nice and high-level. You've already seen my communication code. :)

But yes, SSJ development started on Jan 9 and it's already very useful as an everyday debugger. The Duktape debug setup is surprisingly powerful for something so simple (and infinitely moreso now that AppRequest is in ;)!

svaarala commented 8 years ago

Agree there, if one writes a debug client in C then JSON is probably more work than a binary protocol. The optimal binary protocol for reading/writing in C would probably be much more struct oriented than the current dvalue protocol though: explicit type fields, explicit length fields, etc.

Most of the difficulty I've seen in C debug clients has been related to incorrect parsing assumptions, e.g.: assuming that once you get a length field you can read all the data, or that if a dvalue has multiple fields, that all of them will be available at once, assuming short forms on receive when longer forms are also allowed, etc. These are quite common issues in other protocols too, and are easier to avoid with explicit length fields and such.

Anyway, most debug clients are written in high level languages than C though, and the JSON alternative is intended for those. Interacting with JSON in Javascript is definitely easier than interacting with dvalues :)

harold-b commented 8 years ago

Hey @svaarala, Let me get back to you during the week on the screenshots etc. I decided to simply upload the repo early, and then you can simply link to it, without any ambiguity. I just need to refactor some code still before the initial upload. I'll let you know sometime this week.

svaarala commented 8 years ago

Sounds good :)

harold-b commented 8 years ago

I have a question about walking the record chain... What I'm doing currently is mimicking the path taken by the "DUK_OP_GETVAR" instruction in order to enumerate the variables captured by a function. It's working pretty well, but some vars are not showing up.

Please have a look at the red rectangles :

Any variable declared after the anonymous function is not showing up in the closure scope... Off the top of your head, do you know what I might be omitting that is not enumerating that var?

svaarala commented 8 years ago

It should be "hoisted" normally - could you check what the converted Javascript file looks like?

harold-b commented 8 years ago

Sorry about that, I had to step out for a bit.

Here's the transpiled js code:

 Game.prototype.update = function update() {
        var foo = "FOO";
        var bar = "BAR";
        var hello = "HELLO";
        var game = _game;
        var f = function () {
            var z = a;
            var fooz = "FOOZ";
            var barz = bar;
            var helloz = hello;
        };
        var a = "hi there.";
        f();
        if (this._paused)
            return;
        var x = _x + 1;
        _x = x;
        var oldTick = this._tick;
        this._tick++;
        this.log("ticked");
    };
svaarala commented 8 years ago

Ok, so "z" will become "hi there" but you don't see "a" in the environment object?

harold-b commented 8 years ago

Yes, it does become "hi there". ( I should have stepped once more in the screenshot... ) Everything works correctly on the JS side. I'm just missing those vars when enumerating the vars from the env records.

There's one path that I didn't currently implement, though, which is the aforementioned proxy objects.

svaarala commented 8 years ago

That shouldn't be needed here. Could you show how you're walking the environment record object?

harold-b commented 8 years ago

Yup, give me a sec.

harold-b commented 8 years ago

First it checks the registers

/// See: duk__get_identifier_reference for reference

    env = act->lex_env;

    if( env == NULL )
    {
        func = DUK_ACT_GET_FUNC( act );
        DUK_ASSERT( func != NULL );
        DUK_ASSERT( DUK_HOBJECT_HAS_NEWENV( func ) );
        DUK_ASSERT( DUK_HOBJECT_IS_COMPILEDFUNCTION( func ) );

        tv = duk_hobject_find_existing_entry_tval_ptr( thr->heap, func,
                DUK_HTHREAD_STRING_INT_VARMAP( thr ) );
        DUK_ASSERT( tv );
        if( !tv )
            return;

        DUK_ASSERT( DUK_TVAL_IS_OBJECT( tv ) );
        varmap = DUK_TVAL_GET_OBJECT( tv );
        DUK_ASSERT( varmap != NULL );

        duk__debug_dump_obj_keys(thr, heap, varmap);
        duk_debug_write_int(thr, 0);    /* scope end marker */

        tv = duk_hobject_find_existing_entry_tval_ptr( thr->heap, func, DUK_HTHREAD_STRING_INT_LEXENV( thr ) );
        if( tv ) {
            DUK_ASSERT( DUK_TVAL_IS_OBJECT( tv ) );
            env = DUK_TVAL_GET_OBJECT( tv );
        }
        else {
            DUK_ASSERT( duk_hobject_find_existing_entry_tval_ptr( thr->heap, func, DUK_HTHREAD_STRING_INT_VARENV( thr ) ) == NULL );
            env = thr->builtins[DUK_BIDX_GLOBAL_ENV];
        }
    }

    *penv = env;

Then it walks the proto chain:

duk_uint_t sanity;  

    /// Enumerate all keys find along the prototype chain
    sanity = DUK_HOBJECT_PROTOTYPE_CHAIN_SANITY;
    while (env != NULL) {
        duk_small_int_t cl;
        duk_int_t attrs;

        DUK_ASSERT(DUK_HOBJECT_IS_ENV(env));
        DUK_ASSERT(!DUK_HOBJECT_HAS_ARRAY_PART(env));

        cl = DUK_HOBJECT_GET_CLASS_NUMBER(env);
        DUK_ASSERT(cl == DUK_HOBJECT_CLASS_OBJENV || cl == DUK_HOBJECT_CLASS_DECENV);
        if (cl == DUK_HOBJECT_CLASS_DECENV) {
            /*
             *  Declarative environment record.
             *
             *  Identifiers can never be stored in ancestors and are
             *  always plain values, so we can use an internal helper
             *  and access the value directly with an duk_tval ptr.
             *
             *  A closed environment is only indicated by it missing
             *  the "book-keeping" properties required for accessing
             *  register-bound variables.
             */

            while ( !DUK_HOBJECT_HAS_ENVRECCLOSED(env)) {

                /// Reference: duk__getid_open_decl_env_regs

                /* make sure it's open */
                duk_hobject *env_func;
                duk_hobject *varmap;
                duk_tval *tv;

                DUK_ASSERT( DUK_HOBJECT_IS_DECENV( env ) );

                tv = duk_hobject_find_existing_entry_tval_ptr( thr->heap, env, DUK_HTHREAD_STRING_INT_CALLEE( thr ) );
                if( !tv ) {
                    /* env is closed, should be missing _Callee, _Thread, _Regbase */
                    break;
                }

                    DUK_ASSERT(DUK_TVAL_IS_OBJECT(tv));
                    DUK_ASSERT(DUK_TVAL_GET_OBJECT(tv) != NULL);
                    DUK_ASSERT(DUK_HOBJECT_IS_COMPILEDFUNCTION(DUK_TVAL_GET_OBJECT(tv)));
                env_func = DUK_TVAL_GET_OBJECT(tv);
                    DUK_ASSERT(env_func != NULL);

                tv = duk_hobject_find_existing_entry_tval_ptr(thr->heap, env_func, DUK_HTHREAD_STRING_INT_VARMAP(thr));
                if (!tv) {
                    break;
                }

                    DUK_ASSERT(DUK_TVAL_IS_OBJECT(tv));
                varmap = DUK_TVAL_GET_OBJECT(tv);
                    DUK_ASSERT(varmap != NULL);

                duk_debug_write_int(thr, 0);    /* scope start marker */
                duk__debug_dump_obj_keys(thr, heap, varmap);
                duk_debug_write_int(thr, 0);    /* scope end marker */

                break;
                /* already closed */
            }
        } else {
            /*
             *  Object environment record.
             *
             *  Binding (target) object is an external, uncontrolled object.
             *  Identifier may be bound in an ancestor property, and may be
             *  an accessor.  Target can also be a Proxy which we must support
             *  here.
             */

            /* XXX: we could save space by using _Target OR _This.  If _Target, assume
             * this binding is undefined.  If _This, assumes this binding is _This, and
             * target is also _This.  One property would then be enough.
             */
            duk_tval *tv_target;
            duk_hobject *target;
            duk_bool_t found;

                DUK_ASSERT(cl == DUK_HOBJECT_CLASS_OBJENV);

            tv_target = duk_hobject_find_existing_entry_tval_ptr(thr->heap, env, DUK_HTHREAD_STRING_INT_TARGET(thr));
                DUK_ASSERT(tv_target != NULL);
                DUK_ASSERT(DUK_TVAL_IS_OBJECT(tv_target));
            target = DUK_TVAL_GET_OBJECT(tv_target);
                DUK_ASSERT(target != NULL);

            /* Target may be a Proxy or property may be an accessor, so we must
             * use an actual, Proxy-aware hasprop check here.
             *
             * out->holder is NOT set to the actual duk_hobject where the
             * property is found, but rather the object binding target object.
             */

            if (DUK_HOBJECT_HAS_EXOTIC_PROXYOBJ(target)) {

                /// TODO: This
                DUK_ASSERT( 0 );

                //found = duk_hobject_hasprop(thr, tv_target, &tv_name);
            } else {
                /* XXX: duk_hobject_hasprop() would be correct for
                 * non-Proxy objects too, but it is about ~20-25%
                 * slower at present so separate code paths for
                 * Proxy and non-Proxy now.
                 */
                duk__debug_dump_obj_keys(thr, heap, target );
                duk_debug_write_int(thr, 0);    /* scope end marker */
            }

        }

        if (sanity-- == 0) {
            DUK_ERROR(thr, DUK_ERR_INTERNAL_ERROR, DUK_STR_PROTOTYPE_CHAIN_LIMIT);
        }
        /* go up the hierarchy */
        env = DUK_HOBJECT_GET_PROTOTYPE(thr->heap, env);

And duk__debug_dump_obj_keys looks like this:

DUK_LOCAL void duk__debug_dump_obj_keys(duk_hthread *thr, duk_heap *heap, duk_hobject *obj) {

    if( DUK_LIKELY( DUK_HOBJECT_GET_HSIZE(obj) == 0 ) )
    {
        /* Linear scan: more likely because most objects are small.
        * This is an important fast path.
        *
        * XXX: this might be worth inlining for property lookups.
        */
        duk_uint_fast32_t i;
        duk_uint_fast32_t n;
        duk_hstring **h_keys_base;
        DUK_DDD( DUK_DDDPRINT( "duk_hobject_find_existing_entry() using linear scan for lookup" ) );

        h_keys_base = DUK_HOBJECT_E_GET_KEY_BASE(heap, obj);
        n = DUK_HOBJECT_GET_ENEXT( obj );
        for( i = 0; i < n; i++ ) {
            duk_debug_write_hstring( thr, h_keys_base[i] );
        }
    }
    else
    {         
        /* hash lookup */
        duk_uint32_t i, n;
        duk_uint_t e_idx, h_idx;
        duk_uint32_t *h_base;
        duk_hstring* key;
        duk_tval* tv;
        duk_int_t flags;

        h_base = DUK_HOBJECT_H_GET_BASE(heap, obj);
        n = DUK_HOBJECT_GET_HSIZE(obj);

        for( i = 0; i < n; i++ ) {
            DUK_ASSERT( i < DUK_HOBJECT_GET_HSIZE( obj ) );
            duk_uint_t h_idx = DUK_HOBJECT_H_GET_INDEX( heap, obj, i );

            if( h_idx == DUK_HOBJECT_HASHIDX_UNUSED ) {

            }
            else if( h_idx == DUK_HOBJECT_HASHIDX_DELETED ) {

            }
            else {
                e_idx = h_base[i];
                key = DUK_HOBJECT_E_GET_KEY(heap, obj, h_base[i]);

                tv = DUK_HOBJECT_E_GET_VALUE_TVAL_PTR(heap, obj, e_idx);
                flags = DUK_HOBJECT_E_GET_FLAGS(heap, obj, e_idx);

                duk_debug_write_hstring(thr, key);
            }
        }

    }
}

As you can see it's mostly me extracting your code from other parts and bastardizing it...

svaarala commented 8 years ago

That hash part is not needed at all by the way: you can always just enumerate the keys from the entry part. The hash part is an optional, redundant structure for large objects.

That enumeration code doesn't look wrong, though you should skip NULL keys (which are possible in the middle of the entry part). Could you debug print what that loop sees when you inspect the closure?

harold-b commented 8 years ago

OK give me moment.

harold-b commented 8 years ago

... Nevermind... It was a bug on the message translator for that message type on the client side.... :runner:

It was reporting them all properly, it just so happens the bug made it stop on the var right after the function.

harold-b commented 8 years ago

But now that we're talking about this, you mentioned skipping the hash part... How do I enumerate from the entry part?

svaarala commented 8 years ago

The first if block (when h_size == 0) already does that.

harold-b commented 8 years ago

Oh, well it was failing on the global object

svaarala commented 8 years ago

Did you skip NULL keys?

harold-b commented 8 years ago

I guess I had the Hash option enabled by defualt. So I added the hash part.

harold-b commented 8 years ago

Yes I just added the skip now. But it was correctly reporting all of them. The bug was on the debugger client! I was reading that message wrong.

svaarala commented 8 years ago

Ok good :) But with the NULL key check, do you still have an issue with the global object?