Add a way to enumerate references held by "lambda-like" functions?

HouQiming commented 9 years ago

When debugging memory leaks, currently we can almost dump the entire duktape heap by calling duk_enum recursively. However, it seems that duk_enum cannot enumerate into the parent frames held by function objects. Is it possible to provide a way to dump that? Or alternatively, is it possible to provide a dump of the GC sweeping process that provides some information of non-collected objects?

Why I'm requesting this

Recently I'm debugging a memory leak of my (clumsily written) duktape-enabled program. The situation looks like this:

var g_menu;
var RenderFrame=function(){
    var bkmenu=g_menu;
    g_menu=new Menu();
    /////////////////////////////////
    g_menu.callbacks=[]
    for(var i=0;i<g_menu.n;i++){
        var obj=CreateSomeotherObject();
        var fOnHotkeyX=function(){
            obj.DoSomething();
        }
        g_menu.callbacks.push(fOnHotkeyX);
    }
};

for(;;){
    RenderFrame();
    Duktape.gc();
    ReadMyOwnProcessMemoryAndDumpIt();
}

From the brute force memory dump generated after calling Duktape.gc(), I'm seeing thousands of Menu-related objects with non-zero reference count. It seems that after repeatedly calling RenderFrame, there is an infinite reference chain:

g_window -> Menu -> fOnHotkeyX -> RenderFrame -> bkmenu -> Menu -> ...

I'm not sure how duktape currently handles local frames and I'm not sure whether this is indeed causing the memory leak I'm debugging. However, I'd really prefer if I could find out this without reverse engineering the memory dump to the relevant duk_hobject structs, e.g., if I could dump a list of references held by fOnHotkeyX.

svaarala commented 9 years ago

There's currently no way to enumerate the parent frames using duk_enum() because the references are internal. But you can dump the whole heap using the debugger DumpHeap command. You can either implement a debug client of your own (not trivial) or use the existing web UI, in which case you can get a JSON dump of the heap for inspection.

Scope handling is rather simplistic right now, and inner functions keep a reference to the outer function scope. This currently includes all variables of the outer scope, not just those referenced by the inner function which (if unexpected) may look like a leak. There's an open issue tracking an improvement: #225.

In the meanwhile, a working but of course not preferred workaround is not assign "null" to the outer function variables not needed by inner functions before the outer function exits.

HouQiming commented 9 years ago

I'm embedding the big duktape.c in a bigger program. Is it possible to set up the web UI in that case? My scripts won't run in the command line tool.

Meanwhile, I'll go for the assign-null workaround. Once clarified, the current scope handling is quite intuitive (the previous interpreter I use has the same issue). I'd just have to be more careful with inner functions. Just to make it more clear, would any temp register "linger" in the scope after the outer function returns?

svaarala commented 9 years ago

For the debugger to work you need to implement a debug transport which is basically a two-way byte stream with no understanding what goes inside it. The "duk" command uses a TCP transport, for example. Once that's in place, the existing debugger web UI should be able to connect to the target.

When the outer function exits, values from value stack registers are copied into an explicit scope object where a variable name is mapped to a value; it's similar to a "with" binding. Temporary registers won't be copied so they'll be reliably released on outer function exit.

HouQiming commented 9 years ago

Got it. Guess I'll have to make a Windows port of duk_trans_socket.c the next time I debug a memory leak...

Good to know that only named variables are held in the scope.

svaarala commented 9 years ago

The necessary Windows changes are actually sitting a pull request because the pull breaks Linux compilation etc. I'll try to see if I can cherry pick them into master for 1.4.0 release.

HouQiming commented 9 years ago

Oh. If there is actually going to be a Windows port in 1.4.0, is it possible to add a flag in duk_config.h to auto-attach the TCP transport whenever you create a heap? That would be much more convenient for people who don't want to customize the debugging transport.

svaarala commented 9 years ago

The transport itself is outside of Duktape so there's unfortunately no control flow to do that. The debug transport is attached using duk_debugger_attach() which must be called from the application.

HouQiming commented 9 years ago

Read #225. So the debugger is interactive and can actually pause duktape.

Maybe I could write a dummy transport layer that attaches, sends a DumpHeap command, prints the result then detaches? Is this possible? Would I need to call something in-between to make duktape read the command?

svaarala commented 9 years ago

Going back a little bit, integrating with the debugger support might be somewhat of an overkill if you're just preparing to resolve future memory usage issues :)

But yes, you can attach a virtual transport locally and implement a debug client in the same process. There's an example transport which does that; it decodes the debugger protocol "dvalues" and allows locally running code to talk with the debugger:

https://github.com/svaarala/duktape/tree/master/examples/debug-trans-dvalue

You're still "speaking" the debugger protocol though, so it's not a trivial integration if you just want to walk the scope objects.

svaarala commented 9 years ago

Forgot to answer: you can make the Duktape debugger code "pull in" the request by calling duk_debugger_cooperate() once you're attached. That API call is intended for event loops and such, so that debugger commands can be processed in a single thread without blocking.

HouQiming commented 9 years ago

That does sound a bit daunting. Guess I'll stick to my alloc_func instrumentation for now.

Is there any plan to expose struct duk_hobject*-related APIs in a later version? Right now I'm dumping the leaked-object internals using hardcoded offsets, but it would nice if I could pass it back to JS and analyze it there.

svaarala commented 9 years ago

Exposing the internals would be technically easy - but it would have the huge downside that they would become part of the API compatibility guarantee and very difficult to change as needed. For example, I'm working on some low memory stuff right now which probably causes a change how a function instance's external scope reference is represented. This change would need to wait for a major version bump if the current details were exposed.

So the short answer is unfortunately no. For scopes specifically, having the ability to manipulate the scope of function objects may be needed for other reasons (e.g. #424), so an API which would have helped with this issue specifically might come out of that. But there can of course be bugs / apparent memory leaks which are unrelated to scopes.

HouQiming commented 9 years ago

I see. It would be great if we could access those external scopes in the long run.

svaarala / duktape

Add a way to enumerate references held by "lambda-like" functions? #436

Why I'm requesting this