Haiyang-Sun / nodeprof.js

Instrumentation framework for Node.js compliant to ECMAScript 2020 based on GraalVM.
Apache License 2.0
53 stars 22 forks source link

return/throw callback support #37

Open mwaldrich opened 5 years ago

mwaldrich commented 5 years ago

Is there an ETA for implementation of these callbacks? My team would like to use these in our analysis.

mwaldrich commented 5 years ago

I think it's also important to note that functionExit/invokeFun aren't sufficient for knowing what line of code returns. This is because try/finally blocks can execute code after an expression is returned.

For example, the given analysis used on the given code makes it impossible to tell which line of code triggers the return.

analysis.js:

// DO NOT INSTRUMENT
(function (sandbox) {
    function MyAnalysis() {

        /**
         * These callbacks are called before and after a function, method, or constructor invocation.
         **/
        this.invokeFunPre = function (iid, f, base, args, isConstructor, isMethod, functionIid, functionSid) {
            console.log("invokeFunPre");
        };
        this.invokeFun = function (iid, f, base, args, result, isConstructor, isMethod, functionIid, functionSid) {
            console.log("invokeFun");
        };

        /**
         * These callbacks are called before the execution of a function body starts and after it completes.
         **/
        this.functionEnter = function (iid, f, dis, args) {
            console.log("functionEnter");
        };
        this.functionExit = function (iid, returnVal, wrappedExceptionVal) {
            console.log("functionExit");
        };

        /**
         * These callbacks are called after a variable is read or written.
         **/
        this.read = function (iid, name, val, isGlobal, isScriptLocal) {
            console.log("read " + val);
        };
        this.write = function (iid, name, val, lhs, isGlobal, isScriptLocal) {
            console.log("write " + name + " = " + val);
        };
    }

    sandbox.analysis = new MyAnalysis();
})(J$);

function.js:

let a = 0;

function example(arg1, arg2) {
    try {
        return arg1;
    } finally {
        a = 1;
        arg2;
    }
}

example(10, 20);

mx jalangi --analysis analysis.js function.js:

functionEnter
functionExit
functionEnter
write example = function example(arg1, arg2) {
    try {
        return arg1;
    } finally {
        a = 1;
        arg2;
    }
}
write a = 0
read function example(arg1, arg2) {
    try {
        return arg1;
    } finally {
        a = 1;
        arg2;
    }
}
invokeFunPre
functionEnter
read 10
write a = 1
read 20
functionExit
invokeFun
functionExit

Ideally, listening to the return callback would give us:

functionEnter
functionExit
functionEnter
write example = function example(arg1, arg2) {
    try {
        return arg1;
    } finally {
        a = 1;
        arg2;
    }
}
write a = 0
read function example(arg1, arg2) {
    try {
        return arg1;
    } finally {
        a = 1;
        arg2;
    }
}
invokeFunPre
functionEnter
read 10
return 10 // NEW CALLBACK
write a = 1
read 20
functionExit
invokeFun
functionExit
mwaldrich commented 5 years ago

Any update on this feature?

Haiyang-Sun commented 5 years ago

The ThrowNode has a branch tag, which I can hook in NodeProf. However there is not yet any specific tag for the ReturnNode.

@eleinadani is it possible to add a new return tag for ReturnNode?

eleinadani commented 5 years ago

couldn't you simply intercept all throw/return values via onReturnExceptional and by instrumenting the function roots? One problem with ReturnNode is that it's not used everywhere, as we sometimes store return values in a frame slot for opt

Haiyang-Sun commented 5 years ago

It is about the comment from @mwaldrich above

I think it's also important to note that functionExit/invokeFun aren't sufficient for knowing what line of code returns. This is because try/finally blocks can execute code after an expression is returned.

Existing value returned or thrown do not reveal the line of code which returns or throws.

alexjordan commented 5 years ago

Instrumentation support for return has been merged upstream in https://github.com/graalvm/graaljs/commit/e4bd649456362e66243b048cdb07993222c85696

mwaldrich commented 5 years ago

Thanks for the implementation! I'm starting to use this in my analysis now.

I noticed this callback doesn't provide the function that the code is returning from. While it is possible for me to maintain a call stack in my analysis, it would be much more efficient to re-use the call stack in Graal.

In addition, this change would make the _return callback consistent with the other function-related callbacks (invokeFun, functionEnter), since they pass the function as an argument as well.

Haiyang-Sun commented 5 years ago

@mwaldrich this is doable, I can add support for this.

alexjordan commented 5 years ago

I would advise against this: executing a return statement does not imply that the function indeed exits (functionExit does). _return is more closely related to control-flow than function, so adding the containing function is misleading IMO.

Consider this example:

 try {
   throw Error("with finally");
   return 'nope';
 } catch(e) {
   return 42;
 } finally {
   return 43;
 }

The _return callback will trigger like this:

> return (src/ch.usi.inf.nodeprof.test/js/minitests/exitException.js:53:4:53:14) val returns 42
> return (src/ch.usi.inf.nodeprof.test/js/minitests/exitException.js:55:4:55:14) val returns 43
mwaldrich commented 5 years ago

Yes, it wouldn't make sense to model a callstack using the _return callback, because it does not signal the exiting of a function. But since I am trying to avoid modeling the callstack altogether, I wasn't trying to use _return for this.

I wanted the function in the _return callback just so I could associate function invocations with the values they returned.

If I wanted to do this with the current callback, I would need to model a callstack in my analysis (using functionEnter and functionExit), and associate the _return callback with the function on the top of the stack. If the _return callback just gave the surrounding function as an argument, it would eliminate the need for me to model the callstack.

However, I could see how people might assume that _return does always signal the exit of a function. Maybe we could specify this in the documentation of the _return callback?