lantiga / ki

lisp + mori, sweet.js
ki-lang.org
MIT License
478 stars 28 forks source link

Source map accuracy #14

Open fasterthanlime opened 9 years ago

fasterthanlime commented 9 years ago

This may be a sweet.js issue but since I've only encoutered it with the ki compiler so far, I'd like to track it here if it's okay with you @lantiga.

Here's a simple test case: https://gist.github.com/fasterthanlime/4c348adeca79370bf58c

And here's the stack trace Chrome gives:

screen shot 2014-11-04 at 18 21 59

The file/line number is correct for problem4, problem3, problem2, but for problem and the top-level scope, they're completely wrong (863, 530).

It might not be a huge problem, it just means top-level code should always be wrapped in a function to get correct stack traces, but I was wondering if it's something the ki compiler do that throws off sweet.js source map generation.

fasterthanlime commented 9 years ago

Testing in Firefox 33.0.2 gives a similar stack trace:

screen shot 2014-11-04 at 18 25 06

fasterthanlime commented 9 years ago

Opera 24+ gives the exact same results as Chrome, unsurprisingly, and Safari does, too:

screen shot 2014-11-04 at 18 27 40

fasterthanlime commented 9 years ago

I think it does have something to do with the code structure the ki compiler generates. If the topmost function in the call stack is defined in vanilla JS, the stack trace is correct:

screen shot 2014-11-04 at 19 07 22

fasterthanlime commented 9 years ago

Throwing exceptions by hand isn't a problem, as one can just wrap it in a _throw function call to get a working stack trace. However, real exceptions thrown from other sources, such as calling a nil value, still have the "topmost stack trace item is nonsense". Running sample code through https://github.com/gfx/source-map-inspector gives the following:

screen shot 2014-11-04 at 19 22 37

(Removed ki require core for readability - also, source-map-inspector uses an old version of the source-map package that requires the 'file' field to be present so I had to add that by hand.)

Now I'm not source map expert but that does look pretty good... not sure why the stack trace isn't better than this.

Edit: another interesting tool to decode source maps is http://murzwin.com/base64vlq.html

lantiga commented 9 years ago

I've reproduced the issues. While the errors in the global scope could be related to the way the script passes code to sweet in ki.js, I can't understand is why problem is off. There's really nothing special in the way problem is treated at the ki level, the only difference with problem2, 3, etc, is that it returns an anonymous function.

I don't know a lot about source maps, the best thing to do would be to have a minimal sweet.js macro that reproduces the issue without all the ki baggage.

fasterthanlime commented 9 years ago

Alright. I've been reading up more on sweet.js to be able to understand and help with ki. I'll try writing the minimal set of macro/rules required to reproduce the issues without the ki baggage.

fasterthanlime commented 9 years ago

Here's a test case that makes it obvious what the real problem is:

screen shot 2014-11-05 at 17 54 10

Generated source maps refer to the macro definitions. Since in the case of ki, they are precompiled and passed as modules to the sweet.js compiler, it generates nonsense. Code in screenshot

fasterthanlime commented 9 years ago

Including ki's macros/index.js verbatim in a test ki program and compiling via sjs gives a similar result than the simple test above:

screen shot 2014-11-05 at 17 59 36

fasterthanlime commented 9 years ago

@lantiga I managed to isolate the problem, and opened https://github.com/mozilla/sweet.js/issues/411 to track it.

lantiga commented 9 years ago

Yes, that is definitely it, great job.

It's not entirely clear to me what should happen here. Optionally suppress the stack trace that goes into the module?

A workaround for now could be to prepend the ki macro code to the source code in ki.js and avoid using modules altogether - of course only when one specifies the -s option, so not in production. This way the stack traces would make sense. What do you think?

fasterthanlime commented 9 years ago

It's not entirely clear to me what should happen here. Optionally suppress the stack trace that goes into the module?

I think, when the code is macro-expanded using definitions from a module, the mapping should point to where the macro is used, not where it's defined. For ki in particular, when there's a problem in my code, I'd rather see a stack trace in my code, not in ki's definition of the throw _sexpr!

When macros are inline, the current behavior of sweet.js is acceptable. I'm not sure that it's a good idea to introduce a workaround in ki at this stage, let's maybe wait and see how hard the sweet.js bug is to fix?

lantiga commented 9 years ago

It's not entirely clear to me what should happen here. Optionally suppress the stack trace that goes into the module?

I think, when the code is macro-expanded using definitions from a module, the mapping should point to where the macro is used, not where it's defined. For ki in particular, when there's a problem in my code, I'd rather see a stack trace in my code, not in ki's definition of the throw _sexpr!

Definitely, I agree. Any source map should stop right before modules. My guess is that sweet is concatenating modules to code right before macroexpansion. I'll take a look as well.

When macros are inline, the current behavior of sweet.js is acceptable. I'm not sure that it's a good idea to introduce a workaround in ki at this stage, let's maybe wait and see how hard the sweet.js bug is to fix?

Sounds good to me.

fasterthanlime commented 9 years ago

My guess is that sweet is concatenating modules to code right before macroexpansion.

I don't think so. The line numbers pointed at by the source map do exist and they're simply from the module file. If you take a look at the 2nd example on my report: https://gist.github.com/fasterthanlime/7fddcf459a7352fdb324#file-smacro-sjs-L13 - you'll see that the "line 13" part of the stack trace is correct. Just from the wrong file.

lantiga commented 9 years ago

You're right again, now it makes sense.

fasterthanlime commented 9 years ago

Dumping the AST shows this:

screen shot 2014-11-05 at 22 21 12

As you can see, the throw statement is wrongly located (line 13), but its argument is correct (line 2).

Now to figure out exactly what leads to that.

Also, noting that sweet.js only retains line/column, not file/module.

(In case you're wondering, the above is a screenshot from the very handy http://jsonviewer.stack.hu/ )

lantiga commented 9 years ago

Should macros in modules be referenced in sourceMaps at all? All a source map should do is map the pre-expansion code to the post-expansion code.

So adding file/module information might not be what's required in this particular case.

fasterthanlime commented 9 years ago

@lantiga I agree. I've just stumbled upon sweet.js's "debug.js" file that allows stepping through the compilation process, hopefully I can gain a better understanding of how modules are handled in sweet.js this way.

fasterthanlime commented 9 years ago

My guess is that sweet is concatenating modules to code right before macroexpansion.

Actually it does, but on an AST level, not a source level:

    // fun (Str, {}) -> AST
    function parse(code, modules, options) {
        if (code === "") {
            // old version of esprima doesn't play nice with the empty string
            // and loc/range info so until we can upgrade hack in a single space
            code = " ";
        }

        modules = modules ? loadedMacros.concat(modules) : modules;
        return parser.parse(expand(code, modules, options));
    }

:koala: Code from sweet.js

fasterthanlime commented 9 years ago

Right now, seems to be a feature, not a bug, as adjustLineContext seems to show:

 function adjustLineContext(stx, original, current) {
        // short circuit when the array is empty;
        if (stx.length === 0) {
            return stx;
        }

        current = current || {
            lastLineNumber: stx[0].token.lineNumber || stx[0].token.startLineNumber,
            lineNumber: original.token.lineNumber
        };

        return _.map(stx, function(stx) {
            if (stx.token.type === parser.Token.Delimiter) {
                // handle tokens with missing line info
                stx.token.startLineNumber = typeof stx.token.startLineNumber == 'undefined'
                                                ? original.token.lineNumber
                                                : stx.token.startLineNumber
                stx.token.endLineNumber = typeof stx.token.endLineNumber == 'undefined'
                                                ? original.token.lineNumber
                                                : stx.token.endLineNumber
                stx.token.startLineStart = typeof stx.token.startLineStart == 'undefined'
                                                ? original.token.lineStart
                                                : stx.token.startLineStart
                stx.token.endLineStart = typeof stx.token.endLineStart == 'undefined'
                                                ? original.token.lineStart
                                                : stx.token.endLineStart
                stx.token.startRange = typeof stx.token.startRange == 'undefined'
                                                ? original.token.range
                                                : stx.token.startRange
                stx.token.endRange = typeof stx.token.endRange == 'undefined'
                                                ? original.token.range
                                                : stx.token.endRange

                stx.token.sm_startLineNumber = typeof stx.token.sm_startLineNumber == 'undefined'
                                                ? stx.token.startLineNumber
                                                : stx.token.sm_startLineNumber;
                stx.token.sm_endLineNumber = typeof stx.token.sm_endLineNumber == 'undefined'
                                                ? stx.token.endLineNumber
                                                : stx.token.sm_endLineNumber;
                stx.token.sm_startLineStart = typeof stx.token.sm_startLineStart == 'undefined'
                                                ?  stx.token.startLineStart
                                                : stx.token.sm_startLineStart;
                stx.token.sm_endLineStart = typeof stx.token.sm_endLineStart == 'undefined'
                                                ? stx.token.endLineStart
                                                : stx.token.sm_endLineStart;
                stx.token.sm_startRange = typeof stx.token.sm_startRange == 'undefined'
                                                ? stx.token.startRange
                                                : stx.token.sm_startRange;
                stx.token.sm_endRange = typeof stx.token.sm_endRange == 'undefined'
                                                ? stx.token.endRange
                                                : stx.token.sm_endRange;

                if (stx.token.startLineNumber !== current.lineNumber) {
                    if (stx.token.startLineNumber !== current.lastLineNumber) {
                        current.lineNumber++;
                        current.lastLineNumber = stx.token.startLineNumber;
                        stx.token.startLineNumber = current.lineNumber;
                    } else {
                        current.lastLineNumber = stx.token.startLineNumber;
                        stx.token.startLineNumber = current.lineNumber;
                    }

                }

                return stx;
            }
            // handle tokens with missing line info
            stx.token.lineNumber = typeof stx.token.lineNumber == 'undefined'
                                    ? original.token.lineNumber
                                    : stx.token.lineNumber;
            stx.token.lineStart = typeof stx.token.lineStart == 'undefined'
                                    ? original.token.lineStart
                                    : stx.token.lineStart;
            stx.token.range = typeof stx.token.range == 'undefined'
                                    ? original.token.range
                                    : stx.token.range;

            // Only set the sourcemap line info once. Necessary because a single
            // syntax object can go through expansion multiple times. If at some point
            // we want to write an expansion stepper this might be a good place to store
            // intermediate expansion line info (ie push to a stack instead of
            // just write once).
            stx.token.sm_lineNumber = typeof stx.token.sm_lineNumber == 'undefined'
                                        ? stx.token.lineNumber
                                        : stx.token.sm_lineNumber;
            stx.token.sm_lineStart = typeof stx.token.sm_lineStart == 'undefined'
                                        ? stx.token.lineStart
                                        : stx.token.sm_lineStart;
            stx.token.sm_range = typeof stx.token.sm_range == 'undefined'
                                    ? stx.token.range.slice()
                                    : stx.token.sm_range;

            // move the line info to line up with the macro name
            // (line info starting from the macro name)
            if (stx.token.lineNumber !== current.lineNumber) {
                if (stx.token.lineNumber !== current.lastLineNumber) {
                    current.lineNumber++;
                    current.lastLineNumber = stx.token.lineNumber;
                    stx.token.lineNumber = current.lineNumber;
                } else {
                    current.lastLineNumber = stx.token.lineNumber;
                    stx.token.lineNumber = current.lineNumber;
                }
            }

            return stx;
        });
    }

:pandaface: Code from sweet.js_

It seems to be called in a few places, notably in expandMacro:

 function expandMacro(stx, context, opCtx, opType, macroObj) {
   // ...
   if(rt.result.length > 0) {
            var adjustedResult = adjustLineContext(rt.result, head);
            if (stx[0].token.leadingComments) {
                if (adjustedResult[0].token.leadingComments) {
                    adjustedResult[0].token.leadingComments = adjustedResult[0].token.leadingComments.concat(head.token.leadingComments);
                } else {
                    adjustedResult[0].token.leadingComments = head.token.leadingComments;
                }
            }
            rt.result = adjustedResult;
        }
}

:cat: Code from sweet.js

disnet commented 9 years ago

Right, as you've noticed the "real" lineNumber etc. have to be adjusted. Not sure if this is obvious from that code but the reason we have to do the adjustment is because of return:

// ...
function foo() {
  return m();
}

where m is some macro that expands to whatever. If the lineNumber of return and whatever m expands to aren't the same, esprima parses this as:

function foo() {
  return;
  // whatever
}

ASI is the worst.

But anyway the issue of the lineNumber for the source map coming from a different file is important. Here's the thing though, the current module implementation is sort of a hack and I doubt spending time fixing this issue is worth it. I'm currently working on landing "real" modules which should handle tracing the source maps to the actual file.

fasterthanlime commented 9 years ago

Ah, guess I can close node-debug and stop reading expander.js then :)

Thanks for the explanation! Eagerly awaiting "real modules" to land in a sweet.js release. Best of luck with that branch!

lantiga commented 9 years ago

:+1: