Generation of source maps

nippur72 commented 10 years ago

I don't know how hard this is to implement, but it would be a very cool feature to have source maps (.map files) generated along with the JavaScript file, in order to enable debugging directly from Visual Studio.

TypeScript has this feature, as do other compile-to-JavaScript languages like CoffeeScript.

About source maps, see http://www.html5rocks.com/en/tutorials/developertools/sourcemaps/

erik-kallen commented 10 years ago

I have tried to implement it before, and I have come to the same conclusion as Kevin Gadd (creator of jsil): the sourcemap spec is too fussy and underspecified to be useful for anything but the simplest scenarios. I have even thought about implementing my own source-level debugger (using Chrome's API) rather than trying to get that crap to work.

nippur72 commented 10 years ago

It would be a good starting point if you could provide an experimental "-map" compiler switch and output the source mapping in any format you prefer (e.g. a simple text dump of all source/emit locations for all statements and expressions).

Then we (I) can attempt to write a separate tool to convert from your format to true .map files. And once it's proved to work, it could be incorporated in the compiler.

nippur72 commented 10 years ago

Looking at the source code, this change does not seem easy at all to me: it looks like the whole JsModel needs to be integrated with map information coming from the AST. Is that right? Or is there some other, quicker way?

erik-kallen commented 10 years ago

No, it's not easy at all. What would be needed is to extend the JS model with

class JsSequencePointExpression : JsExpression {
    public int StartRow { get; }
    public int StartCol { get; }
    public int EndRow { get; }
    public int EndCol { get; }
    public string File { get; }
}

and insert such sequence points where desired. Then the OutputFormatter should process these sequence points by outputting some mapping information whenever it encounters such a sequence point expression.

Then, exactly how hard it is to do depends on which granularity is required for the output. Just mapping C# statement -> JS statement(s) can probably be done without too much of a problem; the problem is that the JS source map "specification" provides no guidance on this part.
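A rough sketch of how an output formatter could process sequence points: the writer tracks the current row and column of the generated script and records a mapping whenever it is handed a sequence point. All type and member names here (SequencePoint, MappingEntry, TrackingWriter, Mark) are hypothetical and are not the actual Saltarelle JS model or OutputFormatter.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical types for illustration only; not the real Saltarelle classes.
public sealed class SequencePoint {
    public string File { get; }
    public int StartRow { get; }
    public int StartCol { get; }

    public SequencePoint(string file, int startRow, int startCol) {
        File = file;
        StartRow = startRow;
        StartCol = startCol;
    }
}

public sealed class MappingEntry {
    public int ScriptRow { get; }
    public int ScriptCol { get; }
    public SequencePoint Source { get; }

    public MappingEntry(int scriptRow, int scriptCol, SequencePoint source) {
        ScriptRow = scriptRow;
        ScriptCol = scriptCol;
        Source = source;
    }
}

public sealed class TrackingWriter {
    private readonly StringBuilder _output = new StringBuilder();
    private int _row = 1; // one-based position of the next character to be written
    private int _col = 1;

    public List<MappingEntry> Mappings { get; } = new List<MappingEntry>();

    // Called when a sequence point is encountered: remember where the
    // next emitted character will land in the generated script.
    public void Mark(SequencePoint sequencePoint) {
        Mappings.Add(new MappingEntry(_row, _col, sequencePoint));
    }

    // Appends text to the script while keeping the row/column counters in sync.
    public void Write(string text) {
        foreach (char c in text) {
            _output.Append(c);
            if (c == '\n') { _row++; _col = 1; }
            else { _col++; }
        }
    }

    public override string ToString() {
        return _output.ToString();
    }
}
```

Only the start of each source span is recorded here, which matches the statement-level granularity discussed above; end positions would need an extra field.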

nippur72 commented 10 years ago

I was looking at TypeScript's source code and it has a decent class for handling source maps. Would it help if I converted it to C#? Just as a starting point.

erik-kallen commented 10 years ago

Perhaps. I don't know exactly what the source map handling does. We still need the sequence points. And we need to be able to decode the "spec" to find out exactly what they mean by many things: WTF is the meaning of the names array and the indexing into it? How do we specify both the start and end columns of the generated script? How does it all fit together?

nippur72 commented 10 years ago

My insane idea is to save us from understanding the actual implementation details by reusing the code TypeScript is using for its compiler.

I've isolated the source map part and converted it to C# in this repo, SourceMaps.CSharp, if you want to have a look at it. It's not working yet, but I intend to spend some time studying it and eventually modify it for our needs.

All seems to be done by the Emitter class with methods like Emitter.recordSourceMappingSpanStart(ISpan ast) and similar. The emitter writes to the SourceMapper class and, once all files have been processed, you can get the .map files with SourceMapper.emitSourceMapping().

I'll tell you more as soon as I understand better how it works.

nippur72 commented 9 years ago

Good news: I've converted the code from dart2js so now I have a working C# class for handling javascript source maps. And it is rather easy to use it.

From what I've understood so far, names specified in the map files are used only for function call stack tracing, apparently there is no other use (like referring to variable names and so on). Most of the time, names can be omitted.

So in the end, a map file is just a collection of (source location,dest location). Token "Length" is not necessary as it is calculated using previous and next map entries.
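To make the "collection of (source location, dest location)" idea concrete, here is a minimal sketch of how such pairs end up in the "mappings" string of a version 3 source map: each field is stored as a Base64 VLQ delta, segments on the same generated line are separated by commas, and generated lines by semicolons. This is not the dart2js-derived class mentioned above; the names Mapping and SourceMapEncoder are made up for the example, and names (the optional fifth field) are left out.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical container: one (generated position -> source position) pair, all zero-based.
public sealed class Mapping {
    public int GenLine { get; }
    public int GenCol { get; }
    public int SourceIndex { get; } // index into the map's "sources" array
    public int SourceLine { get; }
    public int SourceCol { get; }

    public Mapping(int genLine, int genCol, int sourceIndex, int sourceLine, int sourceCol) {
        GenLine = genLine;
        GenCol = genCol;
        SourceIndex = sourceIndex;
        SourceLine = sourceLine;
        SourceCol = sourceCol;
    }
}

public static class SourceMapEncoder {
    private const string Base64 =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Base64 VLQ: the sign goes into the lowest bit, then the value is emitted
    // in 5-bit groups, least significant first, with bit 6 as a continuation flag.
    public static string EncodeVlq(int value) {
        int vlq = value < 0 ? ((-value) << 1) | 1 : value << 1;
        var sb = new StringBuilder();
        do {
            int digit = vlq & 31;
            vlq >>= 5;
            if (vlq > 0) digit |= 32;
            sb.Append(Base64[digit]);
        } while (vlq > 0);
        return sb.ToString();
    }

    // Expects the mappings sorted by (GenLine, GenCol).
    public static string EncodeMappings(IEnumerable<Mapping> mappings) {
        var sb = new StringBuilder();
        int line = 0, prevGenCol = 0, prevSource = 0, prevSourceLine = 0, prevSourceCol = 0;
        bool firstInLine = true;
        foreach (Mapping m in mappings) {
            // A new generated line starts a new group and resets the column delta.
            while (line < m.GenLine) {
                sb.Append(';');
                line++;
                prevGenCol = 0;
                firstInLine = true;
            }
            if (!firstInLine) sb.Append(',');
            firstInLine = false;

            sb.Append(EncodeVlq(m.GenCol - prevGenCol));
            sb.Append(EncodeVlq(m.SourceIndex - prevSource));
            sb.Append(EncodeVlq(m.SourceLine - prevSourceLine));
            sb.Append(EncodeVlq(m.SourceCol - prevSourceCol));

            prevGenCol = m.GenCol;
            prevSource = m.SourceIndex;
            prevSourceLine = m.SourceLine;
            prevSourceCol = m.SourceCol;
        }
        return sb.ToString();
    }
}
```

For example, the two mappings (generated 0:0 to source 0:0) and (generated 1:4 to source 2:8), both in source file 0, encode to "AAAA;IAEQ". No length is stored anywhere, which is consistent with the observation that a token's extent is inferred from the neighbouring entries.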

All the work is done by the browsers/Visual Studio: if there are more or fewer mappings than necessary, they are "intersected" with legal breakpoints, so one can't go wrong.

I've written a small working example in the above mentioned repo SourceMaps.CSharp as well as some notes.

The code is not ready yet (it needs a deep refactoring). Can you please tell me how you want it to be called from the compiler? (assuming you want to integrate it into Saltarelle)

Basically there is a SourceMapBuilder class that is called with .addMapping(sourceloc,destloc,name) on every token, and a .build() method that returns a string with the map file to be written. The source location must include the source file name, as the source can span several different files.

Please let me know how you want to proceed.

erik-kallen commented 9 years ago

I think we'd better wait with this feature, as the whole shebang needs to be converted to Roslyn, a huge project which I don't know how long is going to take.

erik-kallen commented 9 years ago

@nippur72 If you want to help, please let me know and we can work on this feature together.

I envision a JsSequencePointStatement which the compiler will emit at certain points, and the OutputFormatter will use the information in this statement to record a map while emitting scripts.

nippur72 commented 9 years ago

I want to help, but consider my knowledge of the compiler is very limited.

The JsSequencePointStatement is a good idea, assuming that source maps do not need "span length" information (that is my current understanding). This is a crucial point because it lets us generate source maps without touching expression or statement generation. We only need to insert JsSequencePointStatements at expression/statement granularity.

As a starting point, I suggest we first make the compiler generate an intermediate text file (e.g. .map.txt) written by the OutputFormatter made like this:

sourcefilename,sourceoffset,destoffset
[...]

and also change the generated JavaScript by adding the line at the end:

//# sourceMappingURL=myapp.js.map

Then we can experiment on the .map.txt file with an external tool outside of the compiler in this stage.
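An external tool working on that intermediate file would first need to read it back. A minimal sketch of a line parser for the sourcefilename,sourceoffset,destoffset layout, assuming one entry per line and decimal offsets; DumpEntry and MapDump are hypothetical names.

```csharp
using System;
using System.Globalization;

// Hypothetical record for one line of the proposed .map.txt dump.
public sealed class DumpEntry {
    public string SourceFile { get; }
    public int SourceOffset { get; }
    public int DestOffset { get; }

    public DumpEntry(string sourceFile, int sourceOffset, int destOffset) {
        SourceFile = sourceFile;
        SourceOffset = sourceOffset;
        DestOffset = destOffset;
    }
}

public static class MapDump {
    public static DumpEntry ParseLine(string line) {
        // Split from the right so that a comma inside the file name does not break parsing.
        int last = line.LastIndexOf(',');
        int middle = last > 0 ? line.LastIndexOf(',', last - 1) : -1;
        if (middle < 0)
            throw new FormatException("Expected: sourcefilename,sourceoffset,destoffset");

        string file = line.Substring(0, middle);
        int sourceOffset = int.Parse(line.Substring(middle + 1, last - middle - 1), CultureInfo.InvariantCulture);
        int destOffset = int.Parse(line.Substring(last + 1), CultureInfo.InvariantCulture);
        return new DumpEntry(file, sourceOffset, destOffset);
    }
}
```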

nippur72 commented 9 years ago

Forgot to mention: there is a cool feature we are going to lose with source maps: hovering the mouse pointer over a variable name to get the related tooltip. This Visual Studio feature works only if the source code is pure JavaScript (and not C# driven by source maps).

erik-kallen commented 9 years ago

I think a good way of plugging this component into the compiler would be to create an interface like this:

interface ISourceMapGenerator {
    void EmitLocation(int scriptRow, int scriptCol, string sourceFile, int sourceStartLine, int sourceStartCol, int sourceEndLine, int sourceEndCol);
}

and have the output formatter call into this interface periodically. What do you think of that? Would it solve the problem?

nippur72 commented 9 years ago

yes, or better:

interface ISourceMapGenerator {
    void addMapping(int scriptOffset, string sourceFile, int sourceStartLine, int sourceStartCol, int sourceOffset);
    string build();
}

as there is no need for "end" information. The actual code I have in the SourceMapBuilder example is:

Uri sourceMapUri = Uri.parse("http://www.mysite.com/myapp.map");
Uri fileUri      = Uri.parse("http://www.mysite.com/myapp.js");
SourceMapBuilder sourceMapBuilder = new SourceMapBuilder(sourceMapUri, fileUri, target);
foreach(var e in sme) sourceMapBuilder.addMapping(e.targetOffset,e.sourceLocation);
String sourceMap = sourceMapBuilder.build(); 
SaveToFile(@"..\..\..\Website\myapp.js.map",sourceMap);

erik-kallen commented 9 years ago

What are the scriptOffset and sourceOffset parameters in your example?

nippur72 commented 9 years ago

It's the offset from the start of the file, in characters. It's equivalent to the line/column pair, but more convenient because it can be expressed with a single integer. The source code I have uses both, but of course if you have one you can calculate the other.

erik-kallen commented 9 years ago

It's not that easy to map between line/column and position. I thought they decided to keep those fields separate in the source map spec because they don't want issues with e.g. line-ending encodings.
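The line-ending concern can be illustrated with a small sketch that converts a character offset to a one-based line/column pair. It assumes that "\r\n", a lone "\n" and a lone "\r" each count as exactly one line break; a tool that makes a different assumption would produce different positions for the same offset. TextPosition is a hypothetical helper, not part of the compiler.

```csharp
using System;

public static class TextPosition {
    // Converts a zero-based character offset into a one-based (line, column) pair.
    public static void FromOffset(string text, int offset, out int line, out int column) {
        if (offset < 0 || offset > text.Length)
            throw new ArgumentOutOfRangeException(nameof(offset));

        line = 1;
        column = 1;
        for (int i = 0; i < offset; i++) {
            char c = text[i];
            // In a "\r\n" pair, let the '\n' account for the line break.
            if (c == '\r' && i + 1 < text.Length && text[i + 1] == '\n')
                continue;

            if (c == '\r' || c == '\n') {
                line++;
                column = 1;
            } else {
                column++;
            }
        }
    }
}
```

For the text "ab\r\ncd\nef", offset 5 (the character d) maps to line 2, column 2, and offset 7 (the character e) maps to line 3, column 1.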

erik-kallen commented 9 years ago

@nippur72 I just made a few preparations for you to check out on the source-maps branch.

There is a class called SourceMapGenerator that is ready to be filled in with working code. The RecordLocation method will be called for each sequence point, and the WriteSourceMap method will be called to actually write the source map to a file.

The source map will be written to a file filemap.js.map in the obj\ directory, and the .js file has a //# sourceMappingURL comment added to it.

Currently, the only two statements that generate sequence points (and therefore source mappings) are variable declarations and expressions.

nippur72 commented 9 years ago

ok, I started to work on it.

BTW, there seems to be a small bug, scriptColumn is always 1:

      int method(int b)
      {
         var a = 53;
         a=a*b;
           a=a/50;
              a=a/50;
                 a=a/50;
         return a;
      }

produces the following .map

(14,1) -> (Class1.cs, 12, 10)
(15,1) -> (Class1.cs, 13, 10)
(16,1) -> (Class1.cs, 14, 12)
(17,1) -> (Class1.cs, 15, 15)
(18,1) -> (Class1.cs, 16, 18)

erik-kallen commented 9 years ago

Fixed

nippur72 commented 9 years ago

a small problem: the browser has to locate the source C# files for displaying in the debugger, but in Saltarelle source files are under a separate project and thus not accessible by the browser.

Do you think we should copy the source files (replicating the directory structure) in the website folder?

For the moment I'm solving it by having a linked folder called sources pointing to the C# project folder. It's quick but not practical for deployment (assuming deploying source files makes sense). What do you suggest?

erik-kallen commented 9 years ago

No, copying the source files should not be the responsibility of the compiler. If you need this for testing purposes, you can create a post-build event to do it.

nippur72 commented 9 years ago

true, an xcopy *.cs /s as post compile event should be enough

erik-kallen commented 9 years ago

Additionally, I am not sure the sourceMappingURL directive should be emitted either. IMO it should be the responsibility of the user to either add this directive to the file during a post-build step or supply it in an HTTP header. For example, jQuery does not include that directive anymore.

nippur72 commented 9 years ago

It would be best if source maps (and the related sourceMappingURL directive) were generated only in the debug configuration (i.e. when the DEBUG symbol is defined).

erik-kallen commented 9 years ago

IMO it should probably depend on the "Debug Info" project setting. The problem with the sourceMappingURL directive is that in order for it to work, the script needs to know the (relative) URL to the .map file, which it is not certain that it does. The .map file might not be deployed to the same directory as the .js, and even if it is then the .map file must know the relative location of the source files (which will probably not be deployed to the same directory as the .js).

By not appending it, each project can make its own decision on where the map and sources are published.

nippur72 commented 9 years ago

Yes, we need a way to tell where the sources will be located, as this info needs to be encoded in the map file itself. Currently I'm assuming they are in "/scripts" (.js and .map) and "/sources" (.cs).

Some good news: mouse hovering over variables seems to work during debug, and there is also syntax highlight for C# source code. Have a look at this picture, see how the "variable declaration" spans up to next statement.

erik-kallen commented 9 years ago

@nippur72 I have now pushed a new version of the source-maps branch which supports sequence points for all statements. The only thing that will not (yet) work well is source locations inside state machines.

Do you know if there is support in the source map format for parts of the file that have no source location at all? We need this for:

The parts of the file between class methods (such as class registration), and
State machines, where there is a lot of generated code that does not have a source location at all.

nippur72 commented 9 years ago

I don't think there is a special sourcemaps format for "no-source" parts. I guess if there's a piece of javascript that has no source location to refer to, source maps for that piece should be simply omitted.

erik-kallen commented 9 years ago

Looking at the spec, I can't find any way of doing this as there is no member in a mapping for "end position in the generated script".

nippur72 commented 9 years ago

I tried to hack it (e.g. putting line out of bounds), but then realized the simplest way is to map compiled javascript onto itself. So if a C# statement has no source, that statement should generate a map entry where the source file is the output javascript, e.g.:

// sourceLine = scriptLine 
// sourceCol = scriptCol
// sourcePath = scriptPath
RecordLocation(scriptLine, scriptCol, scriptPath, scriptLine, scriptCol);

erik-kallen commented 9 years ago

That could work, but I wonder how the user experience will be. It might be OK for the infrastructure calls (eg. class hierarchy setup), but I don't think it will work well for state machines.

nippur72 commented 9 years ago

in the case of state machines, the users would get the javascript code, which is just what they get now without source maps, so nothing to worry about IMO.

Of course it will be weird to debug state machine code (as it is now), but there are no great alternatives. One is to have an additional source file like statemachine.txt made only of labels (e.g. "state machine initializing" etc.) to point to. But it would still be weird, as there is no way to tell the debugger to "skip" debugging until a certain point.

erik-kallen commented 9 years ago

I wonder what happens if we say the source comes from a file with the name no-source-location. A problem with the "JS as source" option is that if the script is minified, we would point to the minified source.

nippur72 commented 9 years ago

mapping a file that doesn't physically exist results in debugger showing an empty source code page, so you don't see anything but the debugger still waits for f-keys and steps through statements (all this in Google Chrome).

As regards minification, yes that would be a problem, but IMO debugging and minification are mutually exclusive in practice: if I have to debug, I turn off minification; if I have to deploy, I turn off debugging and minify files.

erik-kallen commented 9 years ago

Sorry, I didn't see your statemachine.txt suggestion. That might actually be a nice idea. Or create the no-source-location file with the content No Source Location and point to that. IDK, let's do some tests and evaluate the experience (although it might take me a while to get sequence point generation right in state machines, that code is quite complex).

erik-kallen commented 9 years ago

@nippur72 I have now implemented support for sequence points everywhere I can think of. For the "no source location" issues, I chose the no-source-location.txt file approach because that was the simplest. Please feel free to try it out.

Now it is time to think about how to complete this feature. I am leaning towards something like:

1. The compiler generates the .map file. All source locations in this file are relative to the project (.csproj) root. For linked files we need to emit the aliased location (as it appears in the project) rather than the physical location.
2. The .targets file contains a step that packages the .map and all required source files into a .zip archive (again using aliases for linked files).
3. Optionally, we can have an MSBuild parameter that controls whether to include the //# sourceMappingURL comment and, if so, the relative path to specify in that directive.

What do you think?

nippur72 commented 9 years ago

Unfortunately I don't know much about MSBuild (I've always used it "as is"), so it's not clear to me how we can control it, and how the user can specify the options from the IDE. Anyway, I was thinking about three compiler switches like these:

--generatesourcemaps to turn the option on/off
--sourcemappingurl= to specify the location of the generated .map file
--sourcesroot= to specify the root location of the .cs sources

As regards 2. (.targets), does that mean that the .map file will be generated ONLY in the .zip file?

erik-kallen commented 9 years ago

For controlling whether to generate the source mapping URL, I think we can use the --debug switch (which maps to the DebugType task parameter, which in turn maps to the Debug Info project setting). Possible values are full/pdbonly/none, and I'm thinking

full = generate non-minified script and .map
pdbonly = generate minified script and .map
none = minified script, no .map.
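The proposed mapping from the debug setting to compiler behavior could be expressed roughly as below. This is only a sketch of the table above: ScriptOutputOptions and FromDebugType are hypothetical names, and treating a missing value as "none" is an assumption.

```csharp
using System;

// Hypothetical options object derived from the --debug / DebugType value.
public sealed class ScriptOutputOptions {
    public bool Minify { get; }
    public bool GenerateSourceMap { get; }

    public ScriptOutputOptions(bool minify, bool generateSourceMap) {
        Minify = minify;
        GenerateSourceMap = generateSourceMap;
    }

    public static ScriptOutputOptions FromDebugType(string debugType) {
        // Assumption: a missing value behaves like "none".
        switch ((debugType ?? "none").ToLowerInvariant()) {
            case "full":
                return new ScriptOutputOptions(false, true);  // non-minified script and .map
            case "pdbonly":
                return new ScriptOutputOptions(true, true);   // minified script and .map
            case "none":
                return new ScriptOutputOptions(true, false);  // minified script, no .map
            default:
                throw new ArgumentException("Unknown debug type: " + debugType, nameof(debugType));
        }
    }
}
```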

--sourcesroot will be necessary for the .exe (but can default to the working directory), for IDE we can use the .csproj directory.

The compiler will still generate the .map, so it will be somewhere. The only question is whether it should be copied from obj\ to bin\, and I think we might as well do that.

nippur72 commented 9 years ago

OK for --debug and --sourcesroot as switches. I think the other switch, --sourcemappingurl, is still needed for the rare case where you want to deploy .js and .map separately, but that's just an edge case and we don't need to bother with it now.

Copying the .map from obj\ to bin\ simplifies the post-build command on the IDE, but that's just a minor issue.

erik-kallen commented 9 years ago

I just pushed a version which generates a .map file, and also a source archive (.map.zip). Please feel free to check it out and comment. (currently only works when using MSBuild, the executable is not yet fixed).

nippur72 commented 9 years ago

I tried it, but now it's difficult to test it comfortably: there is no longer the //# sourceMappingURL= reference in the .js file, so you have to add it every time you recompile.

Also, there is a need for sourcesRoot, because the .map file points to the website root, where it expects to find the .csproj directory structure.

Regarding the debugging experience, I noticed there are some quirks, but I did not spend much time investigating them (I'll wait for a more testable version).

Please note: you have to source-map the ; token too, otherwise the Visual Studio debug highlight will span up to the start of the next statement. Browser-based debuggers don't have this issue because they are line-based (that is, they highlight the whole line).

erik-kallen commented 9 years ago

Adding these lines to the default property group in the .csproj file will achieve what you want:

<AddSourceMapDirective>true</AddSourceMapDirective>
<SourceMapSourceRoot>source</SourceMapSourceRoot>

What do you mean by source-map the ;? How is this supposed to be done? Does the JS ; need to be mapped to the C# one (that's going to be hard), or is it possible to do something cheaper? Many JS ; do not have a C# counterpart, as statements do not have a 1:1 correlation between input and output.

Can we add a bunch of mappings to the end of the .map file that map the last character of the JS to all ;s in the C#?

Do end braces have the same problem? What about the empty space between an if statement and its opening brace?

nippur72 commented 9 years ago

Ok, I investigated the ; and no-source-location issues:

It seems that source maps v3 (the version we are using) already supports the no-source feature, that is, when you have something in your JavaScript that has no counterpart in any of your source files.

This is achieved by just skipping the source reference in the base64 encoding stuff; you simply create a map entry with the compiled-file coordinates only (line/column). In our Saltarelle source map implementation, this is achieved by passing null as sourceLocation. So instead of

_sourceMapBuilder.AddMapping(scriptLine - 1, scriptCol - 1, sourceLocation);

it's enough to:

_sourceMapBuilder.AddMapping(scriptLine - 1, scriptCol - 1, null);

But there is a big BUT: apparently the Visual Studio debugger doesn't support this (2013 Update 2). It refuses to load the .map file entirely. I suspect when it finds the null reference it wrongly infers that the .map file is invalid and thus skips it. Chrome, Mozilla and Explorer handle it correctly instead.

So until that bug is fixed (assuming it's a bug--I'm just guessing) we should continue to use no-source-location workaround you have already written.

Regarding the ; problem, yes, it could be a solution to map the last char of the js file to all C# semicolons, commas, and other characters. It doesn't matter that it's a dummy location, as it's never shown by the debugger. I only fear the .map file will become huge.

erik-kallen commented 9 years ago

I don't think we'd need to map commas. Only semicolons, and probably end braces.

Though I'm not sure whether VS's behavior in this case is standard or just a quirk of its implementation.

nippur72 commented 9 years ago

Regarding commas, I haven't tried it, but consider the following C# code

a = b++, 33;

when the debugger breaks on b++ it will highlight after the comma (if it was a string it would be "b++, ").

Anyway, I suggest we don't bother much with this issue at this stage. Let's give ourselves time to find an acceptable workaround.

BTW, Internet Explorer is the only browser (of the triad) that has a full debugger (not line-only, I mean). It behaves much like VS, being able to step over different points on the same line. Nice.

erik-kallen commented 9 years ago

That is not a problem since we only map on the statement level, so it will highlight the entire line.

Saltarelle / SaltarelleCompiler

Generation of source maps #297