Option to preserve comments

arv commented 11 years ago

Original author: arv@chromium.org (February 19, 2013 23:16:07)

To make traceur useful for scripts that uses Closure Compiler JSDoc type annotations it seems important to allow us to keep comments and even generate comments.

Original issue: http://code.google.com/p/traceur-compiler/issues/detail?id=203

arv commented 11 years ago

From usrbi...@yahoo.com on March 21, 2013 20:58:17 Continuing the conversation from

https://codereview.appspot.com/7487046/

Peter was adding TypeScript type annotation and he was going to add a static type check pass. Adding runtime asserts is another option. I also think those are valuable to be able to turn on for debug mode.

If the JSDoc annotations were preserved in the traceur output, would Closure Compiler give us a typechecker for free?

arv commented 11 years ago

From arv@chromium.org on March 21, 2013 21:11:54 Not for free. We would have to add some annotations because we don't put @constructor nor @extends in comments since those are all in the source now.

arv commented 11 years ago

From peter...@google.com on March 21, 2013 21:19:00 There are a lot of lessons that can be learned from the JsCompiler type system. Check out the TypeScript type system by comparison. Emitting JsDoc type annotations is not a great long term strategy.

arv commented 11 years ago

From usrbi...@yahoo.com on March 22, 2013 21:04:52 I looked over TypeScript briefly (the repo, the website, the community), and I have to say, it was very educational. Microsoft definitely has a certain flavor, and it comes through strongly in everything they do.

In a hurry, so I'll touch on that later.

I didn't have a chance yet to test out the typechecking of TypeScript, but just looking over the syntax, it's more expressive than Closure Compiler's type declaration syntax (you can declare functions taking function parameters -- that in turn take certain parameters).

I agree that magic comments are not very satisfying to use, but in terms of functionality, it seems like they should theoretically be equivalent.

What's your take on the type-checking ability of Closure compiler vs TypeScript? I assume it's more a function of specific implementations as opposed to the choice of using JSDoc vs syntax?

I notice that TypeScript also supports JSDoc. I didn't look too closely into how that interacts with their native type annotations.

I do like the 'var x: type' style of type declaration; it's a lot less fiddly to parse. I kind of wish C used that style.

samcday commented 10 years ago

@arv I'm keen on getting comments attached to parse tree. Looks like comments are being handled by the parser, but just not attached. Do you have any branches lying around where this was pursued further?

Would you be interested in me implementing this and putting up a PR? If so - anything I should be aware of?

arv commented 10 years ago

@samcday I would love to make this happen and all help is appreciated. If you want to drive this I would be very happy too.

I've tried a few different solutions for this but I don't want to taint you yet.

Would you mind outlining what approach you were considering?

samcday commented 10 years ago

@arv I took a look at this whilst stuck on an international flight. With no internet I sketched out an algorithm to attach the comment nodes to an already built tree. This seemed to me to be the most simplistic initial stab. I took a look at how esprima attaches comment nodes in its parse tree for some inspiration.

My initial thoughts looked like this:

Collect all comments from handleComment callback into a stack.
Do a depth-first walk of the AST, when a node is encountered with start location offset that is greater than the offset of the comment at the head of the stack, attach the comment as a "leading comment" to that node, and then push that comment onto the head of a second stack.
- After initial walk, do another depth-first walk, but this time, check if the current node end location offset is greater than head of the second comment stack. If it is, attach as a "trailing comment" on the current node, and unshift that comment off the second stack.

With this approach, there was some edge-cases to deal with, but as I was investigating them, I actually stumbled upon this code in estraverse. I realised looks very similar to the approach I was taking :).

The problem I can see with this approach is that it's pretty inefficient, requiring us to walk through the AST twice in order to properly attach comments. When peeking at esprima's internals, it looks like they take a pretty different approach, attaching comments as they go. My understand of their approach is a little fuzzy right now, but I think it goes something like this:

As comments are handled, accumulate them in an array.
Each time a node is created, check the ranges of the current node and attach any relevant leading comments (based on range-based location information, which is tracked internally even if you don't ask to have it attached to the final AST).
Check the last created node to see if any of the comments accumulated since creating that node should be attached as trailing comments.

This is much more efficient of course, because the comments are attached on the fly. From an implementation standpoint though, esprima is in a much better position to do this, as all created nodes go through a WrappingNode constructor, so there's a single point that this code can be implemented. As the current Traceur Parser stands today, concrete nodes from ParseTrees are created all over the place, so we'd need to update every single construction. I think this shouldn't be too bad - the way I'd see it working is wrap every newly constructed tree node in the Parser with a method that does any additional post-processing needed. For now it would just be comments, but perhaps it might be useful for other stuff later too.

I've actually already incorporated a ghetto-version of approach #1 into my own project to get some stuff off the ground, but I'd still love to help out and get this properly integrated into Traceur core though. Let me know what you think about what I've come up with so far, and how you think I should proceed.

arv commented 10 years ago

I prefer the second approach too. Performance matters.

Conceptually, AST nodes are supposed to be read only. At one point we even froze the objects to enforce this. However, I think we can make an exception for adding comments. As long as we do not mutate the AST nodes during transformation I think we are fine.

Re additional post-processing: I think we could move the this.getTreeLocation_(start) cruft we are currently doing too (in a follow up of course)

samcday commented 10 years ago

Okay great! I'll start working on option #2 and get a PR up in next few days.

heydenberk commented 10 years ago

Has there been any progress on this? I'd love to be able to use run code through traceur compiler before it runs through the closure compiler, but need to be able to preserve the annotations. I started hacking on it a little bit but I'm not sure I understand the intended implementation well enough to issue a PR with any likelihood of being merged.

samcday commented 10 years ago

I haven't had a chance to get onto this unfortunately. It's still pretty high on my priority list though, so if nothing has happened within the next month I'll most likely be rolling up my sleeves.

a-bentofreire commented 10 years ago

I'm also very interested about the progress in the topic. I use closure compiler, and the removal of comments from traceur prevents me from using it. The comments aren't just important for closure compiler, but also to other tools that I use in the pipeline.

arv commented 10 years ago

@samcday I thought of a possible way to do this...

https://github.com/arv/traceur-compiler/compare/tree-find-comments-in-source

The idea is that when the writer writes a tree, it checks if it has a location. Since the location contains a reference to the original source we can therefore look to see what comes before the tree node and check if it is a comment.

This way allows us to print comments from the original source.

I'm not sure this is an acceptable hack or just a hack that should be thrown away.

a-bentofreire commented 10 years ago

At the current moment, I'm on vacations, but within 2 weeks I can run some tests, if that helps.

The issue for me is fundamental, since it's the only reason that prevents me from using traceur.

samcday commented 10 years ago

Hey guys. I'll have a stab at it this weekend.

@arv that approach is neat 'cos it's pretty simple, but it does look kinda hacky. I'm gonna try an implementation based on the design I detailed above (which was a 3 months ago now, my how time flies!).

jonrimmer commented 10 years ago

+1 This is also causing problems between Traceur and Angular's ng-annotate tool, as the latter uses annotations in comments to explicitly trigger transformations.

mlegenhausen commented 10 years ago

+1 ng-annotate support would be nice

qraynaud commented 10 years ago

:+1:

I also need this for a Node.js module I wrote that optionally depends on annotations before function arguments like:

function(/* annotation*/ arg) {}

Sadly, the annotation method for my module is the cleaner of the two provided.

zdychacek commented 10 years ago

+1

NervosaX commented 10 years ago

+1 I have to disable traceur until the day that comments are preserved. It's just impossible to debug a complicated application afterwards if the line numbers don't match up.

mkatrenik commented 10 years ago

+1

tearwyx commented 10 years ago

+1

cfmano commented 10 years ago

+1

samcday commented 10 years ago

So I did take a look at this a few weeks ago. I haven't had a chance to take it over the line yet though.

One thing I did realise though is that there's possibly two separate concerns here. First one is making sure that comments are attached to the ES6 AST when it is being parsed. After that though, special care will probably need to be taken to ensure that those comments are preserved when transforming the tree into an ES5 equivalent.

For example, if I put a comment right before a yield statement, where should we expect that comment to wind up in the ES5 equivalent? Right before the corresponding return statement I guess?

arv commented 10 years ago

I agree that a lot of transformers might need to be updated to keep the comments but we can deal with that later.

darlanalves commented 10 years ago

Just stumble upon this issue too. Traceur is in a pipe of several gulp plugins handling an entire project build. I can't annotate AngularJS code properly for injections. I'm also trying to automate the AngularJS dependency declarations from annotations, like @controller My/Controller, similiar of what Vojta Jina is doing here

So, +1²

mgol commented 10 years ago

@darlanalves In ng-annotate you can use the ngInject() syntax now which doesn't rely on comments so should be Traceur-safe.

darlanalves commented 10 years ago

Yeah, I know, It's working also if I write the functions separated from the declaration, but I'm going a step further. I want to turn this:

/**
 * @controller The/FooController
 * @ngInject
 */
function The_FooController($scope, $http) {
}

... into this:

function The_FooController($scope, $http) {
}
The_FooController.$inject = ['$scope', '$http'];
The_FooController.$providerType = 'controller';
The_FooController.$name = 'The/FooController';

and then pipe the files to a plugin that will run this script

darlanalves commented 10 years ago

I'm thinking of having my AngularJS with all the stuff I'm registering right now in the $injector inferred from annotations.

gregglind commented 9 years ago

+1 on this as well. Even an 80% solution that keeps comments before the right block most of the time would help.

(Use case: 'ignore' for istanbul and other code coverage tools.)

capaj commented 9 years ago

:+1: would help with those //ngInject comments

google / traceur-compiler

Option to preserve comments #206