Compile issues when commenting out blocks of code (within object literals)

gkz / LiveScript

LiveScript is a language which compiles to JavaScript. It has a straightforward mapping to JavaScript and allows you to write expressive code devoid of repetitive boilerplate. While LiveScript adds many features to assist in functional style programming, it also has many improvements for object oriented and imperative programming.

http://livescript.net

MIT License


Issue #1101

Closed: LeXofLeviafan closed this issue 4 years ago

LeXofLeviafan commented 4 years ago

Suppose we have an object literal:

x =
  foo: 1
  bar: 2
  baz: 3

Now, what would I normally do to temporarily remove some fields from the code? In theory, commenting it out should do it, whether the commented out lines are removed or considered empty. Let's check if replacing the field with an empty line works:

x =
  foo: 1

  baz: 3

var x;
x = {
  foo: 1,
  baz: 3
};

Seems about right (it even works if the line contains some whitespace)… Let's try commenting it out using a single-line comment:

x =
  foo: 1
#  bar: 2
  baz: 3

var x;
x = {
  foo: 1,
  baz: 3
};

Works as expected. But what do I do when the field(s) span multiple lines? Sounds like a job for the multiline comment:

x =
  foo: 1
/*
  bar: 2
*/
  baz: 3

…Except this produces an error: Parse error on line 3: Unexpected 'INDENT'.

Using it in a single line produces the same error:

x =
  foo: 1
/*  bar: 2*/
  baz: 3
# => Parse error on line 3: Unexpected 'INDENT'

Removing the newline after the comment results in a separate object value, of all things:

x =
  foo: 1
/*
  bar: 2
*/  baz: 3

var x;
x = {
  foo: 1
};
({
  /*  bar: 2
  */
  baz: 3
});

Finally, removing the newline before the comment gives us the result that we wanted:

x =
  foo: 1/*
  bar: 2
*/
  baz: 3

var x;
x = {
  foo: 1,
  baz: 3
};

…Oh, and indenting the comment does actually work as well:

x =
  foo: 1
  /*
  bar: 2
*/
  baz: 3

var x;
x = {
  foo: 1
  /*
  bar: 2
  */,
  baz: 3
};

…Naturally, the expected behaviour for comments in all these cases would be to not affect the code around them (beyond being considered a wordbreak, when applicable).

vendethiel commented 4 years ago

Because these comments are part of the AST and are parsed with the same mechanism.

rhendric commented 4 years ago

To expand on what @vendethiel said: multiline comments, as noted in the documentation, differ semantically from single line comments. Single line comments are whitespace, and are parsed as such. Multiline comments appear in the output, and thus are treated by the parser as part of your code. To truly comment out a number of lines of LiveScript code, I recommend prefixing them all with #, and reserving the multiline comment for things like documentation strings and license annotations.
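The distinction can be illustrated with a minimal sketch. This is a toy layout pass written for this thread, not LiveScript's actual lexer: the function name `layout_tokens` and the token kinds are hypothetical, and the only assumption it encodes is the one stated above, namely that `#` comments are discarded like blank lines while `/* */` comments are kept as tokens and so take part in indentation tracking.

```python
import re

def layout_tokens(source):
    """Toy layout pass for an indentation-based language (illustration only).

    Assumption: '#' line comments are dropped like blank lines, while
    '/* ... */' block comments survive as COMMENT tokens.
    """
    # Collapse each block comment to a one-line placeholder that sits at the
    # column where the comment started.
    source = re.sub(r"/\*.*?\*/", "/*...*/", source, flags=re.S)
    tokens, indents = [], [0]
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # whitespace and line comments never reach the parser
        col = len(line) - len(line.lstrip())
        if col > indents[-1]:
            indents.append(col)
            tokens.append(("INDENT", ""))
        while col < indents[-1]:
            indents.pop()
            tokens.append(("DEDENT", ""))
        kind = "COMMENT" if stripped.startswith("/*") else "CODE"
        tokens.append((kind, stripped))
    return tokens

LINE_COMMENTED = "x =\n  foo: 1\n#  bar: 2\n  baz: 3\n"
BLOCK_AT_COLUMN_0 = "x =\n  foo: 1\n/*\n  bar: 2\n*/\n  baz: 3\n"
BLOCK_INDENTED = "x =\n  foo: 1\n  /*\n  bar: 2\n*/\n  baz: 3\n"

for name, src in [("line comment", LINE_COMMENTED),
                  ("block comment at column 0", BLOCK_AT_COLUMN_0),
                  ("block comment indented", BLOCK_INDENTED)]:
    print(name, [kind for kind, _ in layout_tokens(src)])
```

In this toy model the column-0 block comment closes the indented block and the following field opens a new one, which is the shape of the `Unexpected 'INDENT'` error reported above; the line comment and the indented block comment leave the layout undisturbed.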

Does that make your examples make more sense? I think there's no bug here, though I'm open to being convinced otherwise.

LeXofLeviafan commented 4 years ago

The whole idea of a comment is that it's a piece of text that doesn't affect the code around it any more than a regular whitespace… And there certainly shouldn't be any difference between how single-line and multiline comments are treated by the compiler – they only differ in how the borders are defined. At worst, the newlines within the multiline comment would count, but treating a comment as syntax? That contradicts the concept itself.

To truly comment out a number of lines of LiveScript code, I recommend prefixing them all with #

Well, this is literally what the purpose of a multiline comment is supposed to be – to make the compiler ignore a block of text, instead of having to mark every single line in it separately as a comment.

rhendric commented 4 years ago

And there certainly shouldn't be any difference between how single-line and multiline comments are treated by the compiler – they only differ in how the borders are defined.

Ah, but that is not the relevant difference. Because there exists tooling that consumes comments in JavaScript for various purposes (generating API documentation, typing, collecting license information, and so on), it follows that tools that generate JavaScript, even if only for machine consumption, may need to generate comments as well. This creates a conceptual separation between ‘pure’ comments in the input text, which do not affect compilation in any way, and some other mechanism that generates comments in the compiled output, which is obviously an instance of affecting the compilation. You could argue, based on the principle that a comment must be ignored by the compiler, that this other mechanism therefore doesn't qualify as a ‘comment’, in which case fine, let's call them ‘comment generators’. So, if we adopt those terms for this thread: LiveScript has single line comments, but it doesn't have delimited comments. It does have delimited comment generators, but not single line comment generators. The delimited comment generators are those things delimited by /* */ in LiveScript code; the documentation calls them multiline comments, but since they aren't ignored by the compiler, it's better to think of them as code elements with a homoiconic syntax. If you're passingly familiar with the React ecosystem, maybe a comparison with JSX is helpful: with JSX (LiveScript), you can write HTML tags (JavaScript block comments) in your code—of course they aren't real HTML tags (JavaScript block comments), because your JSX (LiveScript) file isn't an HTML (JavaScript) file, but the syntax is designed to resemble what the result will be.

Perhaps LiveScript needs a third syntax for true multiline comments, in addition to delimited comment generators? I tend to think not. Languages like Python and Bash get by with only line comments, and any decent code editor has a shortcut for commenting out a block of code by prefixing each line with a comment symbol. Furthermore, line comments are frequently cited as superior for temporarily disabling chunks of code because ‘nesting’ blocks of line-commented code is trivially supported, whereas many languages that support /* */ syntax (including both JavaScript and LiveScript) don't support nesting those delimiters. So I don't think LiveScript is missing much by omitting a multiline comment syntax.
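The nesting point can be shown with a small sketch. The helpers `comment_out`, `uncomment` and `strip_block_comments` are hypothetical names made up for this illustration; they model what an editor shortcut does and how non-nesting `/* */` delimiters behave, not any particular tool.

```python
import re

def comment_out(lines, marker="#"):
    """Prefix every line with the comment marker, as an editor shortcut would."""
    return [marker + line for line in lines]

def uncomment(lines, marker="#"):
    """Remove one level of the marker from each line that carries it."""
    return [line[len(marker):] if line.startswith(marker) else line
            for line in lines]

def strip_block_comments(text):
    """Remove /* ... */ comments the non-nesting way: the first */ closes."""
    return re.sub(r"/\*.*?\*/", "", text, flags=re.S)

# A block that already contains a line-commented field.
block = ["  foo: 1", "#  bar: 2", "  baz: 3"]
disabled = comment_out(block)   # the inner comment simply gains a second '#'
restored = uncomment(disabled)  # one level removed, inner comment intact
print(disabled)
print(restored == block)

# Wrapping an existing block comment in another one does not nest.
print(repr(strip_block_comments("/* outer /* inner */ still outer */")))
```

Line comments round-trip because each level is just one more prefix character, whereas the wrapped block comment leaves its tail behind as live text.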

vendethiel commented 4 years ago

Languages like Python and Bash get by with only line comments, and any decent code editor has a shortcut for commenting out a block of code by prefixing each line with a comment symbol.

Please no.

LeXofLeviafan commented 4 years ago

Isn't that just a wordplay around current technical implementation? "It's implemented like this, so let's call it an operator instead."

I don't really care what the technical story here is, but from the actual programming point of view there's no such thing as "comment generator" (as we write LiveScript code, not macros for JavaScript): there's code, and then there's comments. Comments are whitespace, and don't cause compile errors where whitespace wouldn't. The resulting transpiled code is irrelevant to semantics, as it's more or less an implementation detail. (Not to mention that there's at least one case where multiline comment does produce intended result, with the comment itself being omitted, so the whole thing with it being a special type of operation with specific result doesn't fly, especially as there's in fact several observed behaviours that have nothing to do with the stated purpose.)

rhendric commented 4 years ago

In Python, this is legal:

def my_fun():
   "just a docstring"
   pass

This is not:

def my_fun():
"just a docstring"
   pass

But docstrings are comments, no? They certainly aren't code. It's perfectly legal to do this:

def my_fun():
# just a comment
   pass

And the runtime behavior of the function is the same, as long as you aren't looking at the __doc__ property, which for most code is an unimportant implementation detail. Yet one is legal and the other isn't. Did Van Rossum screw this up?

Of course not. As @gkz points out here when justifying this design, as soon as you need to associate a comment-like piece of syntax with some part of the rest of your code, that syntax has to follow the rules of code layout. It doesn't matter if, from your point of view, you want to think of that syntax as a comment. What matters is the compiler isn't just ignoring it—in both Python and LiveScript, there is a documented feature stating that the compiler must not ignore it. And if the compiler isn't ignoring it, it has to be indented correctly.

Now, it's true that in Python the docstring just reuses the syntax for a string literal (which is certainly code), and in LiveScript, the multiline comment borrows the syntax from JavaScript's block comment (which is undeniably whitespace). But the surface syntax is irrelevant. The semantics are what matters. Both Python's docstrings and LiveScript's multiline comments may not have any semantic effect on the execution of the surrounding code in the typical case, but that isn't the same as having no semantics. The semantics of a language can include not only runtime behavior, but also information about the elements of a program that may be observable via reflection or via external tools. Python's docstrings and LiveScript's multiline comments don't have the same semantics as whitespace, even though the latter borrows the syntax of a whitespace construct from another language. That's not a technical implementation detail; that's just a logical consequence of the languages' specifications (such as they are).
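The three Python snippets above can be checked mechanically. The following is a minimal sketch; the helper name `check` is made up for this illustration, and it only reports whether each source compiles and what `__doc__` ends up holding.

```python
LEGAL = 'def my_fun():\n   "just a docstring"\n   pass\n'
ILLEGAL = 'def my_fun():\n"just a docstring"\n   pass\n'
COMMENT = 'def my_fun():\n# just a comment\n   pass\n'

def check(source):
    """Compile and run the source; report the docstring or the error type."""
    try:
        namespace = {}
        exec(compile(source, "<example>", "exec"), namespace)
    except SyntaxError as error:  # IndentationError is a subclass
        return ("error", type(error).__name__)
    return ("ok", namespace["my_fun"].__doc__)

for label, source in [("indented docstring", LEGAL),
                      ("unindented docstring", ILLEGAL),
                      ("unindented comment", COMMENT)]:
    print(label, check(source))
```

The unindented docstring is rejected on layout grounds, while the unindented comment is accepted and leaves `__doc__` empty, which is the asymmetry the argument relies on.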

As for the case where the multiline comment doesn't appear in the output, well, that was intended to be fixed at some point. It probably won't now, but that was the original intent.

LeXofLeviafan commented 4 years ago

The difference is that docstrings aren't comments; they very certainly are code statements that are simply treated as metadata when in particular AST locations (them not producing any side effects doesn't make them "not code"). A multiline comment is a comment, I'm not just "thinking" about it as a comment. If I wanted a multiline string, I'd use one – there's several options for that in LiveScript.

And multiline comment in LiveScript has no semantics of its own – it can be used as a medium to pass data to some tools working with JavaScript code, but in the context of the language itself it's a comment and nothing more.

Bottom line is, this discussion is jumping around the fact that the correct behaviour here, regardless of what you think about the potential role of multiline comments in JavaScript infrastructure, is to compile the code as it should be compiled, with comments not affecting the execution result whether or not they are inserted in those places within the transpiled code. It certainly is possible, and it certainly should not depend on the indentation level of the first line of the comment, because regardless of what you put within it, it remains a comment first and foremost.

rhendric commented 4 years ago

Your argument, as I understand it, is:

(1) ‘Multiline comments’ are comments;
(2) The indentation of comments should not affect whether code compiles;
(3) Therefore, the indentation of ‘multiline comments’ should not affect whether code compiles.

The reason this discussion is jumping around so much is that I am willing to grant you (1) or (2), but not both together, because which one I agree with depends on how the word ‘comment’ is defined. If we define ‘comment’ as ‘whatever appears under the Comments section in the language spec’, then (1) is a given, but (2) is something we can basically choose freely. If we define ‘comment’ as ‘any visible characters which have no effect on the execution of code’, then again I will grant you (1), but (2) does not follow, with Python docstrings as a counterexample.

If we define ‘comment’ as ‘visible characters which the language compiler ignores or treats as whitespace’, then I dispute (1), because the LiveScript compiler does not ignore multiline comments, and this is not a hidden implementation detail but a documented feature. If we define ‘comment’ as ‘visible characters which not only have no effect on the execution of code, but also don't affect the abstract meaning of surrounding code structures (which is roughly what I would call “semantics”)’, then it depends on what you consider ‘meaning’—if metadata counts, then (1) is false; if it does not, then Python docstrings meet the definition of a comment and (2) is false.

So please provide me with the definition of ‘comment’ that you would like to work with, or make an argument that doesn't hinge on whether a ‘multiline comment is a comment’.

LeXofLeviafan commented 4 years ago

This (2) isn't something to "choose freely", it's part of the comment definition. A comment is arbitrary text which doesn't affect execution result of the code, which can be placed arbitrarily and can contain whatever because it's a freaking comment. And if it doesn't apply to some comment because of how its processing is implemented internally in the compiler, that's not because it's "a thing that is very much like a comment but for some metaphysical reason should not be considered one", it's because the implementation doesn't process the comment correctly.

As I said before, Python docstrings aren't comments, they're regular statements that happen to be processed situationally by the interpreter (and even that is internal to the language itself); so any statement regarding them has nothing to do with comments in general. Same goes for repurposing comments for means like providing metadata – it's entirely external to the language itself and does not affect the definition of a comment (giving an external purpose to something doesn't change what it is).

LiveScript compiler does not ignore multiline comments, and this is not a hidden implementation detail but a documented feature

It doesn't ignore them, technically, but that is an implementation detail, because this fact doesn't make them "not comments"; and the only thing it actually does to them is paste verbatim into transpiled code, which can be considered a nice feature but isn't supposed to change how that code itself behaves (as they're, in fact, comments). So considering them "not comments" simply because they're comments that are passed along to JS version of the code doesn't make any sense – they're still comments in the end.

Your definition of metadata is weirdly based on semantics, but you're mixing up the context of the language itself and the external tools that can be used with it. Regardless of what extra information you intend (or don't intend) to pass along via comments, that doesn't suddenly make them "not comments" merely by virtue of being capable to pass it. And within the context of LiveScript itself, there's no concept of metadata in the first place – you only have code, which can be executed, and you have comments, which should not affect runtime in any way or form.

So please provide me with the definition of ‘comment’ that you would like to work with, or make an argument that doesn't hinge on whether a ‘multiline comment is a comment’.

And that's why I'm calling this 'wordplay': you're basing your counterarguments to "comment is a comment regardless of which syntax is used to mark it as a comment" on the idea that "if there's an external-to-the-language possibility to repurpose comments for something else, and if after converting the code to a different language the comment doesn't disappear, that comment somehow ceases being a comment within the current language".

Also, oddly enough, in the language docs, multiline comments are listed as comments. Go figure.

rhendric commented 4 years ago

This (2) isn't something to "choose freely", it's part of the comment definition. A comment is arbitrary text which doesn't affect execution result of the code, which can be placed arbitrarily and can contain whatever because it's a freaking comment. And if it doesn't apply to some comment because of how its processing is implemented internally in the compiler, that's not because it's "a thing that is very much like a comment but for some metaphysical reason should not be considered one", it's because the implementation doesn't process the comment correctly.

Then by that definition, LiveScript multiline comments are almost comments, but not quite, because they can't be placed completely arbitrarily.

Perl Pod comments (the weird =begin/=cut syntax) are also, in this light, not comments, because those directives have to appear at the very beginning of a line. Ruby has a very similar feature. CoffeeScript's block comments (delimited by ###) are also restricted in their placement: the ###s can't share a line with code, and like LiveScript, they have to be indented to the current indentation level of the surrounding code. All of these are called comments, but since they can't be placed completely arbitrarily, by your definition, they aren't comments.

And this is fine.

What I meant when I said that (2) is something to choose freely is that, had they really wanted to, Perl and Ruby probably could have made their Pod-style comments a little more flexible. It would probably have meant more complexity in their parsers, and maybe it would result in more edge cases to puzzle over, and for those reasons the language designers chose not to make those types of comments completely unrestricted in where they are placed in code. These languages all do slightly different things with their comment syntax, and that's okay. Every language designer gets to decide how their language works based on what they're trying to do with it, and different goals will result in different decisions. If you come in with the mindset that all comments have to be able to be placed at any column in code, then I guess none of these things are truly comments to you. They're language features that share many qualities with comments, but not that one. If that helps you feel more comfortable with the language, great. Or you can call them comments—the language docs do, after all—and get on board with the idea that not every comment syntax has to work exactly the same way in every language.

And that's why I'm calling this 'wordplay': you're basing your counterarguments to "comment is a comment regardless of which syntax is used to mark it as a comment" on the idea that "if there's an external-to-the-language possibility to repurpose comments for something else, and if after converting the code to a different language the comment doesn't disappear, that comment somehow ceases being a comment within the current language".

I'm sorry that argument was a distraction. I probably shouldn't have made it. I still think it's valid, but I'm happy to think of LiveScript as essentially a template language that generates JavaScript, and anything that affects the output is clearly code. After re-reading the thread, it's evident that you prefer to view LiveScript as a language that stands apart from its backend, which is also a fine perspective, and I should have spent more words comparing LiveScript's comments to comments in other languages that are also restricted in syntax, as I did above.

vendethiel commented 4 years ago

Perl Pod comments (the weird =begin/=cut syntax) are also, in this light, not comments

They're not comments at all. They're not meant to be. Perl doesn't claim to have multiline comments, just that POD blocks can be used as a quick-and-dirty hack.

rhendric commented 4 years ago

Ah, you're right. Ruby does call them comments, though.

vendethiel commented 4 years ago

Yes, Ruby is a naughty child that doesn't have POD, and just stole =begin =end to be its multiline comment syntax instead.

LeXofLeviafan commented 4 years ago

All of these are called comments, but since they can't be placed completely arbitrarily, by your definition, they aren't comments.

Nope. "By my definition", they are crappily implemented comments. You seem to be basing your definition of how the code is meant to work on how the current implementation happens to do things, rather than on its semantic specification, which is a very backwards approach at best. (How do you even define a 'bug' if current behaviour of the code itself apparently defines what it's supposed to do in the first place?…)

I'm happy to think of LiveScript as essentially a template language that generates JavaScript

Well, that's the thing though: LiveScript isn't a macro system, it's a language. It compiles to JavaScript because JavaScript is the runtime of webpages, but it's defined by what the code is supposed to do rather than what JavaScript it's supposed to emit. If you want JavaScript macros, there's Sweet.js for that.

rhendric commented 4 years ago

Nope. "By my definition", they are crappily implemented comments. You seem to be basing your definition of how the code is meant to work on how the current implementation happens to do things, rather than on its semantic specification, which is a very backwards approach at best.

‘Crappily’ according to what spec? Ruby's documentation calls those things comments, and says what the rules are. They are spec-compliant. If you're going to call spec-compliant things crappy, don't come in the very next sentence and assert that everyone should do things according to specs!

Like most hobby languages, LiveScript's ‘spec’ is pretty damn loose. We have a website which documents the language, but incompletely and imprecisely in places; we have the word of the language designer in several older issue threads; and we have the interpretations of all that case law of the current and former language maintainers, two of whom are here on this thread with you. That's our ‘spec’. I am genuinely sorry it isn't more formal. So how I define a bug is I see if an expected behavior contradicts any of those things—the written documentation, the word of the creator, or my current sense of what can be extrapolated from those things, interpreted conservatively (i.e., biased against change). In this case, all the documentation says is that multiline comments are ‘comments’, but from my experience with other programming languages, I know that just calling something a ‘comment’ doesn't absolutely imply all the things that you're asserting about comments. I have ~~an issue~~ (sorry, two issues) with the creator stating that he has looked upon this behavior and seen that it was good. And I have my own sense that things that end up in the AST after parsing are, generally speaking, less comment-y than things that don't. All that together tells me that this behavior is not a bug. Is this a ‘very backwards approach at best’? As someone who is currently keeping the lights on in someone else's house, what is it you think I should be doing differently?

LeXofLeviafan commented 4 years ago

The 'backwards' part is specifically defining the desired behaviour of code by what the current implementation happens to be doing at the moment, rather than by what it's supposed to do by sane application of logic. And I mean logic in context of the language itself, rather than its build artifacts.

…Look, I'm not asking for much really – just treat comments as comments, instead of contextually randomizing the processing. Like how CoffeeScript does it:

x =
  foo: 1
#  bar: 2
  baz: 3

x =
  foo: 1
  #bar: 2
  baz: 3

x =
  foo: 1
    #bar: 2
  baz: 3

x =
  foo: 1
###  bar: 2###
  baz: 3

x =
  foo: 1
  ###bar: 2###
  baz: 3

x =
  foo: 1
    ###bar: 2###
  baz: 3

x =
  foo: 1
  ###
  bar: 2
  ###
  baz: 3

x =
  foo: 1
###
  bar: 2
###
  baz: 3

x =
  foo: 1
    ###
  bar: 2
    ###
  baz: 3

var x;

x = {
  foo: 1,
  //  bar: 2
  baz: 3
};

x = {
  foo: 1,
  //bar: 2
  baz: 3
};

x = {
  foo: 1,
  //bar: 2
  baz: 3
};

x = {
  foo: 1,
  /*  bar: 2*/
  baz: 3
};

x = {
  foo: 1,
  /*bar: 2*/
  baz: 3
};

x = {
  foo: 1,
  /*bar: 2*/
  baz: 3
};

x = {
  foo: 1,
  /*
  bar: 2
  */
  baz: 3
};

x = {
  foo: 1,
  /*
    bar: 2
  */
  baz: 3
};

x = {
  foo: 1,
  /*
  bar: 2
    */
  baz: 3
};

See? It's not that hard, and the user doesn't have to struggle with random quirks of the compiler to do a simple thing like comment out a few lines, just because the comments happen to end up in the output as well. Because, from the point of view of someone who actually uses the language for development (as opposed to playing around with transpilation), there's absolutely no reason for a comment to cause a compile error just because it wasn't indented in some very specific way.

And if you're really adamant on keeping the blatantly wrong behaviour of producing compilation errors depending on indentation levels of what should be considered an empty line, at least change the docs to reflect that (e.g. "Multiline comments are preserved in the output but only work if they're indented as if they were code because that's how they're implemented.").

rhendric commented 4 years ago

‘Not that hard’, hah... CoffeeScript screws up the thing LiveScript does at least somewhat well here.

I've offered several justifications for the current design, in several different ways, attempting to adapt to your terminology and your approach to PL design philosophy, and the fact that you keep throwing around phrases like ‘absolutely no reason’ and ‘blatantly wrong’ tells me that the time I'm taking with these responses is being wasted; you aren't looking for an opportunity to learn and understand from other points of view than your own. I'm sorry we couldn't have had a more productive conversation, and I hope you find a way to comment your code that works for you.

I will consider changing the documentation to reflect what I've learned from you; it may become something along the lines of:

Comments

LiveScript has single line comments that start off with a #. Comments are ignored by the parser and are not passed through to the compiled output.

# single line comment

Annotations

Annotation syntax can be used to insert JavaScript block comments in the compiled output for documentation, type annotations, license information, or anything else that needs to appear. Annotations are not whitespace; like anything else that gets parsed, they should be indented with the surrounding code if they appear on their own line.

/**
 * @param a Number
 * @returns Number
 */
x = (a) -> a + 1

vendethiel commented 4 years ago

See? It's not that hard

You clearly haven’t read the code or followed the amount of work this has been.

LeXofLeviafan commented 4 years ago

I understood your justification the first time around; only, it's based on backwards logic as far as actual coding is concerned ("design from implementation" is exactly how language philosophy shouldn't be done). The whole idea that "X shouldn't be considered X because its implementation can't handle trivial cases" is nowhere near being a solid base for any claim of it being correct behaviour… which is what I've been trying to convey for a while now, but apparently had not succeeded whatsoever.

…And of course, dismissing "they do this correctly" by saying "well they have a bug someplace else" and ignoring the core of the argument is a great example of 'learning and understanding new points of view'. (I admit that my wording isn't all that great here… Should've said "not impossible" rather than "not that hard", working late at night doesn't really go well with doing rhetorics; though of course I hadn't expected this to be the point that would be nitpicked as me being dismissive to complexity of the parser as a whole.) I'm not claiming to be perfect, but in the end, neither of us changed their perspective on the topic, so trying to show a moral higher ground by pointing fingers comes off a little crass, non?