questions about intended behaviour regarding comments and empty lines

jokeyrhyme commented 8 months ago

Thanks so much for sharing this project! <3

I am testing this commit: https://github.com/nushell/nufmt/commit/4d578493d937c0b49bfd7f1c29a903e642732ab3

Test cases:

nufmt --stdin "# hello\n\n\nlet foo = 'abc'\n\n\n# goodbye"

output:
# hello
let foo = 'abc'

# goodbye

nufmt --stdin "# hello\n\n\n# goodbye"

output:
# hello

# goodbye

I understand the experimental/early nature of this project, and I'm a huge fan of nu, so I'm curious about what idiomatic code looks like regarding comments and empty lines:

should nufmt preserve all empty lines, or is it idiomatic to have gaps no larger than a single empty line?
is it idiomatic to have "floating comments" that have one or more empty lines between them and the nearest code?
is it idiomatic to have multiple "floating comments" in a sequence, separated by one or more empty lines?
is it idiomatic for all comments to always be anchored to the next nearest code (e.g. no empty lines in between)?

Cheers! <3

fdncred commented 8 months ago

We've started this https://github.com/nushell/nufmt/blob/main/docs/specification.md and there's another one but I can't seem to find it. Ugh, my memory.

jokeyrhyme commented 8 months ago

@fdncred bah, how did I not see that, haha

Okay, so it doesn't look like "floating comments" (or whatever they ought to be called) are really a thing (yet?)

fdncred commented 8 months ago

I wish I could find the other .md file. IIRC, it was a bit more detailed. I'm not sure what "floating comments" are.

jokeyrhyme commented 8 months ago

I may have coined the term today, haha

What I mean is comments that aren't anchored to any valid non-comment syntax/statement/expression, they're just "floating"

Not usage / --help information, and not an explainer on any specific line of code, they're just part of the prose of the code

Maybe I'm strange and I'm the only one who comments like this: https://gitlab.com/jokeyrhyme/dotfiles/-/blob/6293bf4ea581f81f0d61ef368869b5989dd279eb/bin/xdg-terminal-exec

I suppose in rustdoc terminology ( https://doc.rust-lang.org/rustdoc/index.html#outer-and-inner-documentation ), it seems like "outer documentation" is what we already support here, e.g. "this comment is about the item that is immediately after"

Whereas, my (silly?) "floating comment" is more like the "inner documentation" (without necessarily being consumer-facing documentation), e.g. "this comment is about the item that encloses this comment"

amtoine commented 8 months ago

imo, something like

nufmt --stdin "# hello\n\n\n# goodbye"

should give

# hello

# goodbye

i.e. a single empty line in between
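
To make that rule concrete, here is a minimal Python sketch of "collapse every run of empty lines into one". It is only an illustration of the suggested behaviour, not nufmt's actual Rust code, and the function name `collapse_blank_lines` is hypothetical.

```python
def collapse_blank_lines(source: str) -> str:
    """Squeeze every run of blank lines down to a single blank line."""
    kept = []
    previous_blank = False
    for line in source.split("\n"):
        is_blank = line.strip() == ""
        # Skip a blank line when the line kept just before it was blank too.
        if is_blank and previous_blank:
            continue
        kept.append(line)
        previous_blank = is_blank
    return "\n".join(kept)


print(collapse_blank_lines("# hello\n\n\n# goodbye"))
```

With the input from the test case, the two empty lines between the comments become one, which matches the output proposed above.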

AucaCoyan commented 8 months ago

Whoa, there is a lot to unpack here, let me go step by step. I hope I can answer everything!

First and foremost, a disclaimer: yesterday I found a very interesting article about how formatters are made, and the corresponding Hacker News thread is full of useful comments too. This gave me a lot to think about, and I realized I need to restructure a lot of the code that I committed yesterday 🏖️

Second, let me explain how nufmt currently handles comments. The code is read from a file and then sent to the parser. Like almost any other parser, it strips away every comment. So it gives me back a list of tokens and structures like "this is an expression", "this expression has a closure", "the closure has def keyword in it followed by a string", "next there is an argument", "yet another argument" and so on. It doesn't give me any comments, but it does give me the start and end position of every token. Because I have the source file, I can scan it and check whether there are any # characters between the end of one token and the start of the next. This line here

That said, the parser only "cares" about the span from the first token to the last token of a block (more on this block later), so what happens to the comments before the code, and after the final "logic" line of the file? They are written out exactly as they were read from the file. So every license header at the top of a file, and any trailing comments explaining things, stay exactly as written.

With the comments in between code, I strip the whitespace, so if you write

# initial comment that is not read by the parser, will be just here forever

let my_var = 1;
# this should work 👇

def "my special function" [] {}

# final comment that also should not disappear

the empty lines between # this should work and the function definition should disappear. They aren't doing anything useful as blank space, right?
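
The approach described above can be sketched roughly as follows. This is a simplified Python illustration, not nufmt's actual Rust implementation: `span_of` is a hypothetical stand-in for the parser (it just locates a known piece of text), each span is assumed to be a whole statement on its own line, and `spans` is assumed to be non-empty and sorted.

```python
def span_of(source, text):
    """Stand-in for the parser: locate `text` and return its (start, end) offsets."""
    start = source.index(text)
    return (start, start + len(text))


def format_sketch(source, spans):
    """Rebuild `source`, keeping comments found between spans (assumes non-empty, sorted spans)."""
    first_start = spans[0][0]
    last_end = spans[-1][1]
    pieces = [source[first_start:spans[0][1]]]
    for (_, prev_end), (next_start, next_end) in zip(spans, spans[1:]):
        # The parser dropped the comments, so look for them in the raw gap.
        gap = source[prev_end:next_start]
        for line in gap.split("\n"):
            stripped = line.strip()
            if stripped.startswith("#"):
                pieces.append(stripped)
        pieces.append(source[next_start:next_end])
    # Text before the first span and after the last span is copied verbatim.
    return source[:first_start] + "\n".join(pieces) + source[last_end:]


source = (
    "# initial comment\n"
    "\n"
    "let my_var = 1;\n"
    "# this should work\n"
    "\n"
    "def greet [] {}\n"
    "\n"
    "# final comment"
)
spans = [span_of(source, "let my_var = 1;"), span_of(source, "def greet [] {}")]
print(format_sketch(source, spans))
```

In this sketch the empty line after `# this should work` is dropped, while the leading and trailing comments, including their surrounding empty lines, come through unchanged because they sit outside the first and last span.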

Now comes the caveat: the samples given by jokey have a special characteristic:

# hello

# goodbye

They don't have any parser-relevant logic. What do we do with files like that? The parser responds: "there is absolutely nothing in this file". That's why I simply return the same file. Writing a nu file that contains only comments is a real edge case: it's never going to run anything.

Fun story: before implementing this logic, when I formatted a file that contained only comments, the formatter returned an empty file. So it deleted all the contents 🤣
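
A tiny Python sketch of why a comment-only file needs that special case. Both function names are hypothetical and the "parser result" is just an empty list of spans, so this only illustrates the failure mode, not nufmt's real code.

```python
def rebuild_from_tokens(source, spans):
    """Naive formatter: emit only what the parser reported, one span per line."""
    return "\n".join(source[start:end] for start, end in spans)


def format_with_guard(source, spans):
    """Return the source untouched when the parser reported nothing at all."""
    if not spans:
        return source
    return rebuild_from_tokens(source, spans)


comment_only = "# hello\n\n# goodbye"
no_spans = []  # what a comment-stripping parser would report for this file

print(repr(rebuild_from_tokens(comment_only, no_spans)))  # prints ''
print(repr(format_with_guard(comment_only, no_spans)))
```

Without the guard, everything the parser ignored is lost, which is the "formatter returned an empty file" behaviour from the story.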

So, going back to the questions:

should nufmt preserve all empty lines, or is it idiomatic to have gaps no larger than a single empty line?

is it idiomatic to have multiple "floating comments" in a sequence, separated by one or more empty lines?

is it idiomatic for all comments to always be anchored to the next nearest code (e.g. no empty lines in between)?

is it idiomatic to have "floating comments" that have one or more empty lines between them and the nearest code?

Half of the answer is explained above. The other half: my personal preference is no, I don't like floating comments ("floating comments" are now a thing! 👏🏼). But we can write it down as an objective and set sail to satisfy it in the future.

To be honest, we don't yet have a thoroughly detailed specification other than the one fdncred linked earlier.

As always, you can tag me for any question! I really like this project ❤️

jokeyrhyme commented 8 months ago

I often have links to specifications in my code, which aren't desirable to show up in usage information (which is what nushell does with anchored comments), but they are useful when the developer is reading the code

But, it's hardly a deal breaker to just expose specifications and other related URLs to users

questions about intended behaviour regarding comments and empty lines #55