eclipse-archived / ceylon.formatter

A formatter for the Ceylon programming language, written in Ceylon.
Apache License 2.0

Multi-line strings #12

Closed: lucaswerkmeister closed this issue 10 years ago

lucaswerkmeister commented 10 years ago

We’ll need to handle multi-line strings specially, since we might be moving their beginning around.

lucaswerkmeister commented 10 years ago

How do we align the subsequent lines of the multi-line string? I read through ceylon/ceylon-spec#577, and I think the safest strategy would be:

<tb><tb><tb>print("line 1
<tb><tb><tb>.......line 2");

(where the periods represent spaces and <tb> a tab). That is, use tabs only for indentation up to the start of the first line’s code, and then spaces for alignment with the start of the multi-line string’s content. This should work for any tab width, from the viewpoint of both the user and the compiler.

If we do it this way, I think we don’t even need ceylon/ceylon-spec#866.
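The tabs-then-spaces strategy can be sketched as follows. This is a minimal Python illustration, not the formatter's Ceylon code; the function name `continuation_prefix` and its parameters are hypothetical, and it assumes the code in front of the string contains no tabs and no earlier quote character.

```python
def continuation_prefix(indent_level, first_line_code):
    """Build the prefix for the second and later lines of a multi-line string.

    indent_level    -- number of tabs used to indent the first line (hypothetical input)
    first_line_code -- the first line's code without its indentation,
                       e.g. 'print("line 1'
    """
    quote_index = first_line_code.index('"')
    # Assumption: a string opens with either one quote or three (verbatim string).
    quote_length = 3 if first_line_code.startswith('"""', quote_index) else 1
    # Tabs reproduce the indentation; spaces cover the code and the opening
    # quote(s), so the result does not depend on the tab width.
    return "\t" * indent_level + " " * (quote_index + quote_length)


prefix = continuation_prefix(3, 'print("line 1')
print(repr(prefix))  # three tabs followed by seven spaces
```

Because only the indentation uses tabs, changing the tab width shifts both lines by the same amount and the alignment is preserved.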

lucaswerkmeister commented 10 years ago

Hm, the FormattingWriter doesn’t have any notion of “string literals” beyond the writeToken method, and not even necessarily there. But I still need to know how to align the following lines down the road: they aren’t aligned to the column where the token itself starts, because the opening quote(s) of the first line are excluded from that.

Hacky solution: When writing the text, jump over any quotes when determining where to pad.

Nice solution: Add that info to the token while we still have a chance of knowing what the token is (i.e. in writeToken). Advantage: Easily expandable if there should ever be other multi-line token types, general niceness. Disadvantage: We won’t be able to properly align non-AntlrTokens. (Of course, we can still implement the “hacky” solution if we get a String in writeToken.)
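A rough sketch of how the two approaches could coexist, in Python rather than the formatter's Ceylon. Everything here is hypothetical: the `QueuedToken` class, the `align_skip` field, the `write_token` signature, and the token type names are stand-ins chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueuedToken:
    text: str
    align_skip: int  # leading characters of the first line excluded from alignment


# Hypothetical token type names mapped to the length of their opening delimiter.
OPENING_LENGTH = {"STRING_LITERAL": 1, "VERBATIM_STRING": 3}


def count_leading_quotes(text):
    """The "hacky" approach: look at the text and jump over leading quotes."""
    return len(text) - len(text.lstrip('"'))


def write_token(text, token_type: Optional[str] = None):
    if token_type is not None:
        # "Nice" approach: the token type is known, so record the info now.
        skip = OPENING_LENGTH.get(token_type, 0)
    else:
        # Only a plain string was passed in: fall back to the hacky approach.
        skip = count_leading_quotes(text)
    return QueuedToken(text, skip)
```

Storing the skip width on the token means later stages only need `align_skip` and never have to inspect the text again, which is what would make other multi-line token types easy to add.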

lucaswerkmeister commented 10 years ago

Dang, I wanted to close that issue, but I forgot again that “Implements #XXX” only works on BitBucket.

Oh well, might as well use it for the following known issue: Lines inside multi-line strings are simply trimmed, so indentation inside the string (e.g. code blocks in doc strings) is lost. I’ll have to subtract only a specific amount of indentation, but I’m not even sure where I would get that information.

lucaswerkmeister commented 10 years ago

I think I’m going to subtract the string’s starting column from the beginning of each line of the string, counting tabs as however wide the options say they are (defaulting to 4 if the options don’t specify a tab width), then padding with spaces as necessary (e.g. if the line started with two 4-wide tabs and we want column 5, both tabs go and 3 spaces are padded back in).
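A minimal sketch of that subtraction, written in Python for illustration. The name `strip_columns` is hypothetical, and it counts every tab as a fixed `tab_width` columns, as described above, rather than advancing to the next tab stop.

```python
def strip_columns(line, columns, tab_width=4):
    """Remove `columns` columns of leading whitespace from `line`."""
    width = 0
    i = 0
    # Consume leading whitespace until the requested width is covered.
    while i < len(line) and width < columns and line[i] in " \t":
        width += tab_width if line[i] == "\t" else 1
        i += 1
    # A tab may overshoot the target column; pad the difference with spaces.
    padding = " " * max(0, width - columns)
    return padding + line[i:]


print(repr(strip_columns("\t\tcode", 5)))  # '   code'
```

Lines with less leading whitespace than requested simply lose all of it, and non-whitespace characters are never removed.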

lucaswerkmeister commented 10 years ago

Oh no... the column needs to be based on the start of the string in the unformatted code, not when we’ve already corrected the first lines.

This is horrible. I need direct access to the unformatted code for this. Gaaaah!

or... I could cheat and assume that no one writes code like this:

value s = "first line
...............
...............this is indented, as was the previous empty line
...........
...........instead, i expect that the lines surrounding a code block
...........are not already indented – the line below the code block would be what I expect";

i.e., take the column from the first “own” line, the first line that belongs entirely to the string. I’m not sure how sane or reasonable this assumption is.
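To make the trade-off concrete, here is a small Python sketch of the “cheat”. The function names are hypothetical, and tabs are again counted as a fixed width.

```python
def leading_width(line, tab_width=4):
    """Width of the leading whitespace, counting each tab as tab_width columns."""
    width = 0
    for ch in line:
        if ch == " ":
            width += 1
        elif ch == "\t":
            width += tab_width
        else:
            break
    return width


def column_from_first_own_line(token_text, tab_width=4):
    """Guess the string's column from its second line (the first "own" line)."""
    lines = token_text.split("\n")
    if len(lines) < 2:
        return 0  # single-line string: nothing to align
    return leading_width(lines[1], tab_width)


ordinary = '"first line\n' + " " * 11 + 'second line"'
tricky = '"first line\n' + " " * 15 + "\n" + " " * 15 + 'indented"'
print(column_from_first_own_line(ordinary))  # 11
print(column_from_first_own_line(tricky))    # 15
```

For the second input the guess is too large whenever the content really starts at column 11, which is exactly the case the assumption rules out.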

lucaswerkmeister commented 10 years ago

OOOOHHHHH

Looking at how the Ceylon compiler handles this was a good idea. It reads the column straight from the token:

someToken.getCharPositionInLine()
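ANTLR documents getCharPositionInLine() as the zero-based index of the token’s first character within its line, so a tab counts as one character. The Python sketch below only mimics that value from raw source text; it does not use ANTLR, and the helper names and the sample source are made up for illustration. The second helper shows how a configured tab width could turn the character position into a visual column.

```python
def char_position_in_line(source, offset):
    """Zero-based character index of `offset` within its line."""
    line_start = source.rfind("\n", 0, offset) + 1
    return offset - line_start


def visual_column(source, offset, tab_width=4):
    """Same position, but with tabs expanded to tab stops of `tab_width`."""
    line_start = source.rfind("\n", 0, offset) + 1
    return len(source[line_start:offset].expandtabs(tab_width))


source = 'void f() {\n\tvalue s = "first line\n\t           second line";\n}'
start = source.index('"')                    # where the string token begins
print(char_position_in_line(source, start))  # 11
print(visual_column(source, start))          # 14
```

Because the position comes from the unformatted source, it is available before any of the string’s lines have been re-indented.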