qwertie / ecsharp

Home of LoycCore, the LES language of Loyc trees, the Enhanced C# parser, the LeMP macro preprocessor, and the LLLPG parser generator.
http://ecsharp.net

LES3: New multiline string syntax? #93

Open qwertie opened 5 years ago

qwertie commented 5 years ago

I have never been entirely comfortable with those triple-quoted multiline strings...

    str = '''The problem is,
          how many spaces are there after the newline
              and before the word "how"?
     ...And is that what the user /wants/?'''

I haven't used a triple-quoted string for a while... I think the answer is 3 in this case, but no promises.

But another syntax occurred to me that is so obvious I wondered why I hadn't thought of it before, and why no popular languages use it.

The idea is simple: copy English. In English literature you can have a paragraph with an opening quote but no closing quote. This is valid so long as the next paragraph begins with an opening quote to continue the prior quotation. Thus:

    str = "The problem is,
          "how many spaces are there after the newline
          "    and before the word \"how\"?
      "...quite clearly the answer is zero."

In a string of this style, it seems reasonable for the parser to add a \n character at the end of each line. Otherwise the developer would have to carefully control the number of spaces at the end of the previous line for proper continuity, which is not ideal as spaces are conspicuously invisible. The parser could force a certain number of trailing spaces (one?), but that seems worse: it is heavy-handed and increases lexer complexity.

So yeah, the parser should add \n on each line (even if the source file actually has Windows/Mac newlines).
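
To make the rule concrete, here is a minimal sketch in Python (not LES, and not the actual Loyc lexer) of how such a literal might be decoded. The names `parse_quote_per_line` and `_scan_line` and the small escape table are hypothetical. The sketch assumes the input starts at the first opening quote, that every continuation line begins with optional indentation followed by `"`, and that a `\n` goes between lines but not after the last one.

```python
# Hypothetical escape table for this sketch; the real LES escape set is larger.
ESCAPES = {"n": "\n", "t": "\t", '"': '"', "\\": "\\"}


def _scan_line(text):
    """Decode one line's content; return (decoded_text, saw_closing_quote)."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            # Backslash escape: translate known escapes, keep unknown ones as-is.
            out.append(ESCAPES.get(text[i + 1], text[i + 1]))
            i += 2
        elif ch == '"':
            # An unescaped quote closes the whole string.
            return "".join(out), True
        else:
            out.append(ch)
            i += 1
    return "".join(out), False


def parse_quote_per_line(source):
    """Decode a string written in the proposed one-quote-per-line style."""
    pieces = []
    # splitlines() handles \r\n, \r and \n (among other separators), so the
    # result does not depend on the newline convention of the source file.
    for raw in source.splitlines():
        line = raw.lstrip(" \t")  # indentation before the quote is ignored
        if not line.startswith('"'):
            raise ValueError(f"expected an opening quote: {raw!r}")
        body, closed = _scan_line(line[1:])
        pieces.append(body)
        if closed:
            return "\n".join(pieces)  # always \n, never \r\n
    raise ValueError("string was never closed")
```

Because the indentation before each opening quote is discarded, the only leading spaces that survive are the ones the user typed after the quote.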

The user can presumably merge strings across lines with "..." + "...", except not really: this cannot be taken for granted in LES, and so perhaps there should also be a syntax for multiline strings without line breaks in them - the C string merging rule comes to mind, but it wouldn't work without a continuator:

    str = "perhaps there should also be a syntax for multiline "
    |     "strings _without_ line breaks in them"
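
A rough Python sketch of that merging rule, assuming `|` at the start of a line is the continuator as in the example above. The function name `merge_continued` is hypothetical, and escape sequences are deliberately not handled here.

```python
def merge_continued(source):
    """Concatenate complete "..." pieces joined by a leading '|' continuator."""
    pieces = []
    for index, raw in enumerate(source.splitlines()):
        line = raw.strip()
        if index > 0:
            # Every line after the first must start with the continuator.
            if not line.startswith("|"):
                raise ValueError(f"expected '|' continuator: {raw!r}")
            line = line[1:].lstrip()
        # Each piece must be a complete, closed string on its own line.
        if len(line) < 2 or not (line.startswith('"') and line.endswith('"')):
            raise ValueError(f"expected a closed string: {raw!r}")
        pieces.append(line[1:-1])
    # No newline is inserted between pieces, unlike the quote-per-line form.
    return "".join(pieces)
```

Requiring the continuator keeps two adjacent strings on separate lines from being merged by accident.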

Custom literals should need their type code specified only once:

    str = md"Poem
            "====
            "
            "_There once was a man from Nantucket_"

Shout out to @jonathanvdc, haven't chatted in a while...

jonathanvdc commented 5 years ago

Hi @qwertie! I know I haven't been very present lately. I've been working on my master's thesis over the course of the last semester and that's kept me rather busy. I'm presenting my master's thesis this Wednesday. We'll see how that goes.

About the syntax you're proposing here: I like it. It's a breath of fresh air. I occasionally use triple-quoted strings in programming languages that support them and the leading whitespace at the start of every line can be a major nuisance.

One quote per line introduces a bit of asymmetry, but that's well worth it. Your examples look super readable, too. It's a shame that existing programming languages like Python don't have a syntax that gives people more control over whitespace. What you're proposing here could be an interesting precedent.
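
For comparison, a small Python example of the nuisance: a triple-quoted string keeps every leading space from the source, and the usual workaround is a library call (`textwrap.dedent` from the standard library) rather than syntax. The names `RAW` and `DEDENTED` are only for illustration.

```python
import textwrap

# The backslash after the opening quotes suppresses the first newline.
RAW = """\
    The problem is,
      how many spaces are there?"""

# dedent() strips the whitespace common to all lines (four spaces here)
# and leaves the extra two spaces on the second line in place.
DEDENTED = textwrap.dedent(RAW)
```

Here `RAW` still begins with four spaces, while `DEDENTED` begins directly with "The".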