Draco-lang / Language-suggestions

Collecting ideas for a new .NET language that could replace C#

String and character literals V2 #71

Open LPeter1997 opened 2 years ago

LPeter1997 commented 2 years ago

Introduction

This issue aims to completely redesign string literals, taking inspiration from Swift string literals. The reason is that Swift's design does essentially everything the new C# literals do, but in a cleaner and less complicated form. I'd also like to take this opportunity to change character literals slightly, to free up the single-quote character.

Escape sequences

The escape sequences would stay identical to what has already been specified.

Single-line string literals

Single-line string literals would start and end with double quotes, and they cannot span multiple lines. Example:

val x = "Hello, World!";

They can also contain the usual escape sequences:

val x = "Hello,\nEarth! \u{1F47D}";

In this latter example, the value of x would be

Hello,
Earth! 👽

Multi-line string literals

Multi-line string literals would start and end with three double quotes. The string content would start on the line after the opening quotes and end before the line containing the closing quotes. Example:

val x = """
Hello, World!
""";

Note that this string has no newlines in it. It is equivalent to the string "Hello, World!". If you want a leading or trailing newline, you can write:

val x = """

Hello, World!

""";

The placement of the closing quotes determines how much leading whitespace is cut off from each line. Example:

val x = """
    Lorem ipsum dolor sit amet,
    consectetur adipiscing elit,
    sed do eiusmod tempor incididunt
    ut labore et dolore magna aliqua.
""";

Here, nothing is cut off; the string is exactly

    Lorem ipsum dolor sit amet,
    consectetur adipiscing elit,
    sed do eiusmod tempor incididunt
    ut labore et dolore magna aliqua.

But if we indent the ending quotes, we can cut off the leading whitespace:

val x = """
    Lorem ipsum dolor sit amet,
    consectetur adipiscing elit,
    sed do eiusmod tempor incididunt
    ut labore et dolore magna aliqua.
    """;

Now the string is

Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua.
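
To make the trimming rule concrete, here is a rough sketch in Python (not Draco) of how a lexer might apply it. The function name `strip_multiline` and its parameters are made up for illustration, and two behaviors are assumptions the proposal does not settle: a line indented less than the closing quotes is treated as an error, and whitespace-only lines shorter than the indentation become empty lines.

```python
def strip_multiline(body_lines, closing_indent):
    """Compute the value of a multi-line literal (hypothetical helper).

    body_lines: raw source lines between the opening and closing quotes.
    closing_indent: the whitespace found in front of the closing quotes.
    """
    result = []
    for line in body_lines:
        if line.startswith(closing_indent):
            # Cut off exactly the indentation of the closing quotes.
            result.append(line[len(closing_indent):])
        elif line.strip() == "":
            # Assumption: short whitespace-only lines become empty lines.
            result.append("")
        else:
            # Assumption: under-indented content is rejected.
            raise ValueError("line is indented less than the closing quotes")
    # The line breaks next to the delimiters are not part of the value.
    return "\n".join(result)
```

With an empty `closing_indent` nothing is removed, which matches the first Lorem ipsum example; with four spaces the leading whitespace disappears, as in the second.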

Breaking long lines

Breaking long lines in multiline strings can be done using a \ at the very end of lines. Example:

val x = """
Hello, \
World!
""";

This is equal to "Hello, World!". It looks similar to C-style line continuations, but it is only valid in multi-line string literals.

Note that # characters change the continuation sequence here too (a later section explains what they are):

#"""
Hello, \
World!
"""#

is literally

Hello, \
World!

To have Hello, World!, you'd write

#"""
Hello, \#
World!
"""#
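
As a rough illustration of the continuation rule, here is a Python sketch (not Draco). The helper `join_continued_lines` is hypothetical, and it deliberately ignores one case the proposal would have to handle: a line that ends in an escaped backslash.

```python
def join_continued_lines(lines, hashes=0):
    """Join lines that end with the continuation marker (hypothetical helper).

    hashes is the number of # characters around the literal, so the marker
    is a backslash followed by that many # characters.
    """
    marker = "\\" + "#" * hashes
    out = []
    pending = ""  # text collected from lines that ended with the marker
    for line in lines:
        if line.endswith(marker):
            # Drop the marker and the line break that follows it.
            pending += line[:-len(marker)]
        else:
            out.append(pending + line)
            pending = ""
    if pending:
        # A marker on the last line has nothing to join with.
        out.append(pending)
    return "\n".join(out)
```

Called with `hashes=1`, a plain trailing backslash is left alone and only `\#` joins the lines, mirroring the two #-delimited examples above.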

Interpolation

Interpolation introduces a new escape sequence, namely \(, which starts an interpolated expression that runs until the matching ). For example:

`"1 + 2 = \(1 + 2)"`

Which would result in the string 1 + 2 = 3.

Alternatively, we could use \{ ... } or any other pair of matching delimiters.
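
To show what "until the matching )" could mean for a lexer, here is a small Python sketch (not Draco). The name `split_interpolation` is hypothetical, and the scanner is simplified on purpose: it only counts parentheses, so it would be confused by a parenthesis inside a nested string literal or by an escaped backslash right before the (.

```python
def split_interpolation(body, hashes=0):
    """Split a literal body into ("text", ...) and ("expr", ...) parts.

    Hypothetical helper; hashes is the number of # characters that
    delimit the literal, which become part of the opener.
    """
    opener = "\\" + "#" * hashes + "("
    parts = []
    pos = 0
    while pos < len(body):
        start = body.find(opener, pos)
        if start == -1:
            parts.append(("text", body[pos:]))
            break
        if start > pos:
            parts.append(("text", body[pos:start]))
        expr_start = start + len(opener)
        depth = 1  # we are inside the opening parenthesis
        i = expr_start
        while i < len(body) and depth > 0:
            if body[i] == "(":
                depth += 1
            elif body[i] == ")":
                depth -= 1
            i += 1
        if depth != 0:
            raise ValueError("unterminated interpolation")
        # i points just past the matching ")", so exclude it.
        parts.append(("expr", body[expr_start:i - 1]))
        pos = i
    return parts
```

For the example above this yields the text part `1 + 2 = ` followed by the expression part `1 + 2`; evaluating the expression is left to the compiler.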

Extended string delimiter

The escape sequences and the starting and ending sequences of both single-line and multi-line string literals can be changed, to make pasting literal strings easier. This is done by adding the same number of # characters before the opening quotes and after the closing quotes.

For example, if we want to paste the literal string 1 + 2 = \n \(1 + 2), we could write it as: #"1 + 2 = \n \(1 + 2)"#.

Escape sequences can still be used by writing the same number of # characters after the backslash. For example, ###"Hello,\###nWorld!"### becomes:

Hello,
World!

This works for both single-line and multi-line strings.

We could change the way escape sequences are specified, or simply change the # character. Still, the simplicity of this method seems quite elegant.

The simplest way to summarize the behavior is that the number of #s modifies the escape sequence: in a string delimited with n # characters, an escape sequence starts with \ followed by n # characters.

Another example:

#"""
a = 5 + '\r'
idontrememberpython = """
  heheh e\n \n \r \u093
"""
\#u{1F47D}
"""#

which becomes

a = 5 + '\r'
idontrememberpython = """
  heheh e\n \n \r \u093
"""
👽
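
The rule could be sketched like this in Python (not Draco). The helper `decode_escapes` is hypothetical, and the set of simple escapes is an assumption: this issue only shows \n and \u{...}, and the full list lives in the earlier specification. Interpolation is not handled here.

```python
# Assumed escape set: only \n and \u{...} appear in the issue itself.
SIMPLE_ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "\\": "\\", '"': '"'}


def decode_escapes(body, hashes=0):
    """Decode escapes in a literal delimited with `hashes` # characters."""
    prefix = "\\" + "#" * hashes  # \ for "...", \# for #"..."#, and so on
    out = []
    i = 0
    while i < len(body):
        if not body.startswith(prefix, i):
            out.append(body[i])  # ordinary character, copied verbatim
            i += 1
            continue
        j = i + len(prefix)  # position of the character selecting the escape
        if body.startswith("u{", j):
            end = body.index("}", j)  # raises ValueError if unterminated
            out.append(chr(int(body[j + 2:end], 16)))
            i = end + 1
        elif j < len(body) and body[j] in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[body[j]])
            i = j + 1
        else:
            raise ValueError(f"unknown escape at offset {i}")
    return "".join(out)
```

With `hashes=1`, sequences such as `\r` or `\u093` pass through untouched because they lack the #, while `\#u{1F47D}` is still decoded, which is the behavior shown in the example above.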

Character literals

The single-quote character could be very valuable to us in other ways. Since character literals are not that significant, I'd like to suggest merging them into string literals.

Since there has been discussion about prefixing the literal with the encoding used (u8 "Hello" or u16 "Bye", for example), we could do the same to turn a string literal into a character literal using the char prefix, as long as it actually represents a single character. For example, char "a" would be the character literal a. String interpolation would not be allowed, as that would require runtime checks.
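
The "single character" check could be done entirely at compile time on the decoded literal. Here is a rough Python sketch (not Draco); the name `check_char_literal` and the `unit` parameter are hypothetical. The proposal does not say what counts as one character, so the sketch shows two possible readings: one UTF-16 code unit (what a .NET System.Char holds) or one Unicode scalar value.

```python
def check_char_literal(value, unit="utf16"):
    """Validate the decoded content of a char "..." literal.

    value is the literal after escape decoding. unit selects which
    notion of "single character" is enforced (both are assumptions).
    """
    if unit == "utf16":
        # Each UTF-16 code unit takes two bytes in the encoded form.
        count = len(value.encode("utf-16-le")) // 2
    elif unit == "scalar":
        # A Python str is a sequence of Unicode code points.
        count = len(value)
    else:
        raise ValueError("unknown unit")
    if count != 1:
        raise ValueError(f"char literal holds {count} units, expected 1")
    return value
```

Under the UTF-16 reading, char "a" is accepted while char "\u{1F47D}" is rejected, because that code point needs a surrogate pair; under the scalar reading both are accepted.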

jl0pd commented 2 years ago

This issue doesn't mention alignment and formatting.

Using # may lead to complexities in repl scenarios (ambiguity with preprocessor directive)

LPeter1997 commented 2 years ago

This issue doesn't mention alignment and formatting.

Indeed, because we haven't proposed anything for it. Alignment and formatting are a largely inelegant part of .NET (at least in C#) IMO, and I find it a tough problem to tackle.

Using # may lead to complexities in repl scenarios (ambiguity with preprocessor directive)

This assumes the presence of a preprocessor, which we might simply not need 😄. But fair, we could think of some other character that's easier to type.