textmate / swift.tmbundle

TextMate support for Swift
Support for raw string literals #39

Closed u2606 closed 4 years ago

u2606 commented 5 years ago

SE-0200, implemented in Swift 5, added support for raw string literals: string literals whose delimiters are padded with one or more # characters, and which treat sequences like \n and \\ as literal text unless the backslash is followed by the same number of # characters. The above-linked Swift Evolution proposal goes into greater detail about their design, but a quick overview follows.

Overview of raw string literals

Traditional string literals interpret character sequences beginning with \ as escape sequences, but raw string literals interpret these as literal characters:

"\n" // newline
#"\n"# // backslash, n

"\\n" // backslash, n
#"\\n"# // backslash, backslash, n

"\u{2603}" // ☃
#"\u{2603}"# // backslash, u, opening brace, 2, 6, 0, 3, closing brace

"\\u{2603}" // backslash, u, opening brace, 2, 6, 0, 3, closing brace
#"\\u{2603}"# // backslash, backslash, u, opening brace, 2, 6, 0, 3, closing brace

let num = 42
"\(num)" // 42
#"\(num)"# // backslash, opening parenthesis, n, u, m, closing parenthesis
"\\(num)" // backslash, opening parenthesis, n, u, m, closing parenthesis
#"\\(num)"# // backslash, backslash, opening parenthesis, n, u, m, closing parenthesis
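
One way to check these character-by-character readings is to compare each raw literal with an ordinary literal that spells out the same characters. The following is a minimal sketch, assuming a Swift 5 or later toolchain; the constant names are made up for illustration.

```swift
let num = 42

// Raw literals with one # on each side: the backslash is an ordinary character.
let rawNewline = #"\n"#
let rawSnowman = #"\u{2603}"#
let rawInterpolation = #"\(num)"#

// Each comparison spells out the same characters with an ordinary literal.
print(rawNewline == "\\n")            // true
print(rawNewline.count)               // 2
print(rawSnowman == "\\u{2603}")      // true
print(rawInterpolation == "\\(num)")  // true

// Only the ordinary literal performs interpolation.
print("\(num)" == "42")               // true
```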

You can use any number of # characters as the delimiters of a raw string literal:

// All of the following strings are equivalent:
"good morning"
#"good morning"#
##"good morning"##
###"good morning"###
// et cetera

To use an escape sequence inside a raw string literal, insert the same number of # characters as the delimiters use between the backslash and the rest of the escape sequence:

// All of the following strings are equivalent:
"\n" // newline
#"\#n"#
##"\##n"##

// All of the following strings are equivalent:
"\u{2603}" // ☃
#"\#u{2603}"#
##"\##u{2603}"##

// All of the following strings are equivalent:
"\(num)" // 42
#"\#(num)"#
##"\##(num)"##
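
The equivalences above can be confirmed with plain equality checks. This is a small sketch, assuming Swift 5 or later; the array names are illustrative only.

```swift
let num = 42

// Each array holds the same value written with zero, one, and two # delimiters.
let newlines = ["\n", #"\#n"#, ##"\##n"##]
let snowmen = ["\u{2603}", #"\#u{2603}"#, ##"\##u{2603}"##]
let interpolations = ["\(num)", #"\#(num)"#, ##"\##(num)"##]

print(newlines.allSatisfy { $0 == "\n" })        // true
print(snowmen.allSatisfy { $0 == "☃" })          // true
print(interpolations.allSatisfy { $0 == "42" })  // true
```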

Note that the reason you can use any number of # characters is so that a raw string literal can contain a literal \# sequence without escaping it:

// All of the following strings are equivalent:
"\\#n" // backslash, hash, n (need to escape \\ to get \)
#"\#\#n"# // (need to escape \#\ to get \ before #)
##"\#n"## // (no need to escape)
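
As a quick check of the three spellings, the sketch below compares them directly and counts the resulting characters. It assumes Swift 5 or later, and the constant names are invented for this example.

```swift
// Three spellings of the same three characters: backslash, hash, n.
let ordinary = "\\#n"      // \\ is the escaped backslash
let oneHash = #"\#\#n"#    // \#\ is the escaped backslash in a one-# literal
let twoHashes = ##"\#n"##  // with two #s, \# is no longer an escape prefix

print(ordinary == oneHash)   // true
print(oneHash == twoHashes)  // true
print(twoHashes.count)       // 3
```

Adding one more # to the delimiters than the longest backslash-hash run in the text is enough to avoid escaping altogether.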

Swift also allows combining raw string literals with multiline string literals:

// All of the following strings are equivalent:
let one = "good\\n\nmorning" // g, o, o, d, backslash, n, newline, m, o, r, n, i, n, g
let two = """
    good\\n
    morning
    """
let three = #"""
    good\n
    morning
    """#
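
Escapes and interpolation keep working in a multiline raw literal as long as they carry the matching number of # characters. The sketch below illustrates this with a made-up Windows-style path; it assumes Swift 5 or later, and the names and sample text are hypothetical.

```swift
let name = "world"

// Multiline raw literal: \t and \n in the path stay literal,
// while \#(name) is a real interpolation because it carries one #.
let text = #"""
    path: C:\temp\new
    hello, \#(name)
    """#

// The same value written as an ordinary single-line literal.
let expected = "path: C:\\temp\\new\nhello, world"

print(text == expected)  // true
```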

The issue: swift.tmbundle doesn’t support raw string literals

As seen in some of the examples above, swift.tmbundle highlights raw string literals as if they were traditional string literals. That means that some valid sequences involving \ characters are marked as invalid, and some invalid sequences involving \ characters are only partially marked as invalid:

"\d" // invalid escape sequence \d (correctly marked)
#"\d"# // backslash, d (incorrectly marked as invalid escape sequence \d)
#"\#d"# // invalid escape sequence \#d (incorrectly marked as shorter invalid escape sequence \#)
##"\#d"## // backslash, hash, d (incorrectly marked as invalid escape sequence \#)
##"\##d"## // invalid escape sequence \##d (incorrectly marked as shorter invalid escape sequence \#)

jtbandes commented 5 years ago

Thanks for the report. Are you interested in making a PR to add this? Here's a similar example from a Rust bundle:

https://github.com/carols10cents/rust.tmbundle/blob/d788eebb847c7673360e5e2d85ab9a1dc877c871/Syntaxes/Rust.tmLanguage#L773-L788

u2606 commented 5 years ago

Yes, I’m interested in making a PR. I see that literal_raw_string uses a name of string.quoted.double.raw.rust in the example you linked. I can’t seem to figure out where that’s defined.

jtbandes commented 5 years ago

All scope names are just by convention, but the basic ones are defined at https://macromates.com/manual/en/language_grammars, so you could use string.quoted.double.raw.swift, for example.

jtbandes commented 5 years ago

You might also want to refer to https://github.com/textmate/swift.tmbundle/pull/31

u2606 commented 5 years ago

Thanks, those resources should help.

jtbandes commented 5 years ago

Looking at this now – @infininight @sorbits is there a way to use begin capture groups inside patterns? The Swift raw string literals allow the same delimiter to be used on escapes inside the string, for instance ###"new \###n line"### but \1 doesn't seem to work the same way it does inside end...

sorbits commented 5 years ago

> Looking at this now – @infininight @sorbits is there a way to use begin capture groups inside patterns?

There is not, no. So currently we cannot support the raw string escaping mechanism.

I do have an open issue about supporting ${variables} in patterns, which I have updated to include captures from the parent’s begin rule. Doesn’t solve the problem here and now, but on an infinite timescale… :)

jtbandes commented 5 years ago

OK, thanks. Maybe what I'll do is create a full set of rules including escapes for n=1, i.e. #"this \#n case"#, and then a general set of (#+)".."\1 delimiters that doesn't support escapes. It's important to at least try to get the string start/end boundaries right, but it sounds impossible to make it perfect (since an escaped subexpression could contain more raw strings: \#( ##"more string here"## ))
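
The delimiter-matching half of this plan can be tried outside TextMate. The sketch below is not the bundle's grammar. It is a hypothetical helper (rawStringBodies is a name invented here) that uses Foundation's NSRegularExpression with the same (#+)".."\1 back-reference idea. Like the general rule, it makes no attempt to interpret escapes, nested raw strings, or multiline literals.

```swift
import Foundation

// Hypothetical helper for illustration: returns the body of every raw string
// literal found in `source`, matching the closing hashes to the opening ones.
func rawStringBodies(in source: String) -> [String] {
    // (#+)   one or more opening hashes, captured as group 1
    // "      opening quote
    // (.*?)  body, matched lazily so the first valid terminator wins
    // "\1    closing quote followed by exactly the hashes captured in group 1
    let pattern = #"(#+)"(.*?)"\1"#
    let regex = try! NSRegularExpression(pattern: pattern)
    let fullRange = NSRange(source.startIndex..., in: source)
    return regex.matches(in: source, range: fullRange).compactMap { match in
        // Group 2 is the body between the delimiters.
        Range(match.range(at: 2), in: source).map { String(source[$0]) }
    }
}

// The sample is itself written as a two-# raw literal so it can contain "#.
let sample = ##"let greeting = #"good morning"#"##
print(rawStringBodies(in: sample))  // ["good morning"]
```

Because the closing side reuses group 1, a "# sequence inside a ##"..."## literal does not end the match early, which is the boundary behavior the general rule is meant to get right.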

sorbits commented 5 years ago

> Maybe what I'll do is create a full set of rules including escapes for n=1

That’s a great idea. I was about to suggest you could also do a rule for n=2 (before the general rule that just disables escape sequences), but come to think of it, I find it hard to believe that people would pick a string type where their newlines etc. are represented as \##n, it seems counter to the purpose of raw strings to have such weird escape sequences.

u2606 commented 5 years ago

It seems like raw string literals with more than one # are only useful when the string itself contains #", "#, or \#, which aren’t especially common sequences of characters. The only use case I can think of for n=2 escapes is code that generates other Swift code, and even that seems rare.

jtbandes commented 5 years ago

Implemented in #40