Kotlin / KEEP

Kotlin Evolution and Enhancement Process

Multi-dollar interpolation #375

Closed: serras closed this 2 days ago

serras commented 1 month ago

This KEEP proposes multi-dollar interpolation. The current full text of the proposal can be found here.

We propose an extension of string literal syntax to improve the situation around $ in string literals. Literals may configure the number of $ characters required for interpolation.
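A minimal sketch of the situation the proposal targets, written in syntax that already compiles today. The function name `renderSchema` and the JSON keys are made up for illustration, and the `$$`-prefixed form in the trailing comment is only my reading of the proposal; it is not compiled here.

```kotlin
// Hypothetical example: a raw string that needs both a literal '$' and an interpolation.
// Raw strings have no backslash escapes, so today the literal dollar is written as ${'$'}.
fun renderSchema(title: String): String = """
    {
      "${'$'}id": "example",
      "title": "$title"
    }
""".trimIndent()

// With a two-dollar prefix, the same literal would presumably read:
//   $$"""
//       {
//         "$id": "example",
//         "title": "$$title"
//       }
//   """.trimIndent()
```

Calling `renderSchema("Order")` yields a JSON object whose first key is the literal text `$id`.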

lukellmann commented 1 month ago

Marking the end of the string is done using as many " symbols as those beginning the string.

Won't this change the meaning of code like this:

// 1)
// prints "foo" now
// would print foo in the future, the literal is starting with 4 "
println(""""foo"""")

// 2)
// prints "foo now
// would fail to compile in the future
println(""""foo""")

I have written code like 2) recently to create single line JSON literals:

val json = """{"id":"0","guild_id":"0","name":"rule","creator_id":"0","event_type":1,"trigger_type":3,""" +
    """"trigger_metadata":{},"actions":[],"enabled":false,"exempt_roles":[],"exempt_channels":[]}"""
//  ^^^^
//  notice the 4 " here

OliverO2 commented 1 month ago

I understood the proposal to imply that the number of leading/trailing " symbols can only be changed from the standard 1/3 for strings prefixed with at least one $ symbol. So it is backwards-compatible and the above examples would be interpreted in the same way as before.
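For reference, a quick sketch of the behavior that has to be preserved. In today's Kotlin a raw string opens and closes with exactly three quotes, and any extra adjacent quotes are content. The property names are illustrative only.

```kotlin
// The literal opens with three quotes, so the fourth quote is content.
// At the end, the last three quotes close the literal and any quotes before them are content.
val bothQuoted: String = """"foo""""   // content: "foo"
val leadingQuote: String = """"foo"""  // content: "foo
```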

AarjavP commented 1 month ago

also curious, in this example:

$$"$${order.product} costs $ $${order.price}"

will removing space after costs $ cause any issue or will it just print costs $150?

zarechenskiy commented 1 month ago

also curious, in this example:

$$"$${order.product} costs $ $${order.price}"

will removing space after costs $ cause any issue or will it just print costs $150?

It will print costs $150. The rules for interpolation should naturally extend the existing ones, and right now it's possible to use two dollars consecutively, one as a sign and the last one to mark interpolation:

fun main() {
    val price = 150
    println("costs $$price") // costs $150
}
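
To make the rule concrete, here is a small model of it in plain Kotlin. The function `interpolate`, its parameters, the lookup map, and the product name are assumptions for illustration. It is a sketch of the rule as described in this thread (the last N dollars of a run start the interpolation, earlier ones are literal), not how the compiler implements it, and it only handles the braced form.

```kotlin
// Hypothetical helper: models "a run of dollars interpolates only if it has at least N of them".
fun interpolate(template: String, dollars: Int, values: Map<String, String>): String {
    require(dollars >= 1) { "at least one dollar is needed to interpolate" }
    val out = StringBuilder()
    var i = 0
    while (i < template.length) {
        if (template[i] != '$') {
            out.append(template[i])
            i++
            continue
        }
        // Measure the run of consecutive dollar signs starting at i.
        var end = i
        while (end < template.length && template[end] == '$') end++
        val run = end - i
        // Only the braced form {name} is modeled here.
        val close = if (end < template.length && template[end] == '{') template.indexOf('}', end) else -1
        if (run >= dollars && close != -1) {
            // Extra leading dollars are literal; the last `dollars` of them start the template.
            repeat(run - dollars) { out.append('$') }
            out.append(values.getValue(template.substring(end + 1, close)))
            i = close + 1
        } else {
            // Too few dollars, or no braced name follows: the whole run is plain text.
            repeat(run) { out.append('$') }
            i = end
        }
    }
    return out.toString()
}

// The template from the question with the space removed, escaped for today's Kotlin.
val demo: String = interpolate(
    "\$\${order.product} costs \$\$\${order.price}",
    dollars = 2,
    values = mapOf("order.product" to "Widget", "order.price" to "150"),
) // Widget costs $150
```

With two dollars required, a lone `${...}` in the template is left untouched as text.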

zarechenskiy commented 1 month ago

Marking the end of the string is done using as many " symbols as those beginning the string.

Won't this change the meaning of code like this:

// 1)
// prints "foo" now
// would print foo in the future, the literal is starting with 4 "
println(""""foo"""")

// 2)
// prints "foo now
// would fail to compile in the future
println(""""foo""")

I have written code like 2) recently to create single line JSON literals:

val json = """{"id":"0","guild_id":"0","name":"rule","creator_id":"0","event_type":1,"trigger_type":3,""" +
    """"trigger_metadata":{},"actions":[],"enabled":false,"exempt_roles":[],"exempt_channels":[]}"""
//  ^^^^
//  notice the 4 " here

It won't, as we'd like to have backwards-compatible rules and require the dollar sign for the new rules. However, this approach implicitly introduces a new kind of string literal, and that is a pretty high price to pay for this issue. We'll easily run into subtle differences because the $ in front of a string literal will have two meanings:

println(""""f"""") // "f"
println($""""f"""") // f
val id = "x"
println(""""$id"""") // "x"
// if I just wanted to use '$' sign:
println($$""""$id"""") // $id; quotes are missing

And there are a lot of single-line strings that start with more than three " symbols: https://github.com/search?q=%22%22%22%22+language%3AKotlin&type=code&ref=advsearch

Peanuuutz commented 1 month ago

Indeed, the dollar escaping rule looks good to me, but the subtle difference between two versions of """" bothers me a lot. I'd like to propose the following option:

A string literal, whether single line or multi-line, when starting with N $, requires exactly N $ after the 1 or 3 quotes to close. When this happens, interpolation should also start with the corresponding number of $.

val num = 0

val sinOne = $"$num"$
> 0

val sinTwo = $$"$num"$$
> $num

val sinEscape = $"\"$num\""$
> "0"

val rawOne = $"""
    {
        "key": $num
    }
"""$.trimIndent()
> {
>     "key": 0
> }

val rawTwo = $$"""
    {
        "$num": $$num
    }
"""$$.trimIndent()
> {
>     "$num": 0
> }

val rawPseudoEscape = $"""""""""$
> """

val rawPseudoEscapy = $$"""$$"""$"""$$
> $$"""$

serras commented 1 month ago

I've pushed a new version of the proposal, removing everything related to double quotes. That way the KEEP is only about multi-dollar interpolation.

Thanks @lukellmann for noticing the problems with the proposed approach so fast. 🚀

serras commented 1 month ago

I think the case of embedding """ inside a raw string literal comes up far less often than the need to embed $ in such a literal. Given that, I think that @Peanuuutz's proposal would add too much ceremony to most strings.

sken130 commented 1 month ago

But if we don't solve the triple double quotes problem, then the whole proposal is only a partial solution, and it wastes the syntax space.

Nowadays more and more languages are using triple double quotes, so the need to embed them will come up more and more often. If we come to regret today's decision, we will have to introduce even more syntax to fix it.

I am not asking to solve every unknown problem, but please at least solve the problems we already know about.

Peanuuutz commented 1 month ago

How is adding just a trailing sequence of $ considered "too much ceremony"? For most multi-line literals the actual content is far more noticeable than the beginning and ending characters. If $ is needed, then it's just a pair of $$, and the triple quote part doesn't even need any change, as it's simply solved. Two birds with one stone. Why put it off until later?

Is "short" better than "good"? I truly hope not.

serras commented 1 month ago

Sorry, I didn't mean that we shouldn't look for the best solution, and certainly using words like "ceremony" wasn't good on my side. My apologies.

The problem statement, as I see it, is the following. We would like to find a way to include """ in a multiline string, but without falling into the "trap" of then having a new sequence of characters we cannot include. The solution with an increasing number of " is one such solution: for any number of " you want to include, you just need to make the opening/closing markers one " longer.

Can we imagine another such solution? In particular, @Peanuuutz, does your proposed solution satisfy these requirements?

Peanuuutz commented 1 month ago

Can we imagine another such solution? In particular, @Peanuuutz, does your proposed solution satisfy these requirements?

After testing a few cases, the answer is no. There is one narrow edge case where my proposal fails.

val i = 1

$"""
   """
"""$ // Good

$"""
    """$i
"""$ // Compile error

$$"""
    """$i
"""$$ // Good

$$"""
    """$$i
"""$$ // Compile error

That is, I can't have """ immediately followed by an interpolation.

Please note that, even so, we can't allow the surrounding quotes to grow, as that brings back the original situation where the quotes disappear once a $ is prepended; see the comments above.

Actually, after reading the original ticket, I'm more in favor of a new representation: '$"Hello $name"$'. I've considered several configurations, such as changing the rule for $, for ", or for both, and each one ends up either with a single edge case that can't be written or with the very sneaky change raised above (the quotes disappear with just a $). I really feel that a model with the $ between the starting/ending quotes is what we need, because that way we can safely end a literal: the closing $ is always followed by a single quote, so it can never be an interpolation within the string. And we can use multiple $ if the content includes $.

val i = 1

'""'
> (Empty)

'"\n"'
> \\n

'"$i"'
> 1

'$"$i"$'
> 1

'$$"$i"$$'
> $i

'$$"$$i"$$'
> 1

'$""'"$'
> "'

'$""$i'"$'
> "1'

Amejonah1200 commented 1 month ago

Firstly, thx serras for writing all those KEEPs 🙏

Secondly, I find the solution to the problem very nice, especially because I've wished for it in Kotlin ever since I saw it added to C# 11.

Concerning strings: in Rust, which I write, there are no separate "multiline" strings, since ordinary strings accept newlines anyway, so there is only r##"..."## there. About indentation: C# 11 also introduced the ability to trim the indent based on the whitespace before the closing """. The only limitation I see is that it is a compile error if any line is indented less than that; I would have removed that limitation.
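
A rough model of that trimming scheme in Kotlin, to show what is meant. The helper `trimByClosingIndent` is hypothetical (it is neither a Kotlin nor a C# API), it takes the text between the opening and closing quotes, and it assumes the literal starts with a line break. It only approximates the C# behavior described above.

```kotlin
// Hypothetical helper: the whitespace before the closing quotes decides how much is trimmed.
fun trimByClosingIndent(body: String): String {
    val lines = body.lines()
    // The last line holds only the whitespace that precedes the closing quotes.
    val indent = lines.last()
    require(indent.isBlank()) { "closing delimiter must sit on its own line" }
    // Drop the line break after the opening quotes and the closing-delimiter line itself.
    return lines.drop(1).dropLast(1).joinToString("\n") { line ->
        when {
            line.isBlank() -> ""
            line.startsWith(indent) -> line.removePrefix(indent)
            // This branch is the limitation mentioned above.
            else -> error("line is indented less than the closing delimiter: '$line'")
        }
    }
}
```

For a body whose closing line has four spaces, every content line loses four spaces; a content line with only two spaces is rejected.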

OliverO2 commented 1 month ago

Trimming the indent for multi-line strings at compile time sounds attractive. Why add bloat and bear the cost of invoking trimIndent() at runtime?

JakeWharton commented 1 month ago

trimIndent and trimMargin already run at compile-time (and have for years) if the string is a constant. If the string captures variables, however, you cannot perform the operation and it is deferred to runtime.

ephemient commented 1 month ago

I do think there may be value in having a way to "compile-time trim this multi-line string, but not the expressions interpolated into it", but we don't have a way to express that now.

serras commented 1 month ago

Trimming the indent for multi-line strings at compile time sounds attractive.

As discussed at the beginning of the KEEP, we've decided to split this concern to another KEEP which is in the works. The reason why trimming is harder is because of possible interactions with string templates.
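
A small sketch of that interaction, using today's `trimIndent()`. The function names are illustrative. With a constant literal the result is fixed; once a value is interpolated, the lines of that value take part in the indent detection, so the result depends on runtime data.

```kotlin
// Constant literal: the trimmed result never changes.
fun constantBlock(): String = """
    {
        "key": 0
    }
""".trimIndent()

// Templated literal: trimIndent() sees the text only after interpolation.
fun templatedBlock(value: String): String = """
    {
        "key": $value
    }
""".trimIndent()
```

`templatedBlock("0")` equals `constantBlock()`, but `templatedBlock("1\n2")` keeps all of its indentation, because the second interpolated line starts at column zero and becomes the minimal indent.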

Amejonah1200 commented 1 month ago

Trimming the indent for multi-line strings at compile time sounds attractive.

As discussed at the beginning of the KEEP, we've decided to split this concern to another KEEP which is in the works. The reason why trimming is harder is because of possible interactions with string templates.

@serras did you see the C# trimming for """ strings? What's your take on that?

serras commented 1 month ago

did you see the C# trimming for """ strings? What's your take on that?

Yes, I've seen how they did it. However, whatever solution we come up with, we need to be careful not to break backwards compatibility, and we should try to keep the language uniform. When our team discussed the issue, we came to two conclusions:

  1. Having multiline strings with $ and without it behave differently with respect to trimming is not good. We would like to provide a solution which allows "fixing" both kinds of literals.
  2. There are different ways people want trimming (in the Kotlin community, people use both trimIndent and trimMargin). We have to acknowledge that, and not force everybody into the same trimming behavior.

As hinted above and by this message, we are working on this. However, it may take longer to reach a solution.

serras commented 4 weeks ago

After some discussion, we've decided not to handle the problem of three double quotes in a multiline string.

We acknowledge that this solution does not solve the problem of escaping (three or more) " characters inside a multiline string. The workaround is using ${"\"\"\""}, or similar code which interpolates a single-line string with the three symbols.
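
A minimal sketch of that workaround in today's Kotlin. The property names and the generated text are illustrative only.

```kotlin
// A single-line string holding three escaped quotes, reused via interpolation.
val quotes: String = "\"\"\""

// The raw string never contains three literal quotes in a row, so it is not closed early.
val generated: String = """
    val text = ${"\"\"\""}hello$quotes
""".trimIndent()
```

Here `generated` holds the text `val text = """hello"""`.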


Our preliminary code search for usages of """" (that is, using a multiline string literal with double quotes inside) shows that this is a relevant pattern (7.2K usages), so we should not proceed with the change of making the closing block of double quotes match the opening one.

In contrast, a code search for usages of ${"\"\"\""} reveals only 140 usages on GitHub. This shows that the need here is quite narrow, and we prefer a simpler extension of syntax (adding only $ at the front) instead of changing both begin and end markers.

OliverO2 commented 3 weeks ago

Do I understand correctly that the case for using a single-dollar prefix $"""...""" is now moot, since the change to the quote interpretation rule has been dropped?

If so, wouldn't it be reasonable to change the proposed syntax so that a prefix now requires 2 or more consecutive $ symbols, and the single $ symbol is disallowed?

serras commented 2 days ago

The KEEP has been merged. Thanks everybody for their comments and suggestions!