ballerina-platform / ballerina-spec

Ballerina Language and Platform Specifications

[StringTemplate] Support escape sequence \n #1253

Open pcnfernando opened 1 year ago

pcnfernando commented 1 year ago

Description: There are times when a newline inside a string template needs to be escaped, for instance when source lines must be kept under 120 characters.

I suggest allowing a newline in a string template to be escaped with a trailing backslash (\), so that the line break does not appear in the resulting string.

string name = "John";
string message = string `Hello ${name}, \
Welcome to Ballerina!`;

io:println(message); //output: Hello John, Welcome to Ballerina!

jclark commented 1 year ago

Backslashes aren't significant in backtick strings. It wouldn't be compatible to change that, and I wouldn't want to in any case.

I think you can do this, if you really need to break a backtick string over lines.

string message = string `Hello ${
name}, Welcome to Ballerina!`;

string message = string `Hello ${name}, ${
""}Welcome to Ballerina!`;

hasithaa commented 1 year ago

@jclark, I agree with your point. Several developers have expressed their desire for a way to split the template syntax into multiple lines in order to adhere to formatting guidelines, specifically the 120/80 characters per line rule.

It is worth reconsidering our stance on this rule, as it has been followed since the early stages of Ballerina development, largely influenced by Java and other programming languages. While there is a historical reason behind this rule (such as limited terminal sizes in the early days), we should re-evaluate whether it is still a necessity with the modern IDEs and features like line wrapping views.

There are pros and cons to adhering to the 120/80 rule:

Considering these factors, we need to decide: should we revise our recommendation regarding this rule, or should we introduce a language feature to support this use case in template expressions? The few people we have talked to are not happy with the current solution.

jclark commented 1 year ago

This issue is not about whether there should be a recommended limit on the length of Ballerina source lines.

If there is a limit, then the language already provides multiple ways to use string templates and keep within that limit. I mentioned two above. Another of course is to use string concatenation.
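As a rough sketch of the three options side by side (this is not from the thread; the function names are hypothetical and the greeting is borrowed from the original post), each of the following should build the same single-line string while keeping the source lines short:

```ballerina
import ballerina/io;

// Option 1: put the line break inside the interpolation braces,
// where whitespace is just part of the expression.
function greetBreakInBraces(string name) returns string {
    return string `Hello ${
        name}, Welcome to Ballerina!`;
}

// Option 2: interpolate an empty string and break the line inside its braces.
function greetEmptyInterpolation(string name) returns string {
    return string `Hello ${name}, ${
        ""}Welcome to Ballerina!`;
}

// Option 3: concatenate two shorter string templates.
function greetConcat(string name) returns string {
    return string `Hello ${name}, ` +
        string `Welcome to Ballerina!`;
}

public function main() {
    io:println(greetBreakInBraces("John"));
    io:println(greetEmptyInterpolation("John"));
    io:println(greetConcat("John"));
}
```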

In any case, it is not possible to change string template syntax to recognize additional escape sequences without massive backwards incompatibility. So that's a complete non-starter.

I can see only one possible minor change: make the expression inside curly braces optional, with a missing expression meaning that nothing is interpolated. This would allow

string`A very long line  ${
}can be broken.`

to be used to break lines.

I also question why people need to create long input lines in string template syntax. It must be either because

In the latter case, they can break the input lines inside the curly braces. In the former case, if they want long output lines but they don't want long input lines, they will have to use one of the techniques I have mentioned: this does not seem an onerous requirement to me.

pcnfernando commented 1 year ago

Understood: introducing the \ escape sequence would be massively backward incompatible.

Another use case is when people use string templates to avoid the need for escape characters, e.g. https://ballerina.io/usecases/healthcare/

// The following example is a simple serialized HL7 v2.3 ADT A01 message.
final string msg = string `MSH|^~\\&|ADT1|GOOD HEALTH HOSPITAL|GHH LAB, INC.|GOOD HEALTH HOSPITAL|
198808181126|SECURITY|ADT^A01^ADT_A01|MSG00001|P|2.3||${"\r"}EVN|A01|200708181123||
${"\r"}PID|1||PATID1234^5^M11^ADT1^MR^GOOD HEALTH HOSPITAL~123456789^^^USSSA^SS||
BATMAN^ADAM^A^III||19610615|M||C|2222 HOME STREET^^GREENSBORO^NC^27401-1020|GL|
(555) 555-2004|(555)555-2004||S||PATID12345001^2^M10^ADT1^AN^A|444333333|987654^NC|
${"\r"}NK1|1|NUCLEAR^NELDA^W|SPO^SPOUSE||||NK^NEXT OF KIN${"\r"}PV1|1|I|2000^2012^01||||
004777^ATTEND^AARON^A|||SUR||||ADM|A0|`;

As an alternative approach, can we consider introducing a new syntax specifically for multi-line strings, while keeping the existing string template syntax unchanged? The new syntax would indicate that the string template continues across multiple source lines, but the result would be a single-line string, e.g.:


string _ = string ```Hello ${name}, 
                            Welcome to Ballerina!```

jclark commented 1 year ago

What's the problem with the mechanisms I've suggested (whitespace inside the curly braces and concatenation)?

String templates are an alternative to string concatenation that are convenient in some situations. If there's a particular situation when string templates alone are not convenient, then use some string concatenation.

The healthcare example can easily be handled in a variety of ways by using some string concatenation and/or whitespace inside curly braces in backticks.
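A minimal sketch of how that might look, using a shortened three-segment excerpt of the message above (the function names and the `cr` variable are hypothetical, and this is not the healthcare page's actual code). Both functions are meant to produce the same string, with segments separated only by carriage returns:

```ballerina
import ballerina/io;

// Segments are joined with "\r" by concatenation. The escape sequence works
// because "\r" is an ordinary string literal, not a backtick string.
function adtViaConcat() returns string {
    string cr = "\r";
    return string `EVN|A01|200708181123||` + cr +
        string `NK1|1|NUCLEAR^NELDA^W|SPO^SPOUSE||||NK^NEXT OF KIN` + cr +
        string `PV1|1|I|2000^2012^01||||004777^ATTEND^AARON^A|||SUR||||ADM|A0|`;
}

// The same text as one template, with each source line break placed
// inside the curly braces so that it does not reach the output.
function adtViaBraces() returns string {
    string cr = "\r";
    return string `EVN|A01|200708181123||${
        cr}NK1|1|NUCLEAR^NELDA^W|SPO^SPOUSE||||NK^NEXT OF KIN${
        cr}PV1|1|I|2000^2012^01||||004777^ATTEND^AARON^A|||SUR||||ADM|A0|`;
}

public function main() {
    // Compares the two forms rather than printing raw carriage returns.
    io:println(adtViaConcat() == adtViaBraces());
}
```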

More string syntaxes are confusing for the user. I see no evidence that what we have is insufficient to handle most cases in a reasonable way.

sanjiva commented 1 year ago

The problem is that the proposed solution of using an empty expression to break the line is very ugly.

I think there is precedent for a single backtick meaning something and triple backticks meaning something else in markdown.

I'm not necessarily loving it but there is some precedent.

jclark commented 1 year ago

I agree it's not pretty, but this is only needed when you want long output lines and you don't want to have equally long source lines. I don't think that's a common enough case to motivate a beautiful solution involving new string syntax. There's also another solution available, which is to use string concatenation.

sanjiva commented 1 year ago

Certainly not a common case but in B2B stuff (with EDI, HL7) this could be common.

We don't allow strings to cross lines with a backslash either. If we can relax that, it's an alternative that's good enough. I can't remember why we said no to that ... I'm sure there was a good reason :).

hasithaa commented 1 year ago

String template syntax has some use cases, but using it to construct a long text format is overkill.

I think the best option is to use a function with var-args to construct this string. Inside the function, we can construct the string part by part and finally build the full string using string concatenation. This is more maintainable than having a large string template.
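One way this could be sketched, assuming a hypothetical helper named `buildText` (not an existing library function) that takes its parts as rest parameters and concatenates them in order:

```ballerina
import ballerina/io;

// Hypothetical helper: builds the full string part by part
// using plain string concatenation.
function buildText(string... parts) returns string {
    string result = "";
    foreach string part in parts {
        result += part;
    }
    return result;
}

public function main() {
    string name = "John";
    // Each argument sits on its own short source line.
    string message = buildText(
        string `Hello ${name}, `,
        string `Welcome to Ballerina!`
    );
    io:println(message);
}
```

Each part can still be a small string template, so interpolation and unescaped special characters remain available where they are useful.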

Btw while exploring the problem, I got to know that Python has implicit concatenation for string literals. Syntax-wise, that may cause problems for us. i.e.

string s = string `some large text` 
string `another large text`;

pcnfernando commented 1 year ago

> String template syntax has some use cases, but using it to construct a long text format is overkill.
>
> I think the best option is to use a function with var-args to construct this string. Inside the function, we can construct the string part by part and finally build the full string using string concatenation. This is more maintainable than having a large string template.

For instance, if a user wants to create a lengthy interpolated string and/or needs to avoid escaping special characters in the string literal, the recommended options until this issue is resolved are, in order of preference:

  1. Add line breaks at interpolation start/end tokens as James suggested
  2. Concatenate multiple string-template-expr values:
     string _ = string `Hello ${name}, ` +
                string `Welcome to Ballerina!`;