nim-lang / RFCs


Unquoted indentation-based string literals #248

Closed haxscramper closed 3 years ago

haxscramper commented 3 years ago

Proposal

Add unquoted indentation-based string literals with the following syntax:

herestring:
  this is an unquoted text that will be treated
  as string literal

Note: I'm of course open to suggestions and comments about the implementation - this is by no means a final implementation, I just added it to have something to show (since this RFC is purely syntax sugar).

Implementation

When the identifier herestring is found during lexing, replace it with a triple-quoted string literal token. All following text indented by at least the current line's indentation + 2 will be cut out as the string.

herestring:
  [ this code will be ]
    [ treated as string literal]
  ^^
  Indentation will be preserved

# This comment has indentation smaller than 0 + 2 and will
# be treated as regular comment
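
The cut-out rule above can be sketched as a small line-based helper. This is only an illustration under stated assumptions, not the lexer change itself: the names leadingSpaces and cutHerestring are hypothetical, only spaces count as indentation, and blank lines inside the block are kept while trailing blank lines are dropped (the RFC text does not specify either point).

```nim
import std/strutils

proc leadingSpaces(line: string): int =
  ## Number of leading space characters in `line`.
  while result < line.len and line[result] == ' ':
    inc result

proc cutHerestring(lines: seq[string]; start: int): string =
  ## Hypothetical helper, not the RFC's actual lexer code.
  ## `lines[start]` is the line that contains `herestring:`.
  let bodyIndent = leadingSpaces(lines[start]) + 2
  var body: seq[string]
  for i in (start + 1) ..< lines.len:
    let line = lines[i]
    if line.strip().len == 0:
      body.add ""                      # assumption: blank lines are kept
    elif leadingSpaces(line) >= bodyIndent:
      body.add line[bodyIndent .. ^1]  # drop only the base indentation
    else:
      break                            # a less-indented line ends the literal
  # Assumption: trailing blank lines belong to the surrounding code.
  while body.len > 0 and body[^1].len == 0:
    discard body.pop()
  result = body.join("\n")

let src = @[
  "herestring:",
  "  [ this code will be ]",
  "    [ treated as string literal]",
  "",
  "# regular comment"
]
echo cutHerestring(src, 0)
```

Extra indentation beyond the base level is kept, and the comment line in column 0 is left for normal processing because it ends the literal.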

Example of use

I implemented some tests for the initial implementation.

String format

echo fmt herestring:
  Long text with some {interpolated} elements. You can of course
  write it as regular triple string literal, but then you would
  have to either de-indent it afterwards or move it to the first column.

Just as string literal

The Nim compiler test suite uses triple-quoted string literals to configure its tests. This:

discard """
  errormsg: "expected: ':', but got: 'echo'"
  file: "tinvcolonlocation1.nim"
  line: 8
  column: 7
"""

can be turned into this

discard herestring:
  errormsg: "expected: ':', but got: 'echo'"
  file: "tinvcolonlocation1.nim"
  line: 8
  column: 7

which is not much of a difference from a feature standpoint, but a lot of people will find the herestring version easier to work with, and it looks better (this is of course subjective).

Emit statement

{.emit: herestring:
  try {
      auto tmp = (itr != itrEnd);
  } catch (const boost::wave::preprocess_exception& ex) {
      cerr << "ERROR in " << ex.file_name() << " : " << ex.line_no()
           << endl;
      cerr << ex.description() << endl;
      return 1;
  }
.}

Varriount commented 3 years ago

Aside from the indentation, what does this improve over triple quoted strings?

SolitudeSF commented 3 years ago

GitHub syntax highlighter works relatively fine on the new syntax.

github highlighter breaks all the time with the current syntax and this for sure won't make it better. also, i have no clue how i would add syntax highlighting with regex for this.

haxscramper commented 3 years ago

I don't think it is necessary to explicitly add highlighting code for this specific new type of literal. On the contrary - I say there is absolutely no need to highlight this as a string literal, or as any kind of specific construct. I repeat what I already said in the RFC - a lot of languages share keywords and syntax.

Here is an example of how VS Code currently handles it. I think it looks pretty good. The only thing that needs to be highlighted separately is herestring:, which can be treated as a new keyword.

1 Better syntax highlighting in cases where it is necessary to have foreign code literals

Screenshot taken from VS Code with zero additional configuration.

[image: VS Code screenshot]

2 No indentation or unindent is necessary.

Currently, to have a correctly indented string literal it is necessary to write:

import strutils

echo """
  try {
      auto tmp = (itr != itrEnd);
  } catch (const boost::wave::preprocess_exception& ex) {
      cerr << "ERROR in " << ex.file_name() << " : " << ex.line_no()
""".unindent()

to get this string:

try {
auto tmp = (itr != itrEnd);
} catch (const boost::wave::preprocess_exception& ex) {
cerr << "ERROR in " << ex.file_name() << " : " << ex.line_no()

This happens because, according to the documentation of strutils.unindent, it "Removes all indentation composed of whitespace from each line in s." (emphasis mine). This means the string you get is not unindented but rather stripped of leading whitespace, which is not the same thing. And with unindent, string literals look like """ """.unindent() and require importing strutils.

It is of course a non-issue to write a function to uniformly unindent a string.
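
A minimal sketch of such a function, assuming that only spaces count as indentation and that blank lines are ignored when computing the common prefix. The name uniformUnindent is hypothetical and is not part of strutils.

```nim
import std/strutils

proc uniformUnindent(s: string): string =
  ## Hypothetical helper: removes the smallest indentation shared by all
  ## non-blank lines, so relative indentation survives.
  let lines = s.splitLines()
  var common = int.high
  for line in lines:
    if line.strip().len > 0:
      var n = 0
      while n < line.len and line[n] == ' ':
        inc n
      common = min(common, n)
  if common == int.high:
    return s                      # only blank lines: nothing to strip
  var stripped: seq[string]
  for line in lines:
    if line.strip().len > 0:
      stripped.add line[common .. ^1]
    else:
      stripped.add ""             # assumption: blank lines become empty
  result = stripped.join("\n")

echo uniformUnindent("  try {\n      auto tmp = (itr != itrEnd);\n  }")
```

Unlike unindent() without arguments, this keeps the four extra spaces in front of the nested statement, because only the two-space prefix is common to every line.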

With new syntax it looks like this:

herestring: # Random C++ code
  try {
      auto tmp = (itr != itrEnd);
  } catch (const boost::wave::preprocess_exception& ex) {
      cerr << "ERROR in " << ex.file_name() << " : " << ex.line_no()
           << endl;

narimiran commented 3 years ago

This happens because, according to the documentation of strutils.unindent, it "Removes all indentation composed of whitespace from each line in s." (emphasis mine). This means the string you get is not unindented but rather stripped of leading whitespace, which is not the same thing.

There's also another unindent, namely: proc unindent(s: string; count: Natural; padding: string = " "): string which works as intended when you pass the correct count:

import strutils

echo """
  try {
      auto tmp = (itr != itrEnd);
  } catch (const boost::wave::preprocess_exception& ex) {
      cerr << "ERROR in " << ex.file_name() << " : " << ex.line_no()
""".unindent(2)      # <----------- notice `2` here ------------------

produces:

try {
    auto tmp = (itr != itrEnd);
} catch (const boost::wave::preprocess_exception& ex) {
    cerr << "ERROR in " << ex.file_name() << " : " << ex.line_no()

haxscramper commented 3 years ago

That requires you to specify the indentation for every single string literal that you write in the code. It is perfectly doable and quite easy, but now string literals look like """ """.unindent(n), where n depends on the indentation of the surrounding code.

SolitudeSF commented 3 years ago

and how would that look?

herestring:
  #[

othercode()

haxscramper commented 3 years ago

Remark: the only language I know of that treats #[ as the start of a multiline comment is Nim.

Yes, of course it is possible to find innumerable edge cases that break every syntax highlighter possible, but the only two things that break everything down the road are an unclosed string and #[ / ##[ comment pairs.

haxscramper commented 3 years ago

Since the majority voted against and there is no real feature benefit, only new syntax sugar, I think it is appropriate to close the RFC.

Araq commented 3 years ago

Thank you very much for your sportsmanship!