Closed haxscramper closed 3 years ago
Aside from the indentation, what does this improve over triple quoted strings?
GitHub syntax highlighter works relatively fine on the new syntax.
github highlighter breaks all the time with current syntax and this for sure won't make it better. also, i have no clue how i would add syntax highlighting with regex for this.
I don't think it is necessary to explicitly add highlighting code for this specific new type of code literal. On the contrary - I say there is absolutely no need to highlight this as a string literal, or as any kind of specific construct. I repeat what I already said in the RFC - a lot of languages share keywords and syntax.
Here is an example of how VS Code currently handles it. I think it looks pretty good. The only thing that needs to be highlighted separately is `herestring:` - it can be treated as a new keyword.
Screenshot taken from VS Code with zero additional configuration.
`unindent` is necessary. Currently, to have a correctly indented string literal it is necessary to write
import strutils
echo """
  try {
    auto tmp = (itr != itrEnd);
  } catch (const boost::wave::preprocess_exception& ex) {
    cerr << "ERROR in " << ex.file_name() << " : " << ex.line_no()
  """.unindent()
to get this string:
try {
auto tmp = (itr != itrEnd);
} catch (const boost::wave::preprocess_exception& ex) {
cerr << "ERROR in " << ex.file_name() << " : " << ex.line_no()
Since, according to the documentation of `strutils.unindent`, it "Removes all indentation composed of whitespace from each line in s." (emphasis mine), the string you get is not unindented but rather stripped of leading whitespace, which is not the same thing. And with `unindent`, string literals look like `""" """.unindent()` and require importing `strutils`.
It is of course a non-issue to write a function to uniformly unindent a string.
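For reference, such a function could look roughly like the sketch below. The name `uniformUnindent` is made up for illustration, and the sketch assumes indentation consists of spaces only (tabs are not handled). It finds the indentation shared by all non-blank lines and passes that count to the existing `strutils.unindent(s, count)`, so relative indentation inside the literal survives.

```nim
import strutils

proc uniformUnindent(s: string): string =
  ## Hypothetical helper (not part of the standard library): removes the
  ## indentation shared by all non-blank lines, keeping deeper indentation.
  ## Assumption: indentation is made of spaces only.
  var common = int.high
  for line in s.splitLines():
    if line.strip().len == 0:
      continue  # blank lines do not constrain the common indentation
    var lead = 0
    while lead < line.len and line[lead] == ' ':
      inc lead
    common = min(common, lead)
  if common == int.high:
    common = 0  # the string had no non-blank lines
  result = s.unindent(common)

let code = """
  try {
    auto tmp = (itr != itrEnd);
  }
  """
echo uniformUnindent(code)
```

This still needs an explicit call and an import on every literal, which is the part the proposed syntax is meant to remove.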
With the new syntax it looks like this:
herestring: # Random C++ code
  try {
    auto tmp = (itr != itrEnd);
  } catch (const boost::wave::preprocess_exception& ex) {
    cerr << "ERROR in " << ex.file_name() << " : " << ex.line_no()
         << endl;
Since according to documentation of `strutils.unindent` it "Removes all indentation composed of whitespace from each line in s." (emphasis mine). Which means the string you get is not unindented but rather 'stripped on leading whitespaces', which is not the same thing.
There's also another `unindent`, namely: `proc unindent(s: string; count: Natural; padding: string = " "): string`, which works as intended when you pass the correct `count`:
import strutils
echo """
  try {
    auto tmp = (itr != itrEnd);
  } catch (const boost::wave::preprocess_exception& ex) {
    cerr << "ERROR in " << ex.file_name() << " : " << ex.line_no()
  """.unindent(2) # <----------- notice `2` here ------------------
produces:
try {
  auto tmp = (itr != itrEnd);
} catch (const boost::wave::preprocess_exception& ex) {
  cerr << "ERROR in " << ex.file_name() << " : " << ex.line_no()
Which requires you to specify the indentation in every single string literal that you write in the code. It is perfectly doable and quite easy, but now string literals look like `""" """.unindent(n)`, where `n` depends on the indentation of the current code.
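A small sketch of what "depends on the indentation of the current code" means in practice. The names `topLevel` and `nested` and the C-like snippet are made up for illustration. The same literal needs `unindent(2)` at top level and `unindent(4)` once it moves one block deeper, so the count has to be updated by hand whenever the surrounding code is re-indented.

```nim
import strutils

# Top level: the literal body is indented by 2 spaces, so count = 2.
let topLevel = """
  if (x) {
    run();
  }
  """.unindent(2)

proc nested(): string =
  # One block deeper: the same literal is now indented by 4 spaces,
  # so the count passed to unindent has to change to 4.
  result = """
    if (x) {
      run();
    }
    """.unindent(4)

# Both spellings are meant to produce the same text.
doAssert topLevel == nested()
```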
and how would that look?
herestring:
#[
othercode()
Remark: the only language I know of that treats `#[` as the start of a multiline comment is Nim.
Yes, of course it is possible to find innumerable edge cases to break every syntax highlighter possible, but the only two things that break everything down the road are unclosed strings and `#[` / `##[` comment pairs.
Since the majority voted against and there is no real feature benefit, only new syntax sugar, I think it is appropriate to close the RFC.
Thank you very much for your sportsmanship!
Proposal
Add unquoted indentation-based string literals with the following syntax:
`lexer.nim`, contained in ~60 lines of code.
Note: I'm of course open to suggestions and comments about the implementation - this is by no means a final implementation, I just added it to have something to show (since this RFC is purely syntax sugar).
Addresses some points from different RFCs such as #210, #161
#161 - fully covers this RFC albeit with different syntax. Instead of introducing special handling for the relatively cryptic char combination `":`, the string is introduced as `herestring:` - more verbose but easier to recognize.
#210 - allows writing unquoted code for the `{.emit.}` statement. It might not look as good as a solution that specifically addresses a particular backend, but syntax highlighting will work. Comment about possible implementation - instead of `'''`, the identifier `herestring` is used. GitHub syntax highlighter works relatively fine on the new syntax. Given that the sole reason for an unquoted herestring is to have some basic syntax highlighting/editor support (even though Nim is different from, say, C, it shares some keywords, so highlighting works (C++ code literal in example)), it is still better than having no highlighting at all.
Implementation
When the identifier `herestring` is found during lexing, replace it with a triple-quoted string literal token. All text with the current line's indentation + 2 will be cut out as a string.
Example of use
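The lexing rule described under Implementation could be sketched roughly as follows. This is only an illustration operating on a list of source lines, not the actual `lexer.nim` change: the proc name `extractHerestring`, the line-based input, keeping blank lines inside the block, and dropping trailing blank lines are all assumptions.

```nim
import strutils

proc extractHerestring(lines: seq[string], start: int): string =
  ## Hypothetical sketch, not the real lexer code.
  ## `lines[start]` is assumed to be the line containing `herestring:`.
  ## Following lines indented at least two spaces deeper than that line
  ## are collected with that prefix removed; the first shallower line
  ## ends the literal.
  var base = 0
  while base < lines[start].len and lines[start][base] == ' ':
    inc base
  let bodyIndent = base + 2
  var body: seq[string]
  for i in start + 1 ..< lines.len:
    let line = lines[i]
    if line.strip().len == 0:
      body.add("")                      # blank line inside the literal
    elif line.len > bodyIndent and line[0 ..< bodyIndent].strip().len == 0:
      body.add(line[bodyIndent .. ^1])  # cut off the block indentation
    else:
      break                             # dedent ends the literal
  # Assumption: trailing blank lines belong to the code after the block.
  while body.len > 0 and body[^1].len == 0:
    body.setLen(body.len - 1)
  result = body.join("\n")

let src = @[
  "herestring:",
  "  try {",
  "    auto tmp = (itr != itrEnd);",
  "  }",
  "othercode()"
]
echo extractHerestring(src, 0)
```

Whether the resulting literal should end with a newline is left open here; the sketch omits it.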
I implemented some tests for the initial implementation
String format
Just as a string literal
The Nim compiler test suite uses triple-quoted string literals to configure tests. This:
can be turned into this
which is not too much of a difference from a feature standpoint, but a lot of people will find the `herestring` version easier to work with + it looks better (this is of course subjective).
Emit statement