Open deathaxe opened 3 years ago
I would vote for the meta.string.regexp string.quoted.single style, partly because most color schemes already target string.quoted, so there's less combinatorial explosion of scopes for color schemes to target, and partly because I think specializing the meta scope just makes more sense here.
I'd really welcome a common scoping guideline for regexp strings.
I think the vast majority of color schemes target the more general string scope instead of only string.quoted, and for those which don't, I would classify potential problems from that as a color scheme issue, because string is one of the scopes in the recommended minimal scope coverage from the scope naming guidelines.
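To illustrate the point about general versus specific targeting, here is a minimal sketch of prefix-based scope matching. It is a simplified model, not Sublime Text's actual selector engine, and the helper name `scope_matches` as well as the PHP scope name are assumptions made for illustration only.

```python
def scope_matches(selector: str, scope: str) -> bool:
    """Return True if `selector` equals `scope` or is a dot-separated prefix of it.

    Simplified model of how a single-atom color scheme selector is assumed
    to match one scope name; real selectors support more operators.
    """
    return scope == selector or scope.startswith(selector + ".")


# One quoted string scope (hypothetical name) and one regexp literal scope.
scopes = ["string.quoted.single.php", "string.regexp.javascript"]

for selector in ("string", "string.quoted"):
    matched = [s for s in scopes if scope_matches(selector, s)]
    print(selector, "->", matched)
```

Under this model, a rule written for string colors both scopes, while a rule written only for string.quoted skips the regexp literal.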
In general, like @keith-hall, I would prefer the meta.string.regexp string.quoted scheme too, because it makes more sense to have the "regexp" property of a string separated from the quoting style. But I think it might cause noticeable problems for backwards compatibility. string.regexp is already explicitly specified in the scope naming guidelines as the scope name to use:
Regular expression literals should use:
string.regexp
And there are a lot of color schemes which target this scope. I just checked a few color schemes and could find that scope in all of them, for example Solarized, Dracula, Base16, Nord.
I found the following string.regexp scopes in this repo:
scope | comment |
---|---|
string.regexp.clojure | "pattern" |
string.regexp.groovy | /pattern/ |
string.regexp.javascript | /pattern/ |
string.regexp.ruby | /pattern/, {...} ... |
string.regexp.perl | /pattern/, m{pattern}, ... |
string.regexp.modr.sql | %r{ } |
string.regexp.sql | /pattern/ |
string.regexp.tcl | depends on command |
Finally, Clojure seems to be the only syntax affected with regard to string.quoted.double when we talk about possible highlighting changes.
PHP and Python don't use string.regexp to highlight regular expressions in quoted strings. They already give string.quoted precedence.
I am uncertain about all those custom Perl-style patterns at the moment. Perl knows about q/literal/ or s/pattern/. The syntax only scopes the tokens between / as string.unquoted vs. string.regexp, while q and s are functions.
Maybe something like this is the way to go for those kinds of string constructs without too heavy impact.
Finally, Clojure seems to be the only syntax affected with regard to string.quoted.double when we talk about possible highlighting changes.
Oh, I was under the assumption that one goal was to use a common scope for all regular expressions, regardless of whether they are delimited by e.g. r"pattern" or /pattern/. That would make it easier for color schemes to treat all regular expressions in a certain way. But if Clojure is the only syntax in question for a change (in addition to removing source.regexp in general), then I see no problem with adjusting string.regexp.clojure to string.quoted.double.clojure. That would mean using meta.string.regexp only when the regexp is not of the /pattern/ type, if I understand it correctly?
We might also want to distinguish whether the special characters/elements in a regexp are scoped, or not.
From personal experience with my color scheme, I tried to remove the usual string color in that case, to allow the regexp characters/elements to be highlighted like code. But if there is no special scoping within a regexp string, I'd like to keep the string highlighting color. IIRC, this was often possible via the source.regexp scope, but I could probably change it easily if meta.string.regexp is used instead.
The primary reason for this RFC is an attempt to properly support string interpolation in PHP the way we already do in other syntaxes, by clearing the string scope and keeping just meta.string, so interpolated variables/expressions are highlighted as code without special treatment by color schemes.
The main issue I am facing is those quoted regular expressions using meta.string string.quoted source.regexp, which means interpolation contexts must clear 2 scopes to get rid of string.
As a result we'd always need special interpolation contexts for regular expressions, because we only need to clear 1 scope in normal quoted strings.
While I already found a quite practical solution to achieve it without duplicating any interpolation pattern, the general idea just was to find a scope scheme which only uses two scopes for patterns as well, to avoid possible issues in other syntaxes.
Example of current approach:
$pattern = "/^text[0-9] ${interpolation}/m"
           ^^^^^^^^^^^^^ meta.string string.quoted source.regexp - meta.interpolation
                        ^^^^^^^^^^^^^^^^ meta.string meta.interpolation source.regexp - string
                                        ^^^ meta.string string.quoted source.regexp - meta.interpolation
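The difference between clearing one scope and clearing two can be sketched with a toy model of the scope stack. This is only an illustration under stated assumptions: `enter_interpolation` and `has_string_scope` are hypothetical helpers, the stacks are abbreviated, and the model treats clearing as dropping the innermost scopes before pushing meta.interpolation. It does not reproduce the re-pushing of source.regexp seen in the example above.

```python
def enter_interpolation(scope_stack: list, clear_scopes: int) -> list:
    """Drop the `clear_scopes` innermost scopes, then push meta.interpolation.

    Toy model of an interpolation context; not the real syntax engine.
    """
    kept = scope_stack[:len(scope_stack) - clear_scopes]
    return kept + ["meta.interpolation"]


def has_string_scope(scope_stack: list) -> bool:
    """True if any scope on the stack is `string` or starts with `string.`."""
    return any(s == "string" or s.startswith("string.") for s in scope_stack)


# Abbreviated stacks: a normal quoted string, the current regexp scoping,
# and a two-scope scheme as discussed in this thread.
plain = ["source.php", "meta.string", "string.quoted"]
current_regexp = ["source.php", "meta.string", "string.quoted", "source.regexp"]
two_scope_regexp = ["source.php", "meta.string.regexp", "string.quoted"]

for name, stack, n in [
    ("plain, clear 1", plain, 1),
    ("current regexp, clear 1", current_regexp, 1),
    ("current regexp, clear 2", current_regexp, 2),
    ("two-scope regexp, clear 1", two_scope_regexp, 1),
]:
    result = enter_interpolation(stack, n)
    print(f"{name}: {result} string left: {has_string_scope(result)}")
```

In this model, clearing a single scope is enough for the normal quoted string and for the two-scope regexp scheme, but leaves string.quoted on the stack for the current regexp scoping.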
If this ends up in a common "style guide" for all patterns, I'd be OK with that as well. I am not fixed on a certain approach/solution/idea.
We also have your issue https://github.com/sublimehq/Packages/issues/1942 about how to treat/scope regexp content itself.
I agree with the "desirable result" of being able to write simple color scheme rules, which apply well and equally to all syntaxes.
I just was not keen enough to put this large scope into this RFC in the first place.
Prelude
Several syntax definitions implement string interpolation by clearing the string scope.
The goal is to enable simple color schemes to properly highlight interpolated variables or expressions as well as embedded source code without special treatment.
Embedded syntaxes, such as JavaScript or CSS in HTML tag attributes look like:
Note: The string scope is cleared between quotation marks.
A quoted string with variable interpolation looks like:
Note: The string scope is cleared whenever a $... interpolation is consumed.
Quoted regular expression strings currently mix both use cases above. The whole string is scoped meta.string string.quoted source.regexp in Python and PHP, for instance.
The issue
The examples basically reveal two different kinds of issues:
string and source. The way scopes are stacked, the current interpolation scheme requires two scopes to be cleared.
Issue 1 can probably be solved by using meta.embedded vs. meta.interpolation.
Issue 2 probably requires some discussion about how to solve the problem.
Proposal
As source.regexp is only applied if regular expressions are implemented in external syntax definitions (see: Python, PHP, Perl) but not if they are part of the syntax itself (e.g.: Bash), a first step would probably be to find a common transparent scoping scheme for both of them.
The primary goal should/would be to apply the same "string-interpolation" contexts used for normal quoted strings to avoid duplicating numerous contexts.
This amounts to removing source.regexp.
It would enable normal interpolation contexts to be used, which currently clear 1 scope to remove string.
A new question arises though: How to scope regular expressions then?
Some syntaxes apply string.regexp to regular expressions, but quoted strings already use string.quoted.
Color schemes might want to treat regular expressions specially, as they require code highlighting like normal source code.
We could use meta.string string.regexp.quoted or meta.string.regexp string.quoted.
Any ideas?
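To compare the two candidates from a color scheme's point of view, the following sketch checks which selectors would match each scope stack. It assumes a simplified descendant-style matching model; `atom_matches`, `selector_matches`, `coverage`, and the selector list are hypothetical names chosen for illustration, and language suffixes are omitted.

```python
def atom_matches(atom: str, scope: str) -> bool:
    """True if `atom` equals `scope` or is a dot-separated prefix of it."""
    return scope == atom or scope.startswith(atom + ".")


def selector_matches(selector: str, scope_stack: list) -> bool:
    """Match space-separated selector atoms, in order, against the stack.

    Simplified model: each atom must match some scope that comes after the
    scope matched by the previous atom.
    """
    pos = 0
    for atom in selector.split():
        while pos < len(scope_stack) and not atom_matches(atom, scope_stack[pos]):
            pos += 1
        if pos == len(scope_stack):
            return False
        pos += 1
    return True


# The two candidate schemes from the proposal.
CANDIDATES = {
    "A": ["meta.string", "string.regexp.quoted"],
    "B": ["meta.string.regexp", "string.quoted"],
}

# Selectors a color scheme might plausibly contain (illustrative list).
SELECTORS = [
    "string",
    "string.quoted",
    "string.regexp",
    "meta.string.regexp",
    "meta.string.regexp string.quoted",
]


def coverage() -> dict:
    """Map each candidate name to the selectors that match its stack."""
    return {
        name: [sel for sel in SELECTORS if selector_matches(sel, stack)]
        for name, stack in CANDIDATES.items()
    }


for name, matched in coverage().items():
    print(name, "->", matched)
```

Under this model, existing string.regexp rules keep matching only candidate A, while existing string.quoted rules match only candidate B. A rule for the general string scope matches both.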