Closed deathaxe closed 2 years ago
Your knowledge and help is greatly appreciated.
Thanks a lot for your time and putting everything in a PR! That's awesome!
Are we happy to merge this one?
Certainly, but let me check a bit nevertheless. :) I saw that the vertical order of the test lines is not consistent with the way I did it in every other test. We should change that to be consistent I think.
Yup. Was both getting your confirmation and making sure @deathaxe has not more to add at this stage. 😄
We should change that to be consistent I think.
I can change that back, if that's your style. Wasn't aware of it, tbh.
Would like to include some changes to Surface as well as
I am currently a bit lost at the question of how to correctly name those embedded stuff in various situations.
Main question is meta.embedded
vs. meta.interpolation
. This question is not limited to this PR but more general.
ASP/PHP use meta.embedded source.xyz.embedded.html
to express ASP/PHP sources within <?...?>
or <%...%>
sections.
During recent ST4 dev cycle we added several string interpolation features to improve highlighting and auto-completions behavior (ST disables auto-completion and various other stuff in strings). So we end up in something like meta.string meta.interpolation
for source code interpolated into strings. I am uncertain whether that's appropriate for those <%...%>
sections. Especially as it causes a mix of punctuation scopes.
Another question arises with a look at HEEx and Surface, which support those {...}
interpolations. Do they have the same meaning as <%...%>
sections but with different syntax, or is distinction needed scope-wise?
Finding a consistent solution for all syntaxes might take another roundtrip, IMHO.
Is something like <ta{@x}g />
valid in HEEx / Surface? I saw, <{@x}>
not to be in Suface, so I guess such interpolations must not appear in tag names at all?
Should be done, except noted aspects, I'd like to read some input from elixir specialists.
tag-other-name
contexts can be removed from HEEx/Surfacemeta.embedded
vs. meta.interpolation
might be worth some thoughts. Anyway, punctuation should be consistent with choosen meta scope. So punctuation.section.embedded
vs. punctuation.section.interpolation
.Is something like
<ta{@x}g />
valid in HEEx / Surface? I saw,<{@x}>
not to be in Suface, so I guess such interpolations must not appear in tag names at all?
I just checked with Surface. It considers all those characters to be part of the tag. So no interpolation. My guess is HEEx does the same. Btw, I tried something like ~F"<a {=@x}g />"
. It parses to ["<a", " g></a>"]
. The trailing g
becomes an independent attribute.
Cool! Will review everything. Thanks for making it clear with great commit messages and a clean history! There's also definitely some stuff to think about.
I didn't know that auto-completion doesn't work when it's embedded inside a string unless the scope is cleared. I never noticed anyway, because I configured ST to always show me suggestions in any scope...
Sure. I've mainly gone through formal stuff without deep knowedge about the syntax itself. It may still contain some bugs or I may have broken something.
Another major aspect with clearing string scope is correct highlighting of otherwise unmatched tokens in embedded source. They would be highlighted as string without clearing.
I just saw some tests like <{}}>{@x}</{}}>
. Are they really accepted as tags? According to whatwg a tag name must always start with an ascii letter, so if no interpolation is supported, they would be invalid, IMHO.
I just saw some tests like
<{}}>{@x}</{}}>
. Are they really accepted as tags? According to whatwg a tag name must always start with an ascii letter, so if no interpolation is supported, they would be invalid, IMHO.
Yep, they're "accepted". I think the parser just checks if everything is syntactically correct, not so much if it makes semantic sense.
Thanks for sorting the tests and removing the tag interpolations! Question regarding atomic groups: should we really remove them now because I think they're pretty useful and maybe ST regexps will support them in the future? I haven't written a regexp engine yet but atomic groups seems like a trivial feature to me. Besides, all the other syntax files use them too and I remember in some cases they're necessary...
Appears I missed to tweak some tests, again.
It's very likely ST's sregex engine won't support atomic groups anytime soon. A modern syntax should avoid anything which causes Syntax Tests - Regex Compatibility
command to generate any warnings to make sure to avoid Oniguruma to be triggered.
Those advanced features may have been needed in tmLanguage syntaxes and therefore still exist after those have been converted to sublime-syntax format, but with all the context magic at hand in sublime-syntax, patterns are often simple enough to not need all those sophisticated regex features. One reason to rewrite those converted syntaxes from scratch.
The recent commit adds some changes to illustrate atomic groups can easily be replaced by simpler patterns in various situations.
Hey, I feel a bit bad about it but I squashed most commits and re-based onto master. I consolidated (almost) all the commit messages. Hope that's alright!
I made only superficial content changes:
I pushed to deathaxe/proposals
. You can push force if you think it's good. :+1:
You don't need to feel bad at all. It's your project. I am here to offer some proposals, only.
The other way round I have mixed feelings as I introduced so many changes to the already very good syntax definitions. 😄
There are lots of great changes and we appreciate it! You managed to delete a lot of unneeded lines as well. I always love that about refactoring. :smile:
I saw there's a possible optimization in surface-tag-name-common-pop
and heex-tag-name-common-pop
:
- match: '[[:lower:]_]\w*[?!]?|(?!\.)'
scope: variable.function.surface
pop: 1
- match: \.
scope: punctuation.accessor.dot.surface
What do you think? Is it better to leave it be?
In which way is it an optimization? ST compiles those patterns into one optimized regexp anyway and immediately pops if none of the consuming patterns match. Can you prove this negative lookahead has any positive effect on performance or something like that? I always found negative lookaheads to slow things down, so try to avoid them. That's also what a benchmark against 5k lines of the following snippet reveals.
<MyApp.User.tag name={@name}>
Your solution is a bit slower.
The current solution requires 47ms on my box, while yours takes 50ms.
Cool, thanks for going to the trouble of benchmarking it! It was an assumption I made that we'd save some CPU cycles if we removed the immediate-pop
rule. I thought ST would have one fewer regexp/rule to check. I stand corrected. :+1:
Found something: we should probably change punctuation.definition.tag.begin.surface
to punctuation.section.embedded.begin.surface
(and end
respectively). Would be consistent with (H)EEx. Right?
I squashed your recent changes into the corresponding commits. I found something else and added two commits on top of our branch. I think we should make the Elixir code sections in Surface, e.g. {elixir_code}
, "embedded" instead of "interpolated". Seems consistent with the way it's in HEEx. WDYT?
Maybe those meta.embedded
vs. meta.interpolation
should be consistent between syntaxes. So changing it in Surface would result in HEEx to be updated as well? Maybe all such elixir code sections should be treated as embedded
while leaving interpolation
to normal string interpolation, if supported by the language.
The set
statement should be a push
as otherwise the closing }
would try to pop the main context off stack.
Maybe should think about the final scopes as well. Not sure if something like punctuation.section.embedded.heex
is desirable in a surface syntax.
I consider Surface to be sufficiently different from HEEx, so I don't think we should have heex
or eex
scopes in Surface. On the other hand, HEEx and EEx are very similar and it's good if they share as much as possible. Afaics that's the case now.
Maybe all such elixir code sections should be treated as embedded while leaving interpolation to normal string interpolation, if supported by the language.
Do you mean <tag attr={...}>
vs. <tag attr="xxx {...} xxx">
? Maybe the former should be an embedded section instead, right? I'm not sure. It seems fine to leave it be an interpolation.
Question regarding HTML comments in Surface, e.g.: <!-- This {comment} will be sent to the browser -->
Shouldn't the comment scope be cleared in the interpolation so that auto-completion works?
I don't think we should have heex or eex scopes in Surface
Agree, this also makes naming meta.embedded.<language>
easier as it looks odd to have heex
in a surface file.
Do you mean ...
Yes. The main question is, what's the difference between embedded
and interpolation
in general. Basically a template file such as php/jsp/asp/ex/heex/surface/.... is just a big string, which is processed by an interpreter. The embedded
code just expands the template string to an output format. That's exactly the definition of string interpolation as well. The point is from the view of the template processor there's no difference between the embedded
code appearing within a quoted string (attribute value) or somewhere else. So the question is why scope it different?
That's a quite general question, which may effect several syntaxes.
I've tried to summerize the issue in an RFC at https://github.com/sublimehq/Packages/issues/3051
Question regarding HTML comments in Surface
You are right, in general it would make sense to clear the comment scope. Unfortunatelly there's no meta.comment
at the moment (the counter part of meta.string
). On the other hand some syntaxes (e.g.: Brackets) use the comment scope to dimm highlighting - not consistent with string scopes, I know.
That's just a part which hasn't been discussed so far.
I've introduced meta.comment
in Perl to be able to clear comment scope in embedded code blocks, but that's the only syntax which makes use of it so far. Maybe worth another RFC?
Is this ready to go in? @deathaxe @azizk
@princemaple Not yet but almost! Was thinking about the meta-scope stuff. Will write again in a bit...
Hi @deathaxe and @azizk ,
I moved this repo into an org and sent you two invitations.
I enjoy coding in Elixir and don't see my self leaving Elixir anytime soon. However, you two obviously have more knowledge on Sublime syntax than I do, and use more Elixir features than I do. (I'm not currently onto Heex or Surface yet) It only makes sense to give you two more access to the repo.
That said, feel free to reject if you don't need another org/repo to worry about. 😄
Thanks for the great works.
Po
Thanks @princemaple! I accept, of course. In the near future I might suggest to José to take it under his elixir-lang
umbrella. What do you think?
In the near future I might suggest to José to take it under his
elixir-lang
umbrella
I mean, no objections from me, but it's probably not going to happen TBH. The language doesn't have to care about specific editors.
Yeah, let's see. It's worth a try. :)
So one last off-topic question: Will you continue handling the package.io releases, at least for some time?
https://github.com/elixir-editors this is where it will more likely end up at.
I'm happy to indefinitely maintain the project. Releasing a new version is as simple as tagging a commit with correct format. The point is, you (and deathaxe if accepted) will be able to merge PR at your own pace. And make release when you see fit. I'm not sure if collaborators on personal repos have these access.
Oh, yes, I actually meant elixir-editors
instead of elixir-lang
. Guess I was confused by tree-sitter-elixir
which was newly added to elixir-lang
.
Ok, thanks a lot @princemaple! I accepted the invitation. It's cool if you accept as well @deathaxe, but you don't have to be active really. Maybe you can function as a backup maintainer if you like. Of course, no problem if you decline. :+1:
I probably won't have much to add. I am busy enough with core syntaxes, MarkdownEditing and some other repos. You are welcome to ask questions and I can add some PRs, but I don't see me maintaining this repo anytime soon. I don't use or know Elixir at all. My motivation for this PR is to help with applicating core HTML syntax using inheritance.
Huh, that's very impressive. I thought you'd be using it somehow if you had been thinking about its syntax.
You are welcome to ask questions and I can add some PRs, but I don't see me maintaining this repo anytime soon.
Noted and thanks! Appreciate your work on core.
Totally understand, @deathaxe!
Alright, back to the PR:
I'm not sure about the distinctions between interpolation
and embedded
. I read your RFC ticket #3051. It's a great summary of the issue. It seems to be a little dilemma without an obvious solution. Maybe it's best to keep it simple and make everything embedded
when inside a template file and use interpolation
strictly for Python f-strings or Elixir string interpolations and such.
Regarding the meta.embedded.*
scopes: it's good to have different ones for (H)EEx and Surface, just the way you did it a few days ago. I would suggest not to complicate things and use source.elixir.embedded.html
in all template syntax files. It makes styling those sections easier and it reads better because you can say it's elixir
inside html
. I don't think we need to be that specific and say it's elixir
inside eex
or surface
.
Regarding meta.comment
: I will add it to the syntax file. I think it makes a lot of sense.
I will make some changes on my branch so we can compare and then finalize the PR. :+1:
... use
source.elixir.embedded.html
in all template syntax files ...
I can follow your arguments and would be fine to use that general scope.
The main reason for finalizing the scope with eex
or surface
is the main scope being html.[eex|surface]
. So basically eex
and surface
are specialized forms of a html template. As those embedded code blocks are introduced by this special variant those might deserve its final scope.
Another aspect not yet discussed is supporting embedded elixir code blocks in js/css. The current solution would allow to just include those contexts in an extended CSS/JS syntax. You can have a look at Java Server Pages in https://github.com/sublimehq/Packages/pull/2654 to learn how it would work.
Finalizing the embedded code's scope with HTML would mean to do so with css/js in those syntaxes for consistency reasons, which ends up in duplicating all those contexts. That's the approach which is taken at https://github.com/sublimehq/Packages/pull/2789.
TBH, I am uncertain about which one is the better solution. Both are correct.
If the source.elixir.embedded.html
scope is used, the related meta.embedded
and punctuation
scopes should probably be finalized with html
as well.
Oh, I see! There's a bit more to consider then. Let me think about it. :slightly_smiling_face:
Finally got around to working on this again and made good progress. I noticed that we made some fundamental syntax mistakes. Embedding code isn't supported for attribute names and inside string values like this:
<div {@some_attr}="value" style="{@value}" />
Besides that, I decided to simplify the scope naming issue and to just use source.embedded.elixir.html
in all related syntax files.
Q: In the Surface file in the tag-generic-attribute-value
context I couldn't figure out why it needs the else-pop
even though meta_prepend
is true
and in the HEEx file it works without it.
I think we're almost done. Maybe a final review round? :)
I couldn't figure out why it needs the
else-pop
My most recent local copy of what I've pushed upstream doesn't contain tag-generic-attribute-value
nor does it contain else-pop
in it.
It's something you must have added. Two mistakes:
else-pop
needs to be removed.tag-generic-attribute-value
must be replaced by tag-generic-attribute-value-content
If embedding elixir code in attributes is not supported the content is probably not needed to be extended at all?
Embedding code isn't supported for attribute names and inside string values
Does it apply to both HEEx and Surface?
My most recent local copy of what I've pushed upstream doesn't contain
tag-generic-attribute-value
nor does it containelse-pop
in it.
Ah, sorry, I didn't mention it. I force pushed to our deathaxe/proposals branch.
Does it apply to both HEEx and Surface?
Yes. So tag-generic-attribute-value-content
isn't needed in either syntax file. I checked the source code of Surface and tested HEEx in the console to make sure. I wonder why I thought this was supported. Maybe earlier versions had a bug?
I managed to fix the else-pop
issue. Just needed an elixir-embedded-pop
context.
Another issue I encountered is that attributes like id
, class
etc. are handled specially. I just made a simple rule to check whether an attribute name is followed by a ={
to know whether it's an embed value or not. The disadvantage is that the attributes don't get the special scope names.
FYI it's best not to force push a branch that multiple people are working on. Also prefer --force-with-lease
so you avoid overwriting someone else's work.
If the purpose was to organize the commits, it's best done at the very end when you want to merge it in. And it's safer to create a new branch on top of it before you squash.
I've always been using --force-with-lease
. :+1:
Yeah, the way I handled this here wasn't optimal. I should have pushed onto deathaxe's branch to keep the changes visible for some time.
But you make it impossible to track changes by this kind of abusing git. That's not the purpose on version control!
Please don't force push to my branch making it look broken by me!
I just made a simple rule to check whether an attribute name is followed by a ={ to know whether it's an embed value or not.
Why bothering if embedding is not supported in attribute values?
I checked the source code of Surface and tested HEEx in the console to make sure. I wonder why I thought this was supported.
To be sure to understand correctly I just checked https://surface-ui.org/template_syntax.
The very first example shows embedded elixir code in html tag attributes. So why do you think it is not supported?
The following is status quo of this branch:
Surface:
<% %>
embedding in attributes{ ... }
is highlighted in attributes and contentHEEx:
<% %>
embedding in attributes{ ... }
is highlighted only in attributesEEx:
<% %>
embedding in attributes{ ... }
not part of the syntaxBut you make it impossible to track changes by this kind of abusing git. That's not the purpose on version control!
I know, I know. It wasn't the best thing to do, but I don't think I abused it. I only ever worked on my branch. You can still compare changes by diffing your local branch with my remote branch, or by selecting the top commits in Sublime Merge.
Please don't force push to my branch making it look broken by me!
Never pushed let alone force pushed anything to your branch. I only did it on my branch because I thought we'd be finished soon and only a ff-merge would remain to be done. Next time I'll be more careful, create normal commits and discuss squashing beforehand when necessary. :)
Why bothering if embedding is not supported in attribute values?
Embedding is supported as values but not inside the strings. In my example above, the embedding is inside the string. That wasn't supported or isn't anymore.
<!-- Invalid: -->
<div {@some_attr}="value" style="{@value}" />
<!-- Valid: -->
<div {[class: "x"]} style={@style} />
The recent syntax changes look very good and have a nice and clean structure.
Here are some ideas and proposals for HTML (EEx).
What needs consideration is embedding code in strings/attribute values needs slightly different handling than embedded code in other locations of HTML. The reason is the string scope which needs to be cleared so embedded code gets correctly highlighted and auto completion is enabled. On the other hand we don't want to clear any scopes if code is embedded outside of strings as this would very likely clear the main scope of the syntax. That's not desirable.
To achieve that two types of "embedded" contexts are required. There's no real guideline at the moment, but we may probably call normally embedded contexts "embedded" and those in strings "interpolation". We might also use interpolation vs. string-interpolation or something along those lines.
I haven't had a look at the other syntaxes of this packages, but I personally find it a bit strange to call
<%
a tag. With regards to ASP/JSP/PHP I'd suggest to scope all of<%
and%>
punctuation.section.embedded
.Some more comments about this can be found in commit messages.
Disclaimer: You asked for comments ;-)