wooorm / lowlight

Virtual syntax highlighting for virtual DOMs and non-HTML things
MIT License
746 stars 26 forks source link

Handle subscopes differently/better? #43

Closed joshgoebel closed 2 years ago

joshgoebel commented 2 years ago

Just saw that you're flattening scopes... why not preserve them?

IE: title.function becomes just "title" and "function" separately, which isn't really the same thing at all.

It seems like you're purposely losing data... and I'm not sure the benefit since you don't have to care about CSS or anything...

wooorm commented 2 years ago

Can you clarify what you mean? What actual output should be what expected output? Presumably you mean some change here? https://github.com/wooorm/lowlight/blob/1eb528641806841debb14d25d2c622e773cebe3c/lib/core.js#L267 Do you mean that CSS themes are broken?

joshgoebel commented 2 years ago

I thought you didn't generate HTML? [bit confused] But yes, they would be unless you're rewriting the CSS yourself, because the CSS handles subscopes differently.

https://highlightjs.readthedocs.io/en/latest/css-classes-reference.html#a-note-on-scopes-with-sub-scopes

See how we expand names:

https://github.com/highlightjs/highlight.js/blob/main/src/lib/html_renderer.js#L30

IE title.class.other =>

.hljs-title.class_.other__ {
   color: blue;
}

But for your AST output I'm not sure why you would want to transform the scope name at all, vs using it verbatim.

wooorm commented 2 years ago

ASTs represent code. This project generates ASTs that represent HTML. So that means the output of this project can be serialized as HTML. The tests here sometimes do that.

Thanks for pointing me towards this highlight.js feature.

joshgoebel commented 2 years ago

Added right before version 11:

https://github.com/highlightjs/highlight.js/commit/51806aa677f86786142a3db93a5c98fd02116c9f

What is the reason that those classes don’t get a prefix (hljs-)?

The _ suffix naming makes them less likely to conflict, plus our CSS always scopes them under a hljs top-level class... so the prefix isn't necessary because it's a given from the way the CSS rules should be written. I suppose someone could write a custom rule that only targets class_ (rather than hljs-title class_), but if they did [and if there was a conflict] I would say that is an unsupported edge case.

What is the benefit of adding underscores to them, instead of letting CSS combine classes:

There is a longer thread somewhere with me and Prism maintainer talking about pros and cons of different approaches. In a perfect world we could style "title" with CSS and then also "title.class" while title.class inherited from title... but you can't do this with CSS without separate rules, so we have a title rule and a class rule. But a "title.class" is NOT necessarily the same as a "class.title" so the sequencing matters. Hence coming up with the _ naming strategy.

This strategy is backwards compatible with custom themes also (that just won't get the nuance of subscopes). (vs something that rewrote say title.class.inherited => title_class_inherited, etc. (plus no easy way to do inheritance there)

So a title.class.inherited becomes.

This way the class is applied but the ordering is ALSO taken into account.

joshgoebel commented 2 years ago

This project generates ASTs that represent HTML. So that means the output of this project can be serialized as HTML.

I believe the correct behavior is to generate data with an (unaltered) scope key not class... and someone converting to HTML would be responsible to get the classes right. (or you could always provide both). Since we switched to scope from className (with version 11) abstract AST nodes do not have a class, they have a scope.

As far as what is a breaking change for you or backwards compatible for your users, I'll leave that to you.

wooorm commented 2 years ago

I believe the correct behavior is to generate data with an (unaltered) scope key not class.

lowlight generates hast: an HTML AST. It doesn’t generate a highlight.js AST. It can only do what HTML supports (class). Not what highlight.js supports (scope).

wooorm commented 2 years ago

Thanks for raising this and explaining how it works, I fixed it!

I will add that I find it odd that prefixes are not supplied to some classes and that I don’t see a practical reason for underscores as suffixes, because I don‘t see a practical reason to differentiate between .title.function and .function.title on a span.

joshgoebel commented 2 years ago

I don‘t see a practical reason to differentiate between .title.function and .function.title on a span.

As I think I mentioned not a great example... sub-scopes are meant to be hierarchal (exactly like TextMate scopes) and hence in the abstract a.b.c is NOT at all the same scope (at all) as c.b.a - and possibly should be highlighted entirely differently... so the system was designed to correctly preserve this hierarchal nature even down into the CSS layer.

No one's reported any real issues with the approach yet, other than that it's perhaps a bit unfamiliar looking. :-)

Glad you got it fixed though!

wooorm commented 2 years ago

As I think I mentioned not a great example.

I attempted to match that with my practical vs. theoretical; I would appreciate hearing of a case where it is important.

joshgoebel commented 2 years ago

Since 3rd party grammars can technically use whatever scope names they desire it seems more than theoretical and I didn't want a disclaimer explaining that you can't have both a.b.c and c.b.a without things breaking badly. We also already have real conflicts with 3rd party themes (or custom themes) that use the old style function and class scopes vs the newer title.function and title.class... these are not always even semantically the same things.

Sometimes class and function were used to refer to the entire class definition or entire function body... applying the class scope coloring to title.class would be entirely incorrect in those cases. Hence needing a way to differentiate top-level class from the more nuanced (and potentially entirely different semantic) title.class.

This is also partly related to semantic versioning and my desiring to solve this problem completely the first time rather than find out in the middle of the v11 cycle that there are cases where it breaks badly and then needing a breaking change to fix things properly. As you know it's also fairly simple for someone to hack into the parser if they TRULY need a simpler output and are confident that the CSS scope overlapping problem is one they are -guaranteed- to be safe from.

That said, I haven't heard from anyone having any real issues with this system at all since it's release. :-)

wooorm commented 2 years ago

Since 3rd party grammars can technically use whatever scope names [...]

Given that highlight.js contains the grammars that most folks will use, and that it contains the themes that most folks will use, presuming that there were no conflicts in these themes (?), I would probably be fine being like: yo this is a theoretical limitation that in esoteric cases could become a thing, in which case, change the names to be more specific 🤷‍♂️ Instead of having to document/explain to theme makers that class.title has nothing to do with title.class!

That said, I haven't heard from anyone having any real issues with this system at all since it's release.

That seems to align with my thesis that nobody is running into this, practically 😅


fwiw, I totally understand not wanting to break semver, erring on the safe side, and historical facts of life!

joshgoebel commented 2 years ago

Given that highlight.js contains the grammars that most folks will use,

I don't really accept that premise [that 3rd parties can be ignored] as more and more 3rd party grammars are added over time.

presuming that there were no conflicts in these themes (?),

Except there are, some first party grammars still use function and class (so we support both old and new forms)... fixing all ~200 grammars at once (with new rules) is simply too much effort - so they are fixed as they are touched.

That seems to align with my thesis that nobody is running into this,

Well, no one would run into it today because I solved it. My point was no one seems to really have any issues with the new system either.

Instead of having to document/explain to theme makers

Scopes are fundamentally different than CSS classes, so the documentation is worth it IMHO.


I would probably be fine being like: yo

Yes, I definitely see your point, it's just not the way we went.

wooorm commented 2 years ago

btw I just saw this, which discusses a very similar class problem, though you might be interested

joshgoebel commented 2 years ago

Thanks, but I think using/abusing attribute selectors in such a way is worse than what we decided to already go with. :)