hvesalai / emacs-scala-mode

The definitive scala-mode for emacs
http://ensime.org
GNU General Public License v3.0

revisit the syntax regexes with scala sublime users #121

Closed fommil closed 6 years ago

fommil commented 7 years ago

@dickwall tells me that @djspiewak has been rewriting the Sublime syntax regexes to be super efficient and free of false negatives on idiomatic code. We might want to sync up at some point so that we get all the benefits. Daniel, what license are you using and where can we find your regexes?

fommil commented 7 years ago

There are also a few places where the regex-based matchers in emacs consistently fail, e.g. marking type parameters as constants instead of types. ENSIME overrides that, but it feels a bit heavyweight.
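As an illustration of the kind of failure described above, here is a minimal Python sketch. It is not how scala-mode or ENSIME actually work: the rule "capitalized identifiers are constants unless local context says otherwise" and the names `CAPITALIZED` and `classify_capitalized` are assumptions made up for this example.

```python
import re
from typing import List, Tuple

# Hypothetical starting rule: a capitalized identifier looks like a constant.
CAPITALIZED = re.compile(r"\b[A-Z][A-Za-z0-9_]*\b")


def classify_capitalized(line: str) -> List[Tuple[str, str]]:
    """Label each capitalized identifier on a single line.

    An identifier is labelled 'type' when it sits inside square brackets or
    directly follows a colon; otherwise it is labelled 'constant'. This is a
    crude approximation that ignores strings, comments and multi-line input.
    """
    results = []
    for match in CAPITALIZED.finditer(line):
        before = line[:match.start()]
        # Unclosed '[' before the identifier means we are in a type argument list.
        bracket_depth = before.count("[") - before.count("]")
        after_colon = before.rstrip().endswith(":")
        kind = "type" if bracket_depth > 0 or after_colon else "constant"
        results.append((match.group(), kind))
    return results
```

With only the naive rule, the `T` in `def first[T](xs: List[T]): T` would be coloured as a constant; the bracket and colon checks are the sort of local context that lets a regex-based matcher do better without a full parse.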

djspiewak commented 7 years ago

@fommil You can find them in the sublimehq/Packages repository. The license appears to be here, and it looks like a modified BSD, where by "modified" I mean "most of the clauses deleted". So it's basically "public domain".

The changes are all in the Scala/Scala.sublime-syntax file. Note that I have several outstanding PRs which continue to improve on the situation.

A lot of the benefits of my changes come down to the use (and abuse) of a few discrete features:

So I don't know how much of this is applicable to emacs or even ensime, but it's a thing. :-)

fommil commented 7 years ago

@djspiewak cool, thanks! In raw emacs mode there is definitely a concept of local context matching. As for ensime, it gets the semantic information from the AST, so no regexes are needed. When ensime is enabled it overrides the pattern matching from scala-mode, but you're right that it's sometimes useful to have fast regex matchers find the right semantics first, rather than waiting, and effectively have ensime confirm the colouring. And of course there are people who use emacs without ensime, so it's good for it all to match up as closely as possible. We're far from that goal right now.

djspiewak commented 7 years ago

@fommil Yeah, I've given some thought to that from a Sublime standpoint (specifically, what would be the optimal way for ENSIME to interact with the mode). I think that what should happen, ideally, is the semantic highlighting would refine the scopes. The most notable place where this would happen is applying the variable.function.scala scope to any tokens which correspond to function invocations, and maybe a meta.coercion.scala scope (or something more imaginative) to expressions which are implicitly converted. That sort of thing. Basically all of the syntactic things which are highlighted by Sublime (at this point) are accurate, though there are a couple places (e.g. where a lambda declaration is broken by a newline) where we underapproximate in unidiomatic usage.

But broadly speaking, the fact that we can just toss more scopes on what is already there (which color schemes can choose to highlight or just ignore) is very powerful, and it will eventually allow the semantic highlighting in Sublime ENSIME to be quite advanced and also gracefully and performantly fall back on the (now quite accurate) core mode.
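The idea of semantic highlighting only adding scopes on top of the syntactic ones can be sketched roughly as follows. This is a toy model in Python, not Sublime's or ENSIME's actual API: `Token`, `refine` and `pick_color` are hypothetical names, and real color schemes match scopes with selectors rather than a plain dictionary lookup.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Token:
    text: str
    # Scopes assigned by the fast, regex-based syntax pass.
    scopes: List[str] = field(default_factory=list)


def refine(tokens: List[Token], semantic_scopes: Dict[int, str]) -> List[Token]:
    """Add semantic scopes (keyed by token index) without removing any
    scope that the syntactic pass already assigned."""
    for index, scope in semantic_scopes.items():
        if scope not in tokens[index].scopes:
            tokens[index].scopes.append(scope)
    return tokens


def pick_color(token: Token, theme: Dict[str, str], default: str = "plain") -> str:
    """Return the color for the most recently added scope the theme knows.

    A theme that does not mention a semantic scope simply falls back to
    whatever the syntactic scopes give it.
    """
    for scope in reversed(token.scopes):
        if scope in theme:
            return theme[scope]
    return default
```

Because `refine` only appends, a theme that ignores `variable.function.scala` renders exactly what the core mode would have rendered, which is the graceful fallback described above.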

hvesalai commented 7 years ago

Are the regexes at all comparable at the moment? I.e. is this something that can actually be accomplished in finite time?

djspiewak commented 7 years ago

@hvesalai They're probably not directly comparable, but most of the basic stuff should be easily converted. More complex stuff like the type environment, lambda lookahead, etc. might not be easily achieved. Most of the basic stuff in the new Sublime mode is actually taken from the Scala Specification (e.g. the definitions of numeric literals, variables, etc.).
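To give a feel for what "taken from the Scala Specification" can look like for numeric literals, here is a minimal sketch in Python. It is not the regex from the Sublime package: it is a simplified reading of the Scala 2 lexical grammar that ignores digit separators and leading signs, and the names `INTEGER_LITERAL`, `FLOAT_LITERAL` and `classify` are made up for this example.

```python
import re

# Building blocks, loosely following the spec's digit / exponentPart / floatType.
DIGITS = r"[0-9]+"
EXPONENT = r"[eE][+-]?[0-9]+"
FLOAT_TYPE = r"[fFdD]"

# integerLiteral: hex numeral, a lone zero, or a non-zero-led decimal,
# with an optional long suffix.
INTEGER_LITERAL = re.compile(r"(?:0[xX][0-9a-fA-F]+|0|[1-9][0-9]*)[lL]?")

# floatingPointLiteral: the four alternatives from the grammar, in order.
FLOAT_LITERAL = re.compile(
    rf"(?:{DIGITS}\.{DIGITS}(?:{EXPONENT})?{FLOAT_TYPE}?"
    rf"|\.{DIGITS}(?:{EXPONENT})?{FLOAT_TYPE}?"
    rf"|{DIGITS}{EXPONENT}{FLOAT_TYPE}?"
    rf"|{DIGITS}(?:{EXPONENT})?{FLOAT_TYPE})"
)


def classify(token: str) -> str:
    """Classify a whole token as 'float', 'integer' or 'other'."""
    if FLOAT_LITERAL.fullmatch(token):
        return "float"
    if INTEGER_LITERAL.fullmatch(token):
        return "integer"
    return "other"
```

Patterns like these are plain regular expressions with no editor-specific features, which is why this part of a grammar should port between editors more easily than context-dependent rules such as lambda lookahead.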