Open Kiiyya opened 1 year ago
(This is awesome to discuss as far as I'm personally concerned, thanks for pushing it forward)
Will throw in an additional consideration which probably seems related -- perhaps some semantic highlighting of the checkpoint
tactic is useful? There's this old Zulip thread that I was keeping an eye on with a comment from Leo saying:
In the future, it would be great to have UI support to mark the part we are working on.
Perhaps signaling such a thing as part of the semantic highlighting (by e.g. differentiating between tactics which are above the checkpoint) is also something worth thinking about.
To throw out another brainstorming idea, perhaps some Lean API that allows someone writing a tactic (or some other more general piece of syntax) a way of tweaking how it will be semantically highlighted will be useful as well.
Hey Max, just wanted to say we're aware of the RFC and appreciate your work! I believe we definitely want to land some version of this. Right now we're not actively working on the LSP and don't have the bandwidth to make design decisions on all those important points you mentioned above, but I hope we will be able to get back to this soon.
I'm happy about the positive reception! I understand there is limited capacity, and am a little busy recently as well. But I am happy to work on this (and potentially other Lean tooling) eventually :).
I think the biggest decision to be made here is whether we want highlighting to be user-extensible. If yes, then all the above considerations just become a matter of converging to some sensible defaults. I refer to this Zulip comment to further highlight (pun intended) the usefulness of user-extensible highlight rules. And I haven't looked into highlighting in the infoview at all yet.
Admittedly I do not fully understand Julian's remarks, as I haven't used checkpoint
before.
In the future, it would be great to have UI support to mark the part we are working on.
I mean, the UI already exists - that's what the margin bar is for.
But I am happy to work on this (and potentially other Lean tooling) eventually :).
Awesome!
I think the biggest decision to be made here is whether we want highlighting to be user-extensible. If yes, then all the above considerations just become a matter of converging to some sensible defaults.
Yes, I consider this RFC to be about the default style. Extensibility is good but could come later. The need to register new kinds in both the server and client/theme makes it a bit more onerous than other extension points.
Improved Semantic Highlighting RFC
The current semantic highlighting doesn't do much:
x.y
are just based a syntactic rule (line 443).TermInfo
node in theInfoTree
and check whether identifiers are variables, which get thevariable
semantic token type.And that's about it.
But Lean is uniquely equipped to do some amazing semantic highlighting, as all the information is already in the
InfoTree
, and you can go intoMetaM
using the same context that terms have been elaborated with. After a bit of playing around, I have arrived at this (very unpolished) proof of concept, just to give an idea of what is possible:I had already created a thread on Zulip, and while it got lots of positive reactions, I didn't get a clear answer on how to proceed, so I hope I'm not doing anything wrong by opening an RFC. The contribution guide says an RFC is required before a PR. This would be my first contribution to Lean itself.
Considerations
Which tokens should get which token type and which token modifiers? Which of the following features should be implemented or not, and which are the priorities?
Four Major Kinds of Tokens
I am not sure which kinds of tokens are the most useful from a UX perspective to distinguish, but so far I have settled on a distinction between types and propositions, and then values vs proofs. This results in four major token kinds, which are determined using
Meta.inferType
twice:expr : ... -> Type _
are types (.type
token modifier).expr : ... -> Prop
are predicates (.prop
).expr : ... : ... -> Type
are values (.value
).expr : ... : ... -> Prop
are theorems (.proof
).Namespaces and Dot Notation
Currently,
Lean.Expr
gets highlighted as one token. I think it would be better ifLean.
was highlighted as a namespace, andExpr
as type. If the "namespace" is actually a type, for example inPerson.name
, thenPerson
should get highlighted as a type, and.name
as the function.I think that given
p : Person
,p.name
should havename
highlighted as a field, not a function. ButPerson.name
should be highlighted as a (projection) function, which it is. Implementing this would be non-trivial, probably requiring a fold over thex1
,x2
, ...,xn
inx1.x2. ... .xn
. It's probably not worth the effort, but would be cool.We often write
myList.length
. In this case, should.length
be highlighted as a function, sinceList.length : List α -> Nat
, or as a non-function, sinceList.length xs : Nat
? I think the latter makes more sense, but would be (much) harder to implement.Tactics
How should tactics get highlighted?
Punctuation
So far only identifiers get highlighted, but Lean is quite notation-heavy. How should the
∀
and,
in∀x, 1 = 1
get highlighted? What about->
,=>
, etc? What about the.
in(. + 2)
vs the.
in(Vec . 3)
?Custom Notation and User Extensibility
Existing notation for some types such as
=
,∀
,∃
,×
etc. do not get highlighted at all currently. I would expect them to get highlighted the same color asEq
,Prod
, etc. I suppose it would be possible to simply hardcode sufficiently many special cases, but that feels wrong.I think it could be useful to be able to highlight tokens based on nearly arbitrary (user-defineable?) rules. This would address the custom notation problem, as well as allow the following: For example, while working on a proof about real numbers, it might be useful to highlight variables
x
for whichx < 0
exists in the context. Or perhaps a typeA
occurs frequently in our theory and means something important, and we want to highlight that, for example highlighting all tasksx : Task Nat
. I imagine this would come in handy when screenshotting code for a presentation or giving a demo, where you want to highlight certain things.I have not done any work on this front, and the extra complexity might not be worth it, or at least not currently.
How to handle
abbrev
?What should
abbrev Vec3f : Type := Vec Nat 3
be highlighted as? I would say the same color asVec
, but without the.func
modifier. What shouldabbrev Vec3 (α : Type) : Type := Vec α 3
be highlighted as? What shouldabbrev Vec3 {α : Type} : Type := Vec α 3
be highlighted as?Universes?
Currently
Type
and0
inType 0
get highlighted differently. I think it would make sense to highlight both andmax
,imax
with the same token kind.Numbers, Strings
It might be possible, due to
OfNat
or coercion or notation, to have0
refer to a function or a type instead of the value. In that case,0
should get highlighted accordingly.Strings currently are only syntactically highlighted using textmate scopes, which causes the
.
ins!"hello {user.name}"
to be highlighted as a string. Semantic token highlight has higher priority than textmate scopes, so we would probably need to make sure that every character inside{..}
gets a semantic token. I have zero experience with textmate scopes, so I'm not sure if it can be fixed there.Fork Default VSCode Theme?
The experience with existing themes should be "good enough", re-using existing LSP token types and modifiers. However, Lean is just different enough that we quickly hit limits. There will be new token types and new token modifiers.
I think the ideal solution is to offer a fork of the VSCode Dark+ and the VSCode Light+ themes and fine-tune them for Lean. This way, when e.g. giving a tutorial on Lean, you can point new users to the theme, which is a simple one-click install, and which can serve as a starting point for users wishing to customize it.
Alternatively, it would be rather easy to create a small token customization generator where users can just tweak some few base colors to make it fit into their existing theme.
The Implementation so far
So far, the declaration kind (whether it's a definition, constructor, inductive type, theorem, recursor, etc.) determines the token type, and its role (whether it's a type, predicate, proof, value, etc.) determines the token modifiers. If we want to re-use existing token types and modifiers as much as possible, it might make sense to introduce a simple internal-only token kinds (?) enum, and then let tokens have a set of these token kinds, which then get mapped to LSP token types and LSP token modifiers. Just for the sake of code cleanliness, as I've found it be a little awkward having to decide whether something is a token type (there can be only one per token!) or a token modifier.
Handling of auxDecls
When we encounter an fvar that is an auxDecl, we check the
Environment
(of the next command) for a constant namedcurrentNamespace ++ fvar.userName
. This feels very hacky, I wonder whether there's a better way available.How LSP Semantic Tokens work
The LSP allows text ranges to have exactly one token type, and any amount of token modifiers, written
tokentype.mod₁.mod₂.mod₃
, which can be easily customized in your VSCodesettings.json
. Token modifiers are encoded as a bit-field, which presumably implicitly imposes some limit (32 modifiers? 64?). For example,variable.declaration.proof
. It is possible and easy to introduce new token types and modifiers, no need to touch the VSCode extension. Just in case some readers don't know.TODOs
where
seem to break currently.option
modifier toname
inset_option name
. Similar forimport X
,open X
.