Open EdAyers opened 2 years ago
@EdAyers Thanks for creating this issue. I love it!
In Mathlib 3, there are tactics such as hint
and library_search
that generate a list of suggestions (of the form Try this: ...
) in the info view. Does it make sense to integrate the "Try this" feature with this one? For example, we could show hints by default in the info view as a form of "default suggestions" for a goal.
In Lean 3, we had support for interactive holes. They were suggested by @david-christiansen, and he actually implemented the Emacs support for this feature. This feature never became popular due to Lean 3 design limitations, and some of these motivations made us start the Lean 4 project. However, I still think "interactive holes" are a great idea. The idea is to click in a hole and get suggestions. Does it make sense to integrate "interactive holes" into your RFC? I would love to have a unified framework for providing suggestions to users.
Thanks again for all your work. It is amazing.
Nice proposal! It's great to see progress on suggestions.
Can you please explain how the location is computed (that the client passes to queryContextualSuggestions
)? This part seems to be completely missing.
How should long-running tactics and cancellations be managed? Suggestion providers should return fast, but the tactics they are suggesting might be long-running. You don't want these triggering and consuming a lot of compute every time the user moves their mouse around.
I agree, it is very distracting if every hover in the infoview has a CPU-intensive side-effect (and ten seconds later blows up the popup with fifty suggestions). Can we put this behind a "suggest" button? We can also add an extra "suggest" button for the whole tactic state.
There's also the question what happens when one provider is fast and another one takes forever. Do we wait for the last one, timeouts?
UX-wise, I'm a bit afraid that we'll get too many suggestions. Almost every tactic can be turned into a suggestion (and will probably be if this proposal works well). If every tactic has multiple suggestions, you quickly get dozens of suggestions. Very useful generic suggestions like "give me a conv state here" would even be a suggestion for every location.
Maybe a two-step approach, where we first show a (very quickly computable) list of at most one suggestion per provider, and then pick the the concrete suggestion in the second step. E.g., first click on rw-suggest
and then a new popup opens where you can pick rw [add_zero]
.
Rather than insert being a string, perhaps have it be a set of text-edit instructions?
Absolutely. We should just reuse whatever LSP uses for completions / code actions.
And display
should probably be a CodeWithInfos
.
loc : (string | number)[];
The proposal doesn't specify how loc
is computed.
structure SuggestionQuery where
pos : Lean.Lsp.TextDocumentPositionParams
loc : Json
It's probably best to have a separate TacticStateLocation
type that encapsulates the whole location business. I don't think there's any need for the position here.
@[suggestion_provider]
I think it would be better to call this @[tacticSuggestionProvider]
.
@gebner
Can you please explain how the location is computed (that the client passes to
queryContextualSuggestions
)? This part seems to be completely missing.
Added to RFC in location details section.
Maybe a two-step approach,
I think this is probably the way to go. I was hoping it would be possible to get all the information in one shot for a better UX, but this is undermined if it takes too long. Rather than wait for all suggestion-providers to complete perhaps they could each be performed in a separate task and incrementally shown as they resolve.
Absolutely. We should just reuse whatever LSP uses for completions / code actions.
I'll look in to this.
structure SuggestionQuery where pos : Lean.Lsp.TextDocumentPositionParams loc : Json
It's probably best to have a separate
TacticStateLocation
type that encapsulates the whole location business. I don't think there's any need for the position here.
I kept the position because the RPC handler needs to recover the Elab.Info
containing the tactic state from the position. It can be bundled though.
I think it would be better to call this
@[tacticSuggestionProvider]
.
Yes some other contenders:@[contextualSuggestionProvider]
or @[locationBasedSuggestionProvider]
or @[interactiveSuggestionProvider]
. There needs to be something distinguishing it from the 'try_this' and hole-based suggestions is my only worry.
@leodemoura
Does it make sense to integrate the "Try this" feature with this one? For example, we could show hints by default in the info view as a form of "default suggestions" for a goal.
I think there are some shared functionality. I will have a look at how try-this and the lightbulbs work. It is a slightly different feature because the suggestion is also dependent on the particular location in the infoview that the user clicked on.
Does it make sense to integrate "interactive holes" into your RFC? I would love to have a unified framework for providing suggestions to users.
Yes @Vtec234 and I have chatted a bit about this. I am also very keen for interactive holes, particularly when turbocharged by syntax-aware document editing by the lean server. I think that a lot of code between this system and interactive holes can be shared.
Can you please explain how the location is computed (that the client passes to
queryContextualSuggestions
)? This part seems to be completely missing.Added to RFC in location details section.
Thanks! So these are four cases and not an arbitrary-length list. In that case I think it would be better to make an inductive for it.
Now I have a new question: how do you specify the tactic state to the queryContextualSuggestions
? Is it "whatever getInteractiveGoals
returns for the position"? There can be more than one tactic state at a position. Tactic states can also be nested in (trace) messages. I'm not expecting the text insertion to be 100% reliable here, but it would nice if the suggestions worked in all tactic states.
I kept the position because the RPC handler needs to recover the
Elab.Info
containing the tactic state from the position. It can be bundled though.
But the position is only relevant for the RPC handler, not the suggestion provider. That's why I suggested to remove it from the provider.
Does it make sense to integrate the "Try this" feature with this one?
Maybe it's helpful to categorize the various kinds of useful code insertion features that we want to have:
squeeze_simp [...]
.library_search
.by show_term { exact ... }
.by library_search
.In Lean 3, "try this" provided (1), and hole commands (triggered by _
) provided (4). This proposal adds (2).
Some of these kinds can be simulated by others, with varying amounts of awkwardness. For example if you have (1), then you get (2) by manually inserting a placeholder tactic first (i.e., you write library_search
and then replace it by the suggestion, instead of clicking on suggest like in this proposal). If you have (1), you also get (4) by writing by library_search
, and (3) by writing a bit more.
However you can't get (1) from (2); so this proposal is not a replacement for "try this". And neither of them give you (5).
UX-wise (1),(3),(5) should also be available as code actions ("light bulbs"). For (2) the UI in this proposal looks great. For (4) I don't think you can get around writing some placeholder, but it would be great if you didn't need the by
as in Lean 3.
But as Ed says, there's certainly a lot of code that can be shared between these code insertion features.
For (4), could the code actions be triggered by an _
that could not be inferred?
Thanks! So these are four cases and not an arbitrary-length list. In that case I think it would be better to make an inductive for it.
Yeah it was originally an inductive but I kept changing my mind about the cases, I'll make it an inductive.
For (4), could the code actions be triggered by an _ that could not be inferred?
Yes, that's how it worked in Lean 3 and that seems like a good approach for Lean 4 as well. Theoretically (4) could also be done like autocompletion (without inserting anything).
Now I have a new question: how do you specify the tactic state to the
queryContextualSuggestions
? Is it "whatevergetInteractiveGoals
returns for the position"? There can be more than one tactic state at a position. Tactic states can also be nested in (trace) messages. I'm not expecting the text insertion to be 100% reliable here, but it would nice if the suggestions worked in all tactic states. ... But the position is only relevant for the RPC handler, not the suggestion provider. That's why I suggested to remove it from the provider.
I think maybe I have not found the correct way of producing a tactic state from an RPC handler. The methods I wrote to do this are here. The way I found was to use the pos to find the snapshot, and then just hope that there is only one ofTacticState at that position (which I think is a bad assumption). If you don't provide the pos in the rpc args I don't know how to get a tactic state in the RPC handler. Perhaps there is a way of passing the tactic state by reference to the infoview?
Perhaps there is a way of passing the tactic state by reference to the infoview?
It is possible to send the InfoWithCtx
storing an .ofTacticInfo
. However if we use the same code that is used in getInteractiveGoals
, by virtue of being deterministic it is guaranteed to retrieve the same goal state as is being displayed by the infoview.
If every tactic has multiple suggestions, you quickly get dozens of suggestions. Very useful generic suggestions like "give me a conv state here" would even be a suggestion for every location.
It might be useful to distinguish tactics that are always useful and ones which are only sometimes so. conv
is a good example of an always-useful one. UI-wise, I thought it might even make sense to add a 🔍 ("zoom") icon to every tooltip where it applies that would generate the conv
command. Others such as rw
or intros
would not make sense with an always-visible icon and are better served by lazily displaying suggestions.
This sounds really cool!
I don't feel like I have enough knowledge of the guts of Lean to contribute too much here, but I do think there's a few areas related to the above discussion that I could draw your attention to.
First off, while GHC and Idris use something like an unsolved _
to represent holes, Agda's notion of a hole is more useful. In Agda, a hole has delimiters on both ends ({!
and !}
IIRC), so users can write things in them. This enables a lot of useful commands - for instance, the editor can diff the hole's expected type against the type found for what the user has current written in it, or the contents can be used as the basis for a "refine" command that inserts fresh holes in as many argument positions as necessary to make things work, or autocomplete can be used to write things that are clearly not yet well-typed in the current context. This was the kind of thing that we were trying to make user-extensible back in Lean 3 during my visit to MSR. A really nice thing about Agda-style delimited holes is that they explicitly indicate to the language implementation that a piece of code is not yet done, and they explicitly delimit the boundary between the completed and presumed-correct parts and the parts that are undergoing exploratory work, allowing tools to use this information.
Since Agda was built, Cyrus Omar (@cyrus-) and his collaborators have also done the work to figure out reasonable operational semantics for holes, so it becomes possible to run a program with holes in it and use the resulting value as input into the process of figuring out what belongs in them. In their work, the value of a hole is a closure that gives the values of all free variables when that specific hole was evaluated. From an NbE perspective, I'm not sure how to quote those back to syntax, but I suspect that something elegant could be done.
Secondly, Joomy Korkut's M.Sc. thesis is related work for anyone wanting to add metaprogrammable code actions: http://cattheory.com/editTimeTacticsDraft.pdf (we have a TyDe paper based on it here: http://cattheory.com/extensibleTypeDirectedEditing.pdf ).
Finally, Lean has a really nice system of quoted syntax with appropriate hygiene. Why would the insertion be a string, or a collection of text editing instructions, when it could be syntax? As a Lean metaprogrammer, I think in terms of syntax at all times. I can think of two challenges. First, there's no standard formatter for Lean, so the formatting of the inserted syntax might be bogus. Secondly, variable capture would have to be handled in a smart way, but I think that the hygiene info could be used to fix up binder names automatically, which would be a huge win.
BTW, the current SubExpr.Pos
JSON encoding won't roundtrip through the vscode LSP library. (Since the default javascript JSON parser converts every number to a float.) From what I can tell it will break for expressions that are nested more than 26 levels deep.
@gebner
the current SubExpr.Pos JSON encoding won't roundtrip through the vscode
that's a good spot, I'm currently in the process of writing a library for SubExpr.Pos
, I can change this so that toJson
is a string?
@david-christiansen
I really really like these references.
Having {! !}
or similar in Lean 4 would be awesome.
Re: syntax vs inserting text. Yes the goal is absolutely to do it in terms of syntax, and then write a function diff : Syntax → Syntax → TextEdit
.
The main thing that I've got from the conversation so far is that this RFC should be dependent on a more general RFC for how syntax-aware code-actions should work generally in Lean 4. In particular there seems to be a lot of potential to do something really cool like in the citations that David referenced.
I mean, there's always the issue with a project like this where the discussion of the giant dream gets in the way of building useful stuff in the here and now. On the one hand, a coherent vision helps guide designs so that they work together, but on the other hand, requiring a coherent vision can lead to a painful pile of bikeshedding that blocks progress. I don't know where to put the balance here, but I trust you all to do so :)
Agda's holes go back to Epigram's notion of the "shed", by the way - in case you want to do a full search for related work. See "The View from the Left" by McBride and McKinna (two versions here: http://strictlypositive.org/publications.html) for context on Epigram, and https://cs.ru.nl/~freek/courses/tt-2010/tvftl/epigram-notes.pdf for a better idea of what it was like.
I tend to see Epigram as something like an artistic manifesto rather than a useful proof/programming tool, but it's influence has been wide. Lean has made practical many of the things that were presented in Epigram, and it would be cool to see more!
I'm dubious about {! !}
, because we had that in lean 3 and it essentially didn't get used at all. I think it needs a proper retrospective to figure out what the issues were and how we can solve them in lean 4.
I was hoping that some uses of {! !}
could be replaced by code actions on _
that prompt for the additional data (which may at first sound like something that only feels at home in VS Code, but Emacs is full of prompts as well), but it doesn't look like LSP will support anything like that anytime soon: https://github.com/microsoft/language-server-protocol/issues/1164
However, I found the following remark from that thread quite interesting:
Quite a few of the refactors can be handled this way. For example, IntelliJ recently introduces in-place "change signature" (refactor which allows you to change types, names, order and quantity of function's parameters, and updates all the call sites): https://blog.jetbrains.com/idea/2020/02/intellij-idea-2020-1-eap4/.
The basic idea is that we just allow the user to change the signature of a method in an IDE, without invoking refactor explicitly, and then supply a code action to fix the call-sites.
The same works for Java's analogue of move between files -- if you select some code, cut it in one file, and paste it into the other, the IDE prompts you with yes/now dialog for updating call-sites.
While that may not cover quite the same use cases David had in mind, this one at least we could reasonably support without further protocol extensions. We'd "just" have to store and diff syntax trees in the server and produce appropriate code actions from that.
@digama0
I'm dubious about
{! !}
, because we had that in lean 3 and it essentially didn't get used at all. I think it needs a proper retrospective to figure out what the issues were and how we can solve them in lean 4.
The {! !}
was underdeveloped in Lean 3, and it exposed important limitations in its design: no Syntax
object, and pretty-printer only works for fully elaborated terms. I recall we have tried to add a hole action that would insert a match
expression. This was quite messy because
We can't express match
expressions using kernel terms.
The Lean 3 parser uses pre-terms (which are just the kernel expr
s with a bunch of annotations). Syntax such as match
expressions is encoded in ad-hoc ways, and we can't use the pretty printer on them.
We thought about using strings, but this is also a horrible solution.
The frustrations with the hole-actions in Lean 3 contributed to the decision of starting Lean 4.
In Lean 4, we now have a much more flexible design, Syntax
objects, parenthesizers/formatters, etc.
For what it's worth, rust-analyzer
has a much more efficient way of introduce match
compared to typing {! x !}
:
https://rust-analyzer.github.io/manual.html#magic-completions
I want that.
For what it's worth,
rust-analyzer
has a much more efficient way of introducematch
compared to typing{! x !}
:
Here are some previous discussions we've had on abusing autocompletion:
I'm decidedly not a fan of having autocomplete insert syntax snippets. This feature has annoyed me greatly since I first encountered it years (decades?) ago in IntelliJ.
To offer a more constructive suggestion: we could offer a code action on match x with
that inserts the missing cases. (As a bonus, this also helps with refactoring when you add a new case to an inductive type.)
I don't think it's a good idea to invent new nonfunctional syntax just to access code actions. I already know match x with
, why should I learn a new syntax like x.match
or {! x !}
?
Yeah, we should definitely have it on with
, and then maybe an opt-in for magic completions :)
Sorry to continue the sidetracking of the thread, @EdAyers.
I don't see holes as primarily being about quickly getting completion to make it faster to do things like building match expressions - for that, things like @Kha's demo or even good old yasnippet are fine. Instead, I see them as something I can use to delimit the border between "code I believe in" and "code that I don't yet believe in". Then, the system knows about this border, and can offer help (e.g. diffing the types, suggesting transformations, providing a todo list, etc).
The {! ... !}
syntax is a bit annoying to type, though. In Agda, ?
left behind in a buffer gets transformed by the system into {! !}
, which works well for that language. I suspect something similar could work in Lean, though without the "explicitly ask for type checking" step that Agda makes users do. Perhaps just as an abbreviation in editors that's similar to the various Unicode sequences. Maybe expand \?
or something. A code action to turn an expression with a type error into a filled hole would be useful as well, so one could swap 3 + "three"
for 3 + {! "three" !}
with a quick click and then get Lean to say "This hole is supposed to be a Nat, but it's got a String in it". This is, of course, more interesting when the types get more interesting.
Holes are really a tool for programming, rather than for proving. I see the fundamental difference as being that, when proving, I care that I get an inhabitant of a type, whereas when programming, I care which inhabitant I end up with. Tactics are great for the former, but for the latter, they're pretty risky. I find holes to be very useful for programming even in GHC, where they're a bit less developed than in Idris or Agda.
Circling back to the original question of this thread: Lean is pretty great today at explaining incomplete proof states, and providing more help in them is great! It would be neat if similar things worked for incomplete program states, but the two approaches seem at least somewhat orthogonal to me.
RFC: contextual suggestions
People who are probably interested: @KaseQuark @Kha @javra @Vtec234 @leodemoura @gebner
Links
Goal
The purpose of contextual suggestions is to add some interactivity to the infoview's tactic state. The idea is that if you click on a particular subexpression in the target type of a goal, Lean should suggest rewrites or simplification tactics or conv tactics that could be applied at that subexpression. Similarly if the user clicks on a hypothesis, and the hypothesis is an inductive datatype, Lean should suggest
cases
orinduction
on that type. Another, more ambitious example is for lean to suggest lemmas that could beapply
ed depending on the subexpression you clicked on according to some heuristics.A potential extension to this feature to consider in the future is where Lean will also return the interactive goal state after the suggestion has been applied. Perhaps rendered by the infoview as a diff between the current goal state and the new one.
Proposal
In order to develop this, the tactic state infoview javascript code needs a way of telling the Lean server where the user clicked in the interactive tactic state.
Workflow
5
on the main goal's target.Lean.Widget.queryContextualSuggestions
RPC method with the argument being[goal.mvarid, "type", 5]
, representing the location that the user clicked.SuggestionProvider
s (registered using an attribute@[suggestion_provider]
). The proposed definition ofSuggestionProvider
is given below.Suggestion
object and then runs them on the tactic state, producing a new set ofInteractiveGoals
that can be used to make a diff for the user.Lean.Widget.queryContextualSuggestions
with the set of succeeding suggestions.diff : Syntax → Syntax → TextEdit
that determines a more precise edit (eg so you could add new rewrite lemmas to )Potential suggestion providers
Congruence tactics
If you click on a subexpression, what are the rewrite rules in mathlib that can be applied at that point? This should be fairly efficient to calculate because the rewrite lemmas can be stored in a discrimination tree, and we can restrict to rewrites that only hit the head expression.
In general any tactic that works inside
conv
will be available with this method. @KaseQuark is working on this.Other, more heavy suggestions could involve
simp
,ring
,norm_num
, e-matching etc. In this case, performance would become an issue and we will need to start thinking about concurrency and having cancellable versions of these tactic.other tactics
I think that the suggestions api has some potential in teaching new users which tactics to use. Most of these could be implemented as just a flat list of suggestions for the goal, but clicking in the target or hypothesis that is relevent for the tactic can help narrow the list of suggestions that the new user has to choose from.
For example,
if they click on the goal and it is a pi-binder, it can suggest
intros
. The number of items that are intro'd could depend on exactly where they click in the expression.cases
,rcases
clicking on an inductive datatypeapply
, clicking on a hypothesis that matches with the goal, clicking on a subexpression of the goal and doing a discrimination tree search of lemmas that match with the subexpression the user clicked on. (perhaps with some heuristics to keep the results list small)rfl
if the goal can be solved with rfl.Specialised use cases
Since suggestions are added with the
@[suggestion_provider]
attribute. Users can add their own special suggestions based on new, future tactics. Eg perhaps one for zooming in on a subexpression in an inequality proof using monotonicity lemmas.Detail on locations
To specify a position in the interactive goal, rather than having a special-purpose type I propse I having an ad-hoc approach, where a location is represented as an array of numbers and strings. This has the advantage of keeping the types passed through the RPC as simple as possible, with the disadvantage being that we lose some type safety.
In the current prototype the location has the following format:
[mvarid, "type", subexprPos]
[mvarid, "hyps", fvarids[0], "type"]
(the hypothesis bundle may have multiple fvars in it, in which case we just assume the suggestion is for the first fvar in the bundle)[mvarid, "hyps", fvarids[0], "type", subexprPos]
[mvarid, "hyps", fvarids[0], "value", subexprPos]
[mvarid, "hyps", fvarid]
These are stored as a React context (eg see here and consumed here) that is passed to various parts of the interactive goal view UI. So the computed location is decided entirely by vscode. Another approach would be to have the location passed directly from Lean server as an identifier for each
InteractiveGoal
andInteractiveHypothesisBundle
object.Tasks
insert
being a string, perhaps have it be a set of text-edit instructions?