EdAyers commented 2 years ago

RFC: contextual suggestions

People who are probably interested: @KaseQuark @Kha @javra @Vtec234 @leodemoura @gebner

Links

Goal

The purpose of contextual suggestions is to add some interactivity to the infoview's tactic state. The idea is that if you click on a particular subexpression in the target type of a goal, Lean should suggest rewrites or simplification tactics or conv tactics that could be applied at that subexpression. Similarly if the user clicks on a hypothesis, and the hypothesis is an inductive datatype, Lean should suggest cases or induction on that type. Another, more ambitious example is for lean to suggest lemmas that could be applyed depending on the subexpression you clicked on according to some heuristics.

A potential extension to this feature to consider in the future is where Lean will also return the interactive goal state after the suggestion has been applied. Perhaps rendered by the infoview as a diff between the current goal state and the new one.

Proposal

In order to develop this, the tactic state infoview javascript code needs a way of telling the Lean server where the user clicked in the interactive tactic state.

Workflow

User clicks in a tactic script
Infoview renders the tactic state as usual
User clicks somewhere within the interactive goals, let's say at position 5 on the main goal's target.
A popup appears above the subexpression, showing the type and the explicit arguments and maybe a docstring. So far this is normal operation.
New bit: the type popup React component also calls the Lean.Widget.queryContextualSuggestions RPC method with the argument being [goal.mvarid, "type", 5], representing the location that the user clicked.
Now Lean takes this information and runs a set of SuggestionProviders (registered using an attribute @[suggestion_provider]). The proposed definition of SuggestionProvider is given below.
The Lean server runs each suggestion provider with the tactic state set to be the tactic state used to render the interactive goals.
(optional) the Lean server then elaborates the tactics in the Suggestion object and then runs them on the tactic state, producing a new set of InteractiveGoals that can be used to make a diff for the user.
The server returns the Lean.Widget.queryContextualSuggestions with the set of succeeding suggestions.
The infoview then appends these suggestions to the type popup
(optional) hovering over a suggestion will show the diff / new goal state
clicking a suggestion will insert the suggestion in to the document. In the prototype this works by simply inserting a new line, however it might be interesting to have a function diff : Syntax → Syntax → TextEdit that determines a more precise edit (eg so you could add new rewrite lemmas to )

structure SuggestionQuery where
  pos : Lean.Lsp.TextDocumentPositionParams
  loc : Json

structure Suggestion where
  /-- What the user sees -/
  description : String
  /-- What should be inserted -/
  insert : Syntax

SuggestionProvider : SuggestionQuery → TacticM (List Suggestion)

Potential suggestion providers

Congruence tactics

If you click on a subexpression, what are the rewrite rules in mathlib that can be applied at that point? This should be fairly efficient to calculate because the rewrite lemmas can be stored in a discrimination tree, and we can restrict to rewrites that only hit the head expression.

In general any tactic that works inside conv will be available with this method. @KaseQuark is working on this.

Other, more heavy suggestions could involve simp, ring, norm_num, e-matching etc. In this case, performance would become an issue and we will need to start thinking about concurrency and having cancellable versions of these tactic.

other tactics

I think that the suggestions api has some potential in teaching new users which tactics to use. Most of these could be implemented as just a flat list of suggestions for the goal, but clicking in the target or hypothesis that is relevent for the tactic can help narrow the list of suggestions that the new user has to choose from.

For example,

if they click on the goal and it is a pi-binder, it can suggest intros. The number of items that are intro'd could depend on exactly where they click in the expression.
cases, rcases clicking on an inductive datatype
apply, clicking on a hypothesis that matches with the goal, clicking on a subexpression of the goal and doing a discrimination tree search of lemmas that match with the subexpression the user clicked on. (perhaps with some heuristics to keep the results list small)
rfl if the goal can be solved with rfl.

Specialised use cases

Since suggestions are added with the @[suggestion_provider] attribute. Users can add their own special suggestions based on new, future tactics. Eg perhaps one for zooming in on a subexpression in an inequality proof using monotonicity lemmas.

Detail on locations

To specify a position in the interactive goal, rather than having a special-purpose type I propse I having an ad-hoc approach, where a location is represented as an array of numbers and strings. This has the advantage of keeping the types passed through the RPC as simple as possible, with the disadvantage being that we lose some type safety.

interface ContextualSuggestionQueryRequest {
    pos : TextDocumentPositionParams;
    loc : (string | number)[];
}

interface Suggestion {
    /** The text to show to the user in the suggestion viewer. */
    display : string
    /** The text that should be inserted on the above line. */
    insert : string
}

interface ContextualSuggestionQueryResponse{
    completions: Suggestion[]
}

In the current prototype the location has the following format:

goal type [mvarid, "type", subexprPos]
hypothesis type [mvarid, "hyps", fvarids[0], "type"] (the hypothesis bundle may have multiple fvars in it, in which case we just assume the suggestion is for the first fvar in the bundle)
hypothesis type [mvarid, "hyps", fvarids[0], "type", subexprPos]
hypothesis value [mvarid, "hyps", fvarids[0], "value", subexprPos]
hypothesis name [mvarid, "hyps", fvarid]

These are stored as a React context (eg see here and consumed here) that is passed to various parts of the interactive goal view UI. So the computed location is decided entirely by vscode. Another approach would be to have the location passed directly from Lean server as an identifier for each InteractiveGoal and InteractiveHypothesisBundle object.

Tasks

[ ] Rather than insert being a string, perhaps have it be a set of text-edit instructions?
[ ] How should long-running tactics and cancellations be managed? Suggestion providers should return fast, but the tactics they are suggesting might be long-running. You don't want these triggering and consuming a lot of compute every time the user moves their mouse around.
[ ] Is too clunky?

leodemoura commented 2 years ago

@EdAyers Thanks for creating this issue. I love it!

In Mathlib 3, there are tactics such as hint and library_search that generate a list of suggestions (of the form Try this: ...) in the info view. Does it make sense to integrate the "Try this" feature with this one? For example, we could show hints by default in the info view as a form of "default suggestions" for a goal.

In Lean 3, we had support for interactive holes. They were suggested by @david-christiansen, and he actually implemented the Emacs support for this feature. This feature never became popular due to Lean 3 design limitations, and some of these motivations made us start the Lean 4 project. However, I still think "interactive holes" are a great idea. The idea is to click in a hole and get suggestions. Does it make sense to integrate "interactive holes" into your RFC? I would love to have a unified framework for providing suggestions to users.

Thanks again for all your work. It is amazing.

gebner commented 2 years ago

Nice proposal! It's great to see progress on suggestions.

Can you please explain how the location is computed (that the client passes to queryContextualSuggestions)? This part seems to be completely missing.

How should long-running tactics and cancellations be managed? Suggestion providers should return fast, but the tactics they are suggesting might be long-running. You don't want these triggering and consuming a lot of compute every time the user moves their mouse around.

I agree, it is very distracting if every hover in the infoview has a CPU-intensive side-effect (and ten seconds later blows up the popup with fifty suggestions). Can we put this behind a "suggest" button? We can also add an extra "suggest" button for the whole tactic state.

There's also the question what happens when one provider is fast and another one takes forever. Do we wait for the last one, timeouts?

UX-wise, I'm a bit afraid that we'll get too many suggestions. Almost every tactic can be turned into a suggestion (and will probably be if this proposal works well). If every tactic has multiple suggestions, you quickly get dozens of suggestions. Very useful generic suggestions like "give me a conv state here" would even be a suggestion for every location.

Maybe a two-step approach, where we first show a (very quickly computable) list of at most one suggestion per provider, and then pick the the concrete suggestion in the second step. E.g., first click on rw-suggest and then a new popup opens where you can pick rw [add_zero].

Rather than insert being a string, perhaps have it be a set of text-edit instructions?

Absolutely. We should just reuse whatever LSP uses for completions / code actions.

And display should probably be a CodeWithInfos.

loc : (string | number)[];

The proposal doesn't specify how loc is computed.

structure SuggestionQuery where
  pos : Lean.Lsp.TextDocumentPositionParams
  loc : Json

It's probably best to have a separate TacticStateLocation type that encapsulates the whole location business. I don't think there's any need for the position here.

@[suggestion_provider]

I think it would be better to call this @[tacticSuggestionProvider].

EdAyers commented 2 years ago

@gebner

Can you please explain how the location is computed (that the client passes to queryContextualSuggestions)? This part seems to be completely missing.

Added to RFC in location details section.

Maybe a two-step approach,

I think this is probably the way to go. I was hoping it would be possible to get all the information in one shot for a better UX, but this is undermined if it takes too long. Rather than wait for all suggestion-providers to complete perhaps they could each be performed in a separate task and incrementally shown as they resolve.

Absolutely. We should just reuse whatever LSP uses for completions / code actions.

I'll look in to this.

structure SuggestionQuery where
  pos : Lean.Lsp.TextDocumentPositionParams
  loc : Json
It's probably best to have a separate TacticStateLocation type that encapsulates the whole location business. I don't think there's any need for the position here.

I kept the position because the RPC handler needs to recover the Elab.Info containing the tactic state from the position. It can be bundled though.

I think it would be better to call this @[tacticSuggestionProvider].

Yes some other contenders:@[contextualSuggestionProvider] or @[locationBasedSuggestionProvider] or @[interactiveSuggestionProvider]. There needs to be something distinguishing it from the 'try_this' and hole-based suggestions is my only worry.

EdAyers commented 2 years ago

@leodemoura

Does it make sense to integrate the "Try this" feature with this one? For example, we could show hints by default in the info view as a form of "default suggestions" for a goal.

I think there are some shared functionality. I will have a look at how try-this and the lightbulbs work. It is a slightly different feature because the suggestion is also dependent on the particular location in the infoview that the user clicked on.

Does it make sense to integrate "interactive holes" into your RFC? I would love to have a unified framework for providing suggestions to users.

Yes @Vtec234 and I have chatted a bit about this. I am also very keen for interactive holes, particularly when turbocharged by syntax-aware document editing by the lean server. I think that a lot of code between this system and interactive holes can be shared.

gebner commented 2 years ago

Can you please explain how the location is computed (that the client passes to queryContextualSuggestions)? This part seems to be completely missing.

Added to RFC in location details section.

Thanks! So these are four cases and not an arbitrary-length list. In that case I think it would be better to make an inductive for it.

Now I have a new question: how do you specify the tactic state to the queryContextualSuggestions? Is it "whatever getInteractiveGoals returns for the position"? There can be more than one tactic state at a position. Tactic states can also be nested in (trace) messages. I'm not expecting the text insertion to be 100% reliable here, but it would nice if the suggestions worked in all tactic states.

I kept the position because the RPC handler needs to recover the Elab.Info containing the tactic state from the position. It can be bundled though.

But the position is only relevant for the RPC handler, not the suggestion provider. That's why I suggested to remove it from the provider.

gebner commented 2 years ago

Does it make sense to integrate the "Try this" feature with this one?

Maybe it's helpful to categorize the various kinds of useful code insertion features that we want to have:

Replace a tactic by another tactic. For example: squeeze_simp [...].
Insert a tactic. For example: library_search.
Replace an expression by another expression. For example: by show_term { exact ... }.
Insert an expression. For example: insert structure instance stub, by library_search.
Replace (non-expression) syntax by another syntax. For example: unfold macro.

In Lean 3, "try this" provided (1), and hole commands (triggered by _) provided (4). This proposal adds (2).

Some of these kinds can be simulated by others, with varying amounts of awkwardness. For example if you have (1), then you get (2) by manually inserting a placeholder tactic first (i.e., you write library_search and then replace it by the suggestion, instead of clicking on suggest like in this proposal). If you have (1), you also get (4) by writing by library_search, and (3) by writing a bit more.

However you can't get (1) from (2); so this proposal is not a replacement for "try this". And neither of them give you (5).

UX-wise (1),(3),(5) should also be available as code actions ("light bulbs"). For (2) the UI in this proposal looks great. For (4) I don't think you can get around writing some placeholder, but it would be great if you didn't need the by as in Lean 3.

But as Ed says, there's certainly a lot of code that can be shared between these code insertion features.

EdAyers commented 2 years ago

For (4), could the code actions be triggered by an _ that could not be inferred?

EdAyers commented 2 years ago

Thanks! So these are four cases and not an arbitrary-length list. In that case I think it would be better to make an inductive for it.

Yeah it was originally an inductive but I kept changing my mind about the cases, I'll make it an inductive.

gebner commented 2 years ago

For (4), could the code actions be triggered by an _ that could not be inferred?

Yes, that's how it worked in Lean 3 and that seems like a good approach for Lean 4 as well. Theoretically (4) could also be done like autocompletion (without inserting anything).

EdAyers commented 2 years ago

Now I have a new question: how do you specify the tactic state to the queryContextualSuggestions? Is it "whatever getInteractiveGoals returns for the position"? There can be more than one tactic state at a position. Tactic states can also be nested in (trace) messages. I'm not expecting the text insertion to be 100% reliable here, but it would nice if the suggestions worked in all tactic states. ... But the position is only relevant for the RPC handler, not the suggestion provider. That's why I suggested to remove it from the provider.

I think maybe I have not found the correct way of producing a tactic state from an RPC handler. The methods I wrote to do this are here. The way I found was to use the pos to find the snapshot, and then just hope that there is only one ofTacticState at that position (which I think is a bad assumption). If you don't provide the pos in the rpc args I don't know how to get a tactic state in the RPC handler. Perhaps there is a way of passing the tactic state by reference to the infoview?

Vtec234 commented 2 years ago

Perhaps there is a way of passing the tactic state by reference to the infoview?

It is possible to send the InfoWithCtx storing an .ofTacticInfo. However if we use the same code that is used in getInteractiveGoals, by virtue of being deterministic it is guaranteed to retrieve the same goal state as is being displayed by the infoview.

If every tactic has multiple suggestions, you quickly get dozens of suggestions. Very useful generic suggestions like "give me a conv state here" would even be a suggestion for every location.

It might be useful to distinguish tactics that are always useful and ones which are only sometimes so. conv is a good example of an always-useful one. UI-wise, I thought it might even make sense to add a 🔍 ("zoom") icon to every tooltip where it applies that would generate the conv command. Others such as rw or intros would not make sense with an always-visible icon and are better served by lazily displaying suggestions.

david-christiansen commented 2 years ago

This sounds really cool!

I don't feel like I have enough knowledge of the guts of Lean to contribute too much here, but I do think there's a few areas related to the above discussion that I could draw your attention to.

First off, while GHC and Idris use something like an unsolved _ to represent holes, Agda's notion of a hole is more useful. In Agda, a hole has delimiters on both ends ({! and !} IIRC), so users can write things in them. This enables a lot of useful commands - for instance, the editor can diff the hole's expected type against the type found for what the user has current written in it, or the contents can be used as the basis for a "refine" command that inserts fresh holes in as many argument positions as necessary to make things work, or autocomplete can be used to write things that are clearly not yet well-typed in the current context. This was the kind of thing that we were trying to make user-extensible back in Lean 3 during my visit to MSR. A really nice thing about Agda-style delimited holes is that they explicitly indicate to the language implementation that a piece of code is not yet done, and they explicitly delimit the boundary between the completed and presumed-correct parts and the parts that are undergoing exploratory work, allowing tools to use this information.

Since Agda was built, Cyrus Omar (@cyrus-) and his collaborators have also done the work to figure out reasonable operational semantics for holes, so it becomes possible to run a program with holes in it and use the resulting value as input into the process of figuring out what belongs in them. In their work, the value of a hole is a closure that gives the values of all free variables when that specific hole was evaluated. From an NbE perspective, I'm not sure how to quote those back to syntax, but I suspect that something elegant could be done.

Secondly, Joomy Korkut's M.Sc. thesis is related work for anyone wanting to add metaprogrammable code actions: http://cattheory.com/editTimeTacticsDraft.pdf (we have a TyDe paper based on it here: http://cattheory.com/extensibleTypeDirectedEditing.pdf ).

Finally, Lean has a really nice system of quoted syntax with appropriate hygiene. Why would the insertion be a string, or a collection of text editing instructions, when it could be syntax? As a Lean metaprogrammer, I think in terms of syntax at all times. I can think of two challenges. First, there's no standard formatter for Lean, so the formatting of the inserted syntax might be bogus. Secondly, variable capture would have to be handled in a smart way, but I think that the hygiene info could be used to fix up binder names automatically, which would be a huge win.

gebner commented 2 years ago

BTW, the current SubExpr.Pos JSON encoding won't roundtrip through the vscode LSP library. (Since the default javascript JSON parser converts every number to a float.) From what I can tell it will break for expressions that are nested more than 26 levels deep.

EdAyers commented 2 years ago

@gebner

the current SubExpr.Pos JSON encoding won't roundtrip through the vscode

that's a good spot, I'm currently in the process of writing a library for SubExpr.Pos, I can change this so that toJson is a string?

PR: https://github.com/leanprover/lean4/pull/1208

EdAyers commented 2 years ago

@david-christiansen I really really like these references. Having {! !} or similar in Lean 4 would be awesome.

Re: syntax vs inserting text. Yes the goal is absolutely to do it in terms of syntax, and then write a function diff : Syntax → Syntax → TextEdit.

The main thing that I've got from the conversation so far is that this RFC should be dependent on a more general RFC for how syntax-aware code-actions should work generally in Lean 4. In particular there seems to be a lot of potential to do something really cool like in the citations that David referenced.

david-christiansen commented 2 years ago

I mean, there's always the issue with a project like this where the discussion of the giant dream gets in the way of building useful stuff in the here and now. On the one hand, a coherent vision helps guide designs so that they work together, but on the other hand, requiring a coherent vision can lead to a painful pile of bikeshedding that blocks progress. I don't know where to put the balance here, but I trust you all to do so :)

david-christiansen commented 2 years ago

Agda's holes go back to Epigram's notion of the "shed", by the way - in case you want to do a full search for related work. See "The View from the Left" by McBride and McKinna (two versions here: http://strictlypositive.org/publications.html) for context on Epigram, and https://cs.ru.nl/~freek/courses/tt-2010/tvftl/epigram-notes.pdf for a better idea of what it was like.

I tend to see Epigram as something like an artistic manifesto rather than a useful proof/programming tool, but it's influence has been wide. Lean has made practical many of the things that were presented in Epigram, and it would be cool to see more!

digama0 commented 2 years ago

I'm dubious about {! !}, because we had that in lean 3 and it essentially didn't get used at all. I think it needs a proper retrospective to figure out what the issues were and how we can solve them in lean 4.

Kha commented 2 years ago

I was hoping that some uses of {! !} could be replaced by code actions on _ that prompt for the additional data (which may at first sound like something that only feels at home in VS Code, but Emacs is full of prompts as well), but it doesn't look like LSP will support anything like that anytime soon: https://github.com/microsoft/language-server-protocol/issues/1164

However, I found the following remark from that thread quite interesting:

Quite a few of the refactors can be handled this way. For example, IntelliJ recently introduces in-place "change signature" (refactor which allows you to change types, names, order and quantity of function's parameters, and updates all the call sites): https://blog.jetbrains.com/idea/2020/02/intellij-idea-2020-1-eap4/.

The basic idea is that we just allow the user to change the signature of a method in an IDE, without invoking refactor explicitly, and then supply a code action to fix the call-sites.

The same works for Java's analogue of move between files -- if you select some code, cut it in one file, and paste it into the other, the IDE prompts you with yes/now dialog for updating call-sites.

While that may not cover quite the same use cases David had in mind, this one at least we could reasonably support without further protocol extensions. We'd "just" have to store and diff syntax trees in the server and produce appropriate code actions from that.

leodemoura commented 2 years ago

@digama0

I'm dubious about {! !}, because we had that in lean 3 and it essentially didn't get used at all. I think it needs a proper retrospective to figure out what the issues were and how we can solve them in lean 4.

The {! !} was underdeveloped in Lean 3, and it exposed important limitations in its design: no Syntax object, and pretty-printer only works for fully elaborated terms. I recall we have tried to add a hole action that would insert a match expression. This was quite messy because

We can't express match expressions using kernel terms.
The Lean 3 parser uses pre-terms (which are just the kernel exprs with a bunch of annotations). Syntax such as match expressions is encoded in ad-hoc ways, and we can't use the pretty printer on them.
We thought about using strings, but this is also a horrible solution.

The frustrations with the hole-actions in Lean 3 contributed to the decision of starting Lean 4. In Lean 4, we now have a much more flexible design, Syntax objects, parenthesizers/formatters, etc.

Kha commented 2 years ago

For what it's worth, rust-analyzer has a much more efficient way of introduce match compared to typing {! x !}: https://rust-analyzer.github.io/manual.html#magic-completions I want that.

gebner commented 2 years ago

For what it's worth, rust-analyzer has a much more efficient way of introduce match compared to typing {! x !}:

Here are some previous discussions we've had on abusing autocompletion:

I'm decidedly not a fan of having autocomplete insert syntax snippets. This feature has annoyed me greatly since I first encountered it years (decades?) ago in IntelliJ.

To offer a more constructive suggestion: we could offer a code action on match x with that inserts the missing cases. (As a bonus, this also helps with refactoring when you add a new case to an inductive type.)

I don't think it's a good idea to invent new nonfunctional syntax just to access code actions. I already know match x with, why should I learn a new syntax like x.match or {! x !}?

Kha commented 2 years ago

Yeah, we should definitely have it on with, and then maybe an opt-in for magic completions :)

david-christiansen commented 2 years ago

Sorry to continue the sidetracking of the thread, @EdAyers.

I don't see holes as primarily being about quickly getting completion to make it faster to do things like building match expressions - for that, things like @Kha's demo or even good old yasnippet are fine. Instead, I see them as something I can use to delimit the border between "code I believe in" and "code that I don't yet believe in". Then, the system knows about this border, and can offer help (e.g. diffing the types, suggesting transformations, providing a todo list, etc).

The {! ... !} syntax is a bit annoying to type, though. In Agda, ? left behind in a buffer gets transformed by the system into {! !}, which works well for that language. I suspect something similar could work in Lean, though without the "explicitly ask for type checking" step that Agda makes users do. Perhaps just as an abbreviation in editors that's similar to the various Unicode sequences. Maybe expand \? or something. A code action to turn an expression with a type error into a filled hole would be useful as well, so one could swap 3 + "three" for 3 + {! "three" !} with a quick click and then get Lean to say "This hole is supposed to be a Nat, but it's got a String in it". This is, of course, more interesting when the types get more interesting.

Holes are really a tool for programming, rather than for proving. I see the fundamental difference as being that, when proving, I care that I get an inhabitant of a type, whereas when programming, I care which inhabitant I end up with. Tactics are great for the former, but for the latter, they're pretty risky. I find holes to be very useful for programming even in GHC, where they're a bit less developed than in Idris or Agda.

Circling back to the original question of this thread: Lean is pretty great today at explaining incomplete proof states, and providing more help in them is great! It would be neat if similar things worked for incomplete program states, but the two approaches seem at least somewhat orthogonal to me.

leanprover / lean4