Closed siddhartha-gadgil closed 1 year ago
The relevant pull request is https://github.com/leanprover/lean4/pull/1661
I missed the fact that https://github.com/leanprover/lean4/pull/1661 has been merged four weeks ago. We should look at this carefully and design our interface to use it.
Note that lazy code actions are part of the API. So we do not have to worry about tirggering while editing.
It is really nice that lazy actions are supported by default. I'll take a look at the PR and try a few experiments imitating my lean3
code on the lake-update
branch.
I have been trying various experiments with the test file "codeactions.lean" and asking Ed Ayers things. It looks like what we get easily is a position in a file. We can then read the file and work with it.
Given this, a good strategy may be to have a code action that picks up a comment before the present position (though after may be easier in terms of parsing). This can be pretty convenient to a user.
I mean that we have no special syntax. When the code action is called just read, parse looking for a comment preceding the position and translate it. Following may be easier to parse directly, but I see that comment markers are palindromic so we can reverse.
We will also need to account for the different kinds of comments /-- -/
, /-! -/
, /- -/
(so they may not always be perfectly palindromic). I will take a look at how the InfoTree
stores the information about comments.
What I meant was that we can try to:
You are correct that we have to deal with stuff that does not look fully like comments. But once we are alert to this it is easy.
But InfoTree
may be more principled.
The question is do you know how to get the InfoTree from within a document?
Okay, I see a getInfoTrees
function. But it still seems non-trivial at IO level to get the InfoTree and work with it. Working with just the text may be easier.
I am writing as I think, but here is my latest attempt at a simple implementation:
-/
, otherwise do nothing (perhaps with a warning)./-
, /--
, /-!
- if all are absent do nothing.No, I have not yet figured out how to operate with the InfoTree
. I agree, the string parsing option seems simpler - I will try that one first.
Stuff in the following thread may be useful: https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/Syntax.20pos.20to.20LSP.20pos/near/310623959
Thanks, I will take a look. So far I have managed to extend the range
to the entire line (not just the position where the hole command was invoked).
We don't want holes. One should just invoke a code action. I am asking on Zulip how to get the file.
Ed Ayers has a solution to build on:
@[codeActionProvider]
def xyz : CodeActionProvider := fun params snap => do
let rc : RequestContext ← read
let ca : CodeAction := {
title := "xyz"
}
let lazy : RequestM CodeAction := do
-- this is a requestM
let doc ← RequestM.readDoc
return ca
return #[{
eager := ca
lazy? := some <| (EIO.toIO (fun x => x.message)) <| ReaderT.run lazy rc
}]
In the above readDoc
gives an EditableDocument
from which we get a FileMap
, then a source
which is presumably what we want.
By the way, @0art0 it looks like mathport is now getting regularly updated and building. So feel free to update the lean-update branch (with basic checks) to use the latest code for code-actions (which has had many updates)
I was able to extract the last docstring in the file through code actions and FileMap.source
. I was also able to extract the smallest node of the InfoTree
containing the start position of the snapshot, as well as the correspondingSyntax
(this is a superset of what we need - which is node of the InfoTree
containing the current position).
Looks good. May be convenient to make test code-actions that just paste the code retrieved (maybe in a comment) and succssively to get pieces of it. I mean refine towards code actions to get the query string without worrying about the OpenAI call, prompt etc (which will be easy once one gets the query string).
I agree. The code action I wrote currently just gets the body of the doc-string and pastes it in the editor.
This is the code I wrote. I will also experiment further with the InfoTree
approach (since I have most of the code already and it may be more robust) and finally push a fully working version to lean-update
on Monday.
open RequestM in
@[codeActionProvider]
def readFile : CodeActionProvider := fun params _snap => do
let doc ← readDoc
let text := doc.meta.text
let source := text.source
let pos := text.lspPosToUtf8Pos params.range.end
let edit : TextEdit := {
range := params.range
newText :=
let comments := source.splitOn "/-" |>.reverse
let comment := comments.head!.dropRight 2
comment
}
let ca : CodeAction := {title := "Translate comment to a Lean theorem", kind? := "quickfix"}
return #[{eager := ca, lazy? := some $ return { ca with edit? := WorkspaceEdit.ofTextEdit params.textDocument.uri edit}}]
@0art0 I tried the above (in a different project) and it works very nicely.
My suggestion is that you factor things so that a function takes source
and pos
as arguments and returns text to insert. This is to allow handling variants such as user cliking before a comment, inserting theorem with title etc.
@0art0 Here is the design I feel will work well for the main code action:
-/
; if not give a warning and do nothing (or paste a warning in comment during a debugging phase); if so dropRight 2
./-
to detect if we have either a comment or a docstring; if no split then an error, otherwise look at the last piece and distinguish by whether it starts with -
(coming from /--
).example <the completion> := by sorry
A second code action can be generating docstrings. Here the focus is assumed to be the beginning of a theorem, and we take the head after splitting by ":=". I have pushed code for a function statementToDoc
to do the querying for this.
Here are few minor features in addition to the main design that I think may be useful:
/-- <comment> -/ theorem
or /-- <comment> -/ def
patterns and triggering either theorem translation or definition translation./-- <comment> -/ theorem <name>
patterns to use the specified name rather than one generated by Codex.:= sorry
after the translation has been pasted into the editor.Here are few minor features in addition to the main design that I think may be useful:
- Detecting
/-- <comment> -/ theorem
or/-- <comment> -/ def
patterns and triggering either theorem translation or definition translation.- Detecting
/-- <comment> -/ theorem <name>
patterns to use the specified name rather than one generated by Codex.- Moving the user's cursor to the
:= sorry
after the translation has been pasted into the editor.
Agreed, but the first may be blocked till we have definitions in the database. I am looking for a demo product on the timescale of a couple of weeks.
The second looks feasible (and is a good idea), as also allowing clicking before and in the middle of a comment/docstring. As for the third I do not know how easy it is. We can discuss though what are the priority features for the present.
I agree, it may be best to leave out these additional features for the demo product. These may also become easier to implement later on as more high-level API surrounding widgets and code actions gets added.
@0art0 There is a technical issue we will face, but looks like there is a clean solution (was trying something else when I realized this).
TermElabM Expr
and we want IO String
delaborators
(which I use in many places) that will give TermElabM String
CoreM String
by using run'
twice - something done before calling by all my command line programs.CoreM String
to IO String
(via EIO
) - in the command line programs this is done by building environments.RequestM.runCore
, which calls Snapshot.runCore
. Looks like the best bet is to run this from snap
picking up parameters from the Request environment (same ones we are using already).run'
invocations (to reduce to CoreM) instead of dummy filenames we may want to use stuff picked from the environment.For doc-string comments, I managed to use the InfoTree
to extract the relevant information in the form of Syntax
. It works fine for all three kinds of patterns mentioned in the comment above, and works from anywhere in the doc-string. However, it does not work for plain comments (of the kind /- ... -/
).
The core part of the code is:
match snap.infoTree.findInfo? (·.tailPos?.any (· ≥ snap.endPos)) with
| some info => info.stx.reprint.get!
| none => "failed to find info"
Leo had mentioned in a conversation that comments are not part of the syntax tree if they are at the end. He had said that Lean 4 could be changed to accommodate this, but if we request this we will do so at a much later stage.
It is nice if things can be picked up from anywhere in the doc-string, but one should make sure the range
is set at the end of the doc-string (using the source-info in syntax and converting to Lsp positions).
@0art0 There is a technical issue we will face, but looks like there is a clean solution (was trying something else when I realized this).
- the translations after correction, filtering etc end up in
TermElabM Expr
and we wantIO String
- firstly, there are
delaborators
(which I use in many places) that will giveTermElabM String
- this is reduced to
CoreM String
by usingrun'
twice - something done before calling by all my command line programs.- the tricky part I thought would be to run from
CoreM String
toIO String
(viaEIO
) - in the command line programs this is done by building environments.- luckily there is a function
RequestM.runCore
, which callsSnapshot.runCore
. Looks like the best bet is to run this fromsnap
picking up parameters from the Request environment (same ones we are using already).- indeed, in the first to
run'
invocations (to reduce to CoreM) instead of dummy filenames we may want to use stuff picked from the environment.
I just pushed a function running to Core for when needed
I have added the feature of pasting at the end of the line, and it seems to be working correctly.
Thanks. I will now try to create a full working prototype on a branch of lean-update
.
@0art0 since I have been bitten by this a couple of times,
Mathbin.All
as an importMathbin.All
as import.Thanks. I was puzzled by why the translations were failing. I will create a separate file.
Realized that what may be breaking in binport is the delaborator. This can be bypassed if it is the only problem. Will check if more basic things are fine.
Realized that what may be breaking in binport is the delaborator. This can be bypassed if it is the only problem. Will check if more basic things are fine.
Indeed a simple code-action shows that there is no fundamental problem. Will work on avoiding delaboration (by just remembering the syntax that worked) and test.
It is good to know that the code actions can be used with Mathbin
after all. I have pushed an outline for the approach to code actions by parsing strings.
The bad news is that bypassing delab using syntax did not fix the weird binport errors.
It is good to know that the code actions can be used with
Mathbin
after all. I have pushed an outline for the approach to code actions by parsing strings.
It is better not to use the snapshot for positions - as I usnderstand it is for internal use to avoid recomputing. Working with the source text and positions (like my getting the tail) is likely to be more robust.
We should try to get things working in one case. Can look to cover more cases later.
@siddhartha-gadgil I have added code for extracting the comment nearest to a given position in the document. It works when the cursor is anywhere in the comment, or even slightly before or after. I would like to make a few more improvements:
Syntax
parsing in addition to string parsingextractCommentText
function more robust.@0art0 before making the improvements, can you try to call the translation function and paste the translation instead of the comment? This is so that we have something that works fully first before adding requirements for additional pieces.
I see that it works, but to add to the above have it appended with a newline and enclosed in example
and := by sorry
to make valid code? Just want to see a full working example.
@0art0 I did this. The base case looks good now.
You are welcome to try the other cases, just making sure there is no regression.
@0art0 Good news :smile: Just tried the code action with binport and it worked fine.
I was able to confirm it on my set-up too (after rebuilding mathlib3port
). However, I couldn't see any recent changes to the code. Was it the delaboration fix from earlier that made it work?
I have added a feature replacing comments of any kind with doc-strings after translation.
I have added a feature replacing comments of any kind with doc-strings after translation.
Looks good :octopus:
I was able to confirm it on my set-up too (after rebuilding
mathlib3port
). However, I couldn't see any recent changes to the code. Was it the delaboration fix from earlier that made it work?
Actually it seems that our "intermediate" test code failed while the final code worked. As binport is a hack some things like this perhaps have no good reason.
Mentioning it so I don't forget: in the case where there are no valid translations, it may best for the user to give the auto-corrected syntax for the first translation in the generated list (more useful than sorries).
In the longer run we can still do this, but with integration with infoview.