siddhartha-gadgil / LeanAide

Tools based on AI for helping with Lean 4
Apache License 2.0
62 stars 2 forks source link

Using codeActions #5

Closed siddhartha-gadgil closed 1 year ago

siddhartha-gadgil commented 1 year ago
siddhartha-gadgil commented 1 year ago

The relevant pull request is https://github.com/leanprover/lean4/pull/1661

siddhartha-gadgil commented 1 year ago

I missed the fact that https://github.com/leanprover/lean4/pull/1661 has been merged four weeks ago. We should look at this carefully and design our interface to use it.

Note that lazy code actions are part of the API. So we do not have to worry about tirggering while editing.

0art0 commented 1 year ago

It is really nice that lazy actions are supported by default. I'll take a look at the PR and try a few experiments imitating my lean3 code on the lake-update branch.

siddhartha-gadgil commented 1 year ago

I have been trying various experiments with the test file "codeactions.lean" and asking Ed Ayers things. It looks like what we get easily is a position in a file. We can then read the file and work with it.

Given this, a good strategy may be to have a code action that picks up a comment before the present position (though after may be easier in terms of parsing). This can be pretty convenient to a user.

siddhartha-gadgil commented 1 year ago

I mean that we have no special syntax. When the code action is called just read, parse looking for a comment preceding the position and translate it. Following may be easier to parse directly, but I see that comment markers are palindromic so we can reverse.

0art0 commented 1 year ago

We will also need to account for the different kinds of comments /-- -/, /-! -/, /- -/ (so they may not always be perfectly palindromic). I will take a look at how the InfoTree stores the information about comments.

siddhartha-gadgil commented 1 year ago

What I meant was that we can try to:

You are correct that we have to deal with stuff that does not look fully like comments. But once we are alert to this it is easy.

siddhartha-gadgil commented 1 year ago

But InfoTree may be more principled.

siddhartha-gadgil commented 1 year ago

The question is do you know how to get the InfoTree from within a document?

siddhartha-gadgil commented 1 year ago

Okay, I see a getInfoTrees function. But it still seems non-trivial at IO level to get the InfoTree and work with it. Working with just the text may be easier.

siddhartha-gadgil commented 1 year ago

I am writing as I think, but here is my latest attempt at a simple implementation:

0art0 commented 1 year ago

No, I have not yet figured out how to operate with the InfoTree. I agree, the string parsing option seems simpler - I will try that one first.

siddhartha-gadgil commented 1 year ago

Stuff in the following thread may be useful: https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/Syntax.20pos.20to.20LSP.20pos/near/310623959

0art0 commented 1 year ago

Thanks, I will take a look. So far I have managed to extend the range to the entire line (not just the position where the hole command was invoked).

siddhartha-gadgil commented 1 year ago

We don't want holes. One should just invoke a code action. I am asking on Zulip how to get the file.

siddhartha-gadgil commented 1 year ago

Ed Ayers has a solution to build on:

@[codeActionProvider]
def xyz : CodeActionProvider := fun params snap => do
  let rc : RequestContext ← read
  let ca : CodeAction := {
    title := "xyz"
  }
  let lazy : RequestM CodeAction := do
        -- this is a requestM
        let doc ← RequestM.readDoc
        return ca

  return #[{
    eager := ca
    lazy? := some <| (EIO.toIO (fun x => x.message)) <| ReaderT.run lazy rc
  }]

https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/code.20action.20behaviour.3F/near/310845415

siddhartha-gadgil commented 1 year ago

In the above readDoc gives an EditableDocument from which we get a FileMap, then a source which is presumably what we want.

siddhartha-gadgil commented 1 year ago

By the way, @0art0 it looks like mathport is now getting regularly updated and building. So feel free to update the lean-update branch (with basic checks) to use the latest code for code-actions (which has had many updates)

0art0 commented 1 year ago

I was able to extract the last docstring in the file through code actions and FileMap.source. I was also able to extract the smallest node of the InfoTree containing the start position of the snapshot, as well as the correspondingSyntax (this is a superset of what we need - which is node of the InfoTree containing the current position).

siddhartha-gadgil commented 1 year ago

Looks good. May be convenient to make test code-actions that just paste the code retrieved (maybe in a comment) and succssively to get pieces of it. I mean refine towards code actions to get the query string without worrying about the OpenAI call, prompt etc (which will be easy once one gets the query string).

0art0 commented 1 year ago

I agree. The code action I wrote currently just gets the body of the doc-string and pastes it in the editor.

This is the code I wrote. I will also experiment further with the InfoTree approach (since I have most of the code already and it may be more robust) and finally push a fully working version to lean-update on Monday.

open RequestM in
@[codeActionProvider]
def readFile : CodeActionProvider := fun params _snap => do
  let doc ← readDoc
  let text := doc.meta.text
  let source := text.source
  let pos := text.lspPosToUtf8Pos params.range.end
  let edit : TextEdit := {
    range := params.range
    newText :=
      let comments := source.splitOn "/-" |>.reverse
      let comment := comments.head!.dropRight 2
      comment
  }
  let ca : CodeAction := {title := "Translate comment to a Lean theorem", kind? := "quickfix"}
  return #[{eager := ca, lazy? := some $ return { ca with edit? := WorkspaceEdit.ofTextEdit params.textDocument.uri edit}}]
siddhartha-gadgil commented 1 year ago

@0art0 I tried the above (in a different project) and it works very nicely.

My suggestion is that you factor things so that a function takes source and pos as arguments and returns text to insert. This is to allow handling variants such as user cliking before a comment, inserting theorem with title etc.

siddhartha-gadgil commented 1 year ago

@0art0 Here is the design I feel will work well for the main code action:

A second code action can be generating docstrings. Here the focus is assumed to be the beginning of a theorem, and we take the head after splitting by ":=". I have pushed code for a function statementToDoc to do the querying for this.

0art0 commented 1 year ago

Here are few minor features in addition to the main design that I think may be useful:

siddhartha-gadgil commented 1 year ago

Here are few minor features in addition to the main design that I think may be useful:

  • Detecting /-- <comment> -/ theorem or /-- <comment> -/ def patterns and triggering either theorem translation or definition translation.
  • Detecting /-- <comment> -/ theorem <name> patterns to use the specified name rather than one generated by Codex.
  • Moving the user's cursor to the := sorry after the translation has been pasted into the editor.

Agreed, but the first may be blocked till we have definitions in the database. I am looking for a demo product on the timescale of a couple of weeks.

The second looks feasible (and is a good idea), as also allowing clicking before and in the middle of a comment/docstring. As for the third I do not know how easy it is. We can discuss though what are the priority features for the present.

0art0 commented 1 year ago

I agree, it may be best to leave out these additional features for the demo product. These may also become easier to implement later on as more high-level API surrounding widgets and code actions gets added.

siddhartha-gadgil commented 1 year ago

@0art0 There is a technical issue we will face, but looks like there is a clean solution (was trying something else when I realized this).

0art0 commented 1 year ago

For doc-string comments, I managed to use the InfoTree to extract the relevant information in the form of Syntax. It works fine for all three kinds of patterns mentioned in the comment above, and works from anywhere in the doc-string. However, it does not work for plain comments (of the kind /- ... -/).

The core part of the code is:

match snap.infoTree.findInfo? (·.tailPos?.any (· ≥ snap.endPos)) with
  | some info => info.stx.reprint.get!
  | none => "failed to find info"
siddhartha-gadgil commented 1 year ago

Leo had mentioned in a conversation that comments are not part of the syntax tree if they are at the end. He had said that Lean 4 could be changed to accommodate this, but if we request this we will do so at a much later stage.

It is nice if things can be picked up from anywhere in the doc-string, but one should make sure the range is set at the end of the doc-string (using the source-info in syntax and converting to Lsp positions).

siddhartha-gadgil commented 1 year ago

@0art0 There is a technical issue we will face, but looks like there is a clean solution (was trying something else when I realized this).

  • the translations after correction, filtering etc end up in TermElabM Expr and we want IO String
  • firstly, there are delaborators (which I use in many places) that will give TermElabM String
  • this is reduced to CoreM String by using run' twice - something done before calling by all my command line programs.
  • the tricky part I thought would be to run from CoreM String to IO String (via EIO) - in the command line programs this is done by building environments.
  • luckily there is a function RequestM.runCore, which calls Snapshot.runCore. Looks like the best bet is to run this from snap picking up parameters from the Request environment (same ones we are using already).
  • indeed, in the first to run' invocations (to reduce to CoreM) instead of dummy filenames we may want to use stuff picked from the environment.

I just pushed a function running to Core for when needed

0art0 commented 1 year ago

I have added the feature of pasting at the end of the line, and it seems to be working correctly.

0art0 commented 1 year ago

Thanks. I will now try to create a full working prototype on a branch of lean-update.

siddhartha-gadgil commented 1 year ago

@0art0 since I have been bitten by this a couple of times,

0art0 commented 1 year ago

Thanks. I was puzzled by why the translations were failing. I will create a separate file.

siddhartha-gadgil commented 1 year ago

Realized that what may be breaking in binport is the delaborator. This can be bypassed if it is the only problem. Will check if more basic things are fine.

siddhartha-gadgil commented 1 year ago

Realized that what may be breaking in binport is the delaborator. This can be bypassed if it is the only problem. Will check if more basic things are fine.

Indeed a simple code-action shows that there is no fundamental problem. Will work on avoiding delaboration (by just remembering the syntax that worked) and test.

0art0 commented 1 year ago

It is good to know that the code actions can be used with Mathbin after all. I have pushed an outline for the approach to code actions by parsing strings.

siddhartha-gadgil commented 1 year ago

The bad news is that bypassing delab using syntax did not fix the weird binport errors.

siddhartha-gadgil commented 1 year ago

It is good to know that the code actions can be used with Mathbin after all. I have pushed an outline for the approach to code actions by parsing strings.

It is better not to use the snapshot for positions - as I usnderstand it is for internal use to avoid recomputing. Working with the source text and positions (like my getting the tail) is likely to be more robust.

We should try to get things working in one case. Can look to cover more cases later.

0art0 commented 1 year ago

@siddhartha-gadgil I have added code for extracting the comment nearest to a given position in the document. It works when the cursor is anywhere in the comment, or even slightly before or after. I would like to make a few more improvements:

siddhartha-gadgil commented 1 year ago

@0art0 before making the improvements, can you try to call the translation function and paste the translation instead of the comment? This is so that we have something that works fully first before adding requirements for additional pieces.

siddhartha-gadgil commented 1 year ago

I see that it works, but to add to the above have it appended with a newline and enclosed in example and := by sorry to make valid code? Just want to see a full working example.

siddhartha-gadgil commented 1 year ago

@0art0 I did this. The base case looks good now.

siddhartha-gadgil commented 1 year ago

You are welcome to try the other cases, just making sure there is no regression.

siddhartha-gadgil commented 1 year ago

@0art0 Good news :smile: Just tried the code action with binport and it worked fine.

0art0 commented 1 year ago

I was able to confirm it on my set-up too (after rebuilding mathlib3port). However, I couldn't see any recent changes to the code. Was it the delaboration fix from earlier that made it work?

0art0 commented 1 year ago

I have added a feature replacing comments of any kind with doc-strings after translation.

siddhartha-gadgil commented 1 year ago

I have added a feature replacing comments of any kind with doc-strings after translation.

Looks good :octopus:

siddhartha-gadgil commented 1 year ago

I was able to confirm it on my set-up too (after rebuilding mathlib3port). However, I couldn't see any recent changes to the code. Was it the delaboration fix from earlier that made it work?

Actually it seems that our "intermediate" test code failed while the final code worked. As binport is a hack some things like this perhaps have no good reason.

siddhartha-gadgil commented 1 year ago

Mentioning it so I don't forget: in the case where there are no valid translations, it may best for the user to give the auto-corrected syntax for the first translation in the generated list (more useful than sorries).

In the longer run we can still do this, but with integration with infoview.