Feature Request: Sentence Indexing

tom-p-reichel commented 1 year ago

The Problem

Many research applications currently using sertop run interactively: they need to see the output of a command before they decide what to do next, rather than running one chunk of coq code and interrogating the final result once.

This is possible in coq-lsp now with the aid of the following procedure:

Put some text in a document.
Issue a coq/getDocument request so that we know where all the sentences are.
Issue one proof/goals for each sentence yielded in step 2.

After we see the outputs of the sentences we have run, we may wish to append text to the document and check the output of the new sentences in the document. Unless we already know the sentence boundaries of the text we're adding, we must run coq/getDocument again, yielding the entire document's AST, which grows linearly with the length of the document. For a process that repeatedly grows a document and checks the output of new sentences, we incur quadratic communications overhead in terms of the final length of the document since we must query the entire document to know where the new sentences are every time they are added-- not good!

Automated processes in research projects are prone to creating very very large documents, so they will quickly run up against this overhead that doesn't exist in sertop.

Proposed Solution

Maybe we could have some sentence oriented requests that give the number of sentences in the document, give the text of a sentence by index, as well as allow proof/goals to take a sentence index? This would allow researchers to efficiently interact+append over coq documents.

ejgallego commented 1 year ago

Hi @tom-p-reichel , I agree that this is not good at all, thanks a lot for documenting your use case.

Indexing sentences is actually a bit more tricky than it looks, this is why in coq-lsp we try to have the "ground truth" of the document to be position based. Think of a notebook, or a markdown document with Coq inline, etc... LSP is position-based for this reason.

Even if a bit challenging to get rigth IMO, we could indeed explore having some better sentence id and have coq-lsp understand it.

How about this idea for your use case tho?

LSP already attaches to every document a version, so we could instead implement a getDocumentChanges that would return only the udpated sentences. How would that work for you folks?

We could also have an option for getDocument to return the proof directly thus avoiding that N roundtrips.

In the end I think the kind of interaction use cases you describe, the LSP protocol may not be the best choice; I've been thinking about alternatives. The engine at the core of coq-lsp, "Flèche" is designed to be mostly protocol agnostic.

ejgallego commented 1 year ago

Also as noted here https://coq.zulipchat.com/#narrow/stream/329642-coq-lsp/topic/textDocument.2FdidChange , it'd be nice for coq-lsp to support the incremental sync method.

tom-p-reichel commented 1 year ago

Yeah, I think a getDocumentChanges would be a good improvement for incremental document building.

getDocument returning proof/messages for everything would be a convenient addition (though the N round-trips are unavoidable in some cases because the proof is not known ahead of time and failures inform the next step). I saw that the output of getDocument might change to a tree soon, and I wouldn't be sure how to generate positions for proof/goals queries after that change is made. It would be nice to move that implementation detail to coq-lsp.

ejgallego commented 1 year ago

@tom-p-reichel I'll try to add that call to the next version then, with an option to also output additional info such as goals if present (note that this can be costly, printing is expensive :))

I saw that the output of getDocument might change to a tree soon, and I wouldn't be sure how to generate positions for proof/goals queries after that change is made. It would be nice to move that implementation detail to coq-lsp.

Nothing should change, the tree is just to provide some structure information, but still ground truth for the document is position-based.

ejgallego commented 3 months ago

Hi @tom-p-reichel , a couple of things could be of help for your use case:

the new on-demand checking mode provides better control of incrementality
you can also use petanque for interactions that won't update the document per-se.

ejgallego / coq-lsp

Feature Request: Sentence Indexing #496