Open Kha opened 7 years ago
I believe that the best representation is whatever will be the syntax type from #1674.
I agree.
Ok, as far as I can tell there are 3 separate issues here:
Hover and go-to-definition in error messages. IMHO this mostly depends on structured error/trace messages as in #1692. We can't do much if we just have strings. Once we have them, we could theoretically even use the current protocol--just ask for info in the "file" message:file.lean!123456
, where 123456
is the id of an expression in a message.
Use syntax
objects instead of break_at_pos_exception
. This depends on #1674. I agree that we should port info and auto-completion to use the syntax
objects.
Transfer information eagerly to the editors. Technically we could do this right now, by serializing all info managers. The main challenge here is the protocol--I think the performance would be unacceptable if we transfer all information on every change. Aside from semantic highlighting, I don't see what this change would buy us.
Maybe I got the wrong impression, and you propose to only send information eagerly for error and trace messages. I'm not sure if this would work efficiently with our current approach, where we resend all error messages on every (server-side) change. Since this will be a breaking change for the server protocol anyhow, we can change the approach. For example, we could only send lists of message ids to the client, and add an extra command to resolve them (the client only needs to resolve and rerender messages with new ids).
Aside from semantic highlighting, I don't see what this change would buy us.
@gebner One issue is the goal visualization buffer. In the new approach, we would also store information for the terms in the goal. The info manager would discard the information associated with the goal whenever the region associated with the goal is invalidated. This sounds reasonable. However, we have the following scenario 1- Open goal buffer and freeze it. 2- Keep editing proof (the info manager will discard the data associated with the goal). 3- Request information for terms in the goal (the server would report that the information is not available anymore). By storing cheap information there (e.g., resolved names), we would still support some functionality.
That being said, after I wrote the message above, I think the extra complication is not worth. We would still get "information not available" error messages when requesting for more complicated information.
On the slack channel, we also discussed the possibility of adding "purge" messages from editor to server. That is, the editor would notify the server when the information for the goal can be discarded, but this would complicate the editor side.
On the slack channel, we also discussed the possibility of adding "purge" messages from editor to server. [..] this would complicate the editor side.
We only need to do this for the dedicated goal windows. I think the biggest issue there is that we could accidentally forget to dispose a goal and cause a memory leak.
There is also the highly hacky solution of having the server automatically preserve the most recently requested goal. This would require no editor cooperation and there is no chance of forgetting to purge/dispose a goal.
There is also the highly hacky solution of having the server automatically preserving the most recently requested goal. This would require no editor cooperation and there is no chance of forgetting to purge/dispose a goal.
This doesn't seem too bad. We may also preserve the last k
goals.
There is also the highly hacky solution of having the server automatically preserve the most recently requested goal.
I initially rejected this solution for some reason -- I think it was about freezing the Next Error view, something not even implemented right now. So yes, sounds good enough to me.
I initially rejected this solution for some reason
Ah, it probably was about the info
command being used for more than just goal states, but I'm okay with splitting the API.
So it does look like we're mostly missing structured messages to implement the on-demand scheme. I was thinking of using serialized format
objects for that before #1692 was opened, which would finally allow clients to reformat text depending on the window width. However, terms as format
s would complicate position computations and seems to conflict with the idea of returning them as syntax
.
Aside from semantic highlighting, I don't see what this change would buy us.
Btw, having highlighting information in the term would also be great for correctly highlighting only the code parts of a message.
So it does look like we're mostly missing structured messages to implement the on-demand scheme. I was thinking of using serialized format objects for that before #1692 was opened, which would finally allow clients to reformat text depending on the window width.
@Kha this is cool. The server just needs to provide the width when asking for information.
However, terms as formats would complicate position computations and seems to conflict with the idea of returning them as syntax.
Once we implement both #1692 and #1674, we can just add a message.from_syntax
case.
In any case, I don't think we should transfer literal syntax objects to the editors. At the very least, we need to add information on how to turn them into strings. I'm not sure we want to convert expressions into syntax either, this conversion would lose implicit arguments.
The position information can be as synthetic as we want it to be, it doesn't have to correspond to the position on the screen as long as the editor converts it back.
Btw, having highlighting information in the term would also be great for correctly highlighting only the code parts of a message.
This would already be solved be transferring structured messages, where the expressions are tagged (see message.from_expr
).
Once we implement both #1692 and #1674, we can just add a message.from_syntax case.
Wouldn't that still be a static (text width-independent) layout?
In any case, I don't think we should transfer literal syntax objects to the editors. At the very least, we need to add information on how to turn them into strings. I'm not sure we want to convert expressions into syntax either, this conversion would lose implicit arguments.
I assumed that turning a syntax object into a string would be as easy as concatenating all node string contents, including their whitespace prefixes, and that showing implicit arguments would be a server-backed operation. Having said that, I'm currently leaning toward using format
for expressions (as the pretty printer already does, the format objects just don't reach the client right now).
The position information can be as synthetic as we want it to be, it doesn't have to correspond to the position on the screen as long as the editor converts it back.
Okay, then we need to annotate the output with the synthetic positions, e.g. a constructor expr : pos -> format -> format
that both identifies term parts in the overall format object and holds positions of subterms.
This would already be solved be transferring structured messages, where the expressions are tagged (see message.from_expr).
As far as I know, there is no good way to apply font-lock
highlighting to parts of a text in emacs, so the highlighting information would have to come from the server. Also, I assumed from_expr
would be server-side only.
[Once we implement both #1692 and #1674, we can just add a message.from_syntax case.] Wouldn't that still be a static (text width-independent) layout?
I really don't know how we should transfer syntax/expr objects to the client. This is a breaking change for the server protocol, and no concrete plans have been made yet. So it could be a plain format object (that we can adapt to the window width on the client), a format object with some annotations (for implicit arguments), or something else entirely.
I assumed that turning a syntax object into a string would be as easy as concatenating all node string contents
I'm more and more thinking that syntax objects should not include the source code text. We'd still include comments, but as a separate leading_comment/trailing_comment node.
The rationale is the following: we want to implement transformations of syntax objects. For example, macro expansions, refactoring, or hole synthesis. In all of those cases we either don't have any source text or modify it. Modifying both the syntax object and the associated source code seems to provide little benefit. Having the original source doesn't help if we modify the syntax object.
As far as I know, there is no good way to apply font-lock highlighting to parts of a text in emacs, so the highlighting information would have to come from the server.
Since we've switched to the HTML info view in vscode, we're in a similar position. Either we implement our own syntax highlighting, or we obtain it from the server.
Also, I assumed from_expr would be server-side only.
It is server-side in the sense that we would not transfer the expression literally to the client, but a rendered version. I assumed that we would tag it as an expression when converting it to json.
Environments such as idris-mode allow interacting with code in error messages. Ideally, we would like to provide the same set of interactions in code in messages and info views as in the main editor.
So far, all interactive commands are executed on demand by the server based only on location (filename + position). Especially for commands that may be executed on arbitrary subexpressions (e.g. "show type", "show implicit arguments"), we should keep the on-demand behavior even in pp outputs, using some synthetic location identifiers. If the file from which the pp output was derived has been invalidated in the meantime, the location identifier should be considered invalid as well and the command aborted. Commands that do not depend on a location but only on the current environment (e.g. "show type of this constant") may reasonably be allowed to forgo invalidation checking.
On the other hand, there is a good amount of "cheaper" information that can and should be sent inline with the pp output such as name resolution and -- most fundamentally -- the kind and extent of a token. The latter information is currently extracted by reparsing the file up to the cursor position, which is neither efficient nor elegant. So this information would also be useful for the main editor, even if sending the full input back (in annotated form) may sound wasteful. It can further be used for semantic highlighting.
Because we would need more structure in this output than a simple token stream for commands like "show implicit arguments" or "show goal (of tactic at point)", I believe that the best representation is whatever will be the
syntax
type from https://github.com/leanprover/lean/issues/1674. For the main editor, we would parse the input to a syntax object and return it, and for messages we would make the pretty printer synthesize a syntax object directly and embed it in aformat
constructor.(A probably small selection of) open questions: