eclipse-jdtls / eclipse.jdt.ls

Java language server

Expose textDocument/completion execution time to clients #2685

Open · rgrunber opened this issue 1 year ago

rgrunber commented 1 year ago

textDocument/completion response time is a fairly good single measurement for gauging the overall user experience when using JDT-LS. It would be nice to expose it more easily to clients wishing to improve this aspect of the experience. Prior to https://github.com/eclipse/eclipse.jdt.ls/pull/2639, we sent this value in the response to textDocument/completion for each CompletionItem in the list (a bit excessive, so it was changed).

Currently, the time it took for textDocument/completion to execute is sent in the CompletionItem.data field to those contributing an ICompletionRankingProvider. It seems easier to just take it from https://github.com/eclipse/eclipse.jdt.ls/blob/40a8c271e66c6fa419c2a72dd5df38108ee625a7/org.eclipse.jdt.ls.core/src/org/eclipse/jdt/ls/core/internal/handlers/CompletionHandler.java#L124, though.

Some considerations:

CC: @fbricon @testforstephen @robstryker
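
For illustration, if the execution time were exposed through a telemetry/event notification instead of CompletionItem.data, a client built on vscode-languageclient could consume it roughly as sketched below. The payload shape (name, durationMs) is purely an assumption here, not an existing JDT-LS schema.

import { LanguageClient } from "vscode-languageclient/node";

// Hypothetical payload shape; the actual schema would need to be defined by JDT-LS.
interface CompletionTimingEvent {
  name: string;        // e.g. "textDocument/completion"
  durationMs: number;  // server-side execution time in milliseconds
}

export function trackCompletionTiming(client: LanguageClient): void {
  // telemetry/event notifications are delivered to the client's onTelemetry handler.
  client.onTelemetry((event: any) => {
    const timing = event as CompletionTimingEvent;
    if (timing?.name === "textDocument/completion") {
      console.log(`completion handler took ${timing.durationMs} ms`);
    }
  });
}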

testforstephen commented 1 year ago

https://github.com/eclipse/eclipse.jdt.ls/blob/40a8c271e66c6fa419c2a72dd5df38108ee625a7/org.eclipse.jdt.ls.core/src/org/eclipse/jdt/ls/core/internal/handlers/CompletionHandler.java#L124

Please remember that the completion handler only measures the time the language server spends calculating proposals and converting them to the lsp4j result. It does not include the full end-to-end time, which also involves the request scheduling/waiting time on the server and the JSON serialization/deserialization between the client and the server. To measure the end-to-end time accurately, it is better to track it from the client side.

In vscode-java, we provide an API that notifies when an LSP request has finished. We can use this API to track the end-to-end completion time. See https://github.com/redhat-developer/vscode-java/blob/df200f1aa6bfab07b64dac13317175f39ac90079/src/TracingLanguageClient.ts#L7

Here is some sample code to track it in vscode-java. We need to exclude completion noise, such as typing a whitespace character for indentation, which also triggers a completion request.

import * as vscode from "vscode";

const javaExt = vscode.extensions.getExtension("redhat.java");
javaExt?.exports?.onDidRequestEnd((traceEvent: any) => {
  if (traceEvent.type === "textDocument/completion") {
    // See https://github.com/redhat-developer/vscode-java/pull/3010
    // on excluding invalid completion requests.
    if (!traceEvent.resultLength) {
      return;
    }
    // ...
  }
});
rgrunber commented 1 year ago

I'm definitely open to this. The only downside I see is that we'd be bypassing the telemetry/event spec approach. However, for just the execution time, any client can easily derive this information (since it should have access to the request/response logic), so it might be fine for now.

My only slight concern is that if, in the future, there is data about a completion that we can't place into CompletionList/CompletionItem (not its kind or anything like that), that might force us to implement telemetry/event.
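
As a rough sketch of that point, a client built on vscode-languageclient could time the round trip itself through completion middleware; this is only illustrative, not how vscode-java measures it.

import { Middleware } from "vscode-languageclient/node";

// Sketch: wrap textDocument/completion on the client and log the round-trip time.
export const completionTimingMiddleware: Middleware = {
  provideCompletionItem: async (document, position, context, token, next) => {
    const start = Date.now();
    const result = await next(document, position, context, token);
    console.log(`textDocument/completion round trip: ${Date.now() - start} ms`);
    return result;
  },
};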

robstryker commented 1 year ago

I'm not that familiar with the protocol, but it does seem to me that we would be more interested in finding specific bottlenecks in the completion handler itself, no?

While discovering the full end-to-end time (request scheduling/waiting time, JSON serialization/deserialization) might give a more accurate view of the full user experience, aren't those extra pieces shared by all potentially long-running tasks? And wouldn't including them obscure or add variability to the times?

For example, if we intend to use this data to locate bottlenecks, but some requests have a 2 ms delay for scheduling/serialization while others get unlucky and see 10 ms or so, couldn't that obscure which requests are actually bottlenecked? I understand that things would average out over time between the lucky and unlucky cases, but I have to imagine the underlying completion handler data would be more directly usable.

jdneo commented 1 year ago

I think telemetry/event and client-side measurement do not conflict.

Client-side measurement gives the whole picture of responsiveness from the user's side.

On the other hand, if we want to analyze the time consumed by a specific part (e.g. a specific method), we can use telemetry/event.

It depends on what we are interested in, and we can definitely use both.

testforstephen commented 1 year ago

Agreed, we can track both.

For E2E, it's a more systematic approach that tracks performance at the framework level. You can easily apply it to any request you're interested in.

For telemetry/event, we can use it to track the computation time spent in the completion handler, which can help us identify and optimize the internal implementation. However, this data should be for internal analysis only.

When we communicate code completion speed at the product level, we should definitely use the end-to-end time as the key metric.
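
As a sketch of what that product-level metric could look like, end-to-end times from onDidRequestEnd can be aggregated on the client side. The duration field below is an assumption about the trace event shape; the actual field name may differ.

import * as vscode from "vscode";

// Collected end-to-end completion times, in milliseconds.
const completionTimes: number[] = [];

export async function startCompletionMetric(): Promise<void> {
  const javaExt = vscode.extensions.getExtension("redhat.java");
  const api = javaExt?.isActive ? javaExt.exports : await javaExt?.activate();
  api?.onDidRequestEnd((traceEvent: any) => {
    // Skip noise such as empty results, as in the earlier sample.
    if (traceEvent.type !== "textDocument/completion" || !traceEvent.resultLength) {
      return;
    }
    // `duration` is assumed to be the end-to-end time reported by the tracing client.
    if (typeof traceEvent.duration === "number") {
      completionTimes.push(traceEvent.duration);
    }
  });
}

// A simple p95 over the collected samples as the headline responsiveness number.
export function completionP95(): number | undefined {
  if (completionTimes.length === 0) {
    return undefined;
  }
  const sorted = [...completionTimes].sort((a, b) => a - b);
  return sorted[Math.min(sorted.length - 1, Math.floor(sorted.length * 0.95))];
}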

robstryker commented 1 year ago

https://github.com/redhat-developer/vscode-java/pull/3165