sourcegraph / jetbrains

bug: withAgent should have a timeout and user feedback if it's failing #1398

Closed steveyegge closed 14 hours ago

steveyegge commented 7 months ago

Cody Version

all JetBrains versions

IDE Information

Any time an operation calling withAgent fails to complete for some reason on the backend, it hangs forever. Sometimes this results in a panel not updating (e.g. the Account Tier panel being stuck on the wrong tier), but sometimes (e.g. when calling IgnoreOracle for a policy) it will deadlock the UI thread.
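
To illustrate the deadlock risk described above, here is a minimal sketch of a bounded wait, assuming the backend result is exposed as a `CompletableFuture`. The class and method names (`BoundedWait`, `getOrFallback`) are hypothetical and are not taken from the plugin's code; the point is only that a caller which must block never waits without a deadline.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedWait {
    /**
     * Hypothetical helper: waits at most timeoutMillis for the backend result
     * and returns the fallback if the call times out, fails, or is cancelled.
     */
    static <T> T getOrFallback(CompletableFuture<T> future, long timeoutMillis, T fallback) {
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException | CancellationException e) {
            // The backend did not answer in time (or failed): do not hang the caller.
            return fallback;
        } catch (InterruptedException e) {
            // Preserve the interrupt flag for the caller before falling back.
            Thread.currentThread().interrupt();
            return fallback;
        }
    }
}
```

Even with a deadline, blocking the UI thread for the full timeout would still freeze the IDE for that long, so a sketch like this probably only makes sense with a very short timeout or off the UI thread.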

We need to visit every single one of these locations in the code, and put in better timeout and error reporting (and recovery) logic, before GA.

Describe the bug

You'll just start the IDE and it will hang, or Cody will stop working, or part of Cody will act strangely. The symptoms could be anything.

Expected behavior

I expect every backend call to time out after a reasonable period (e.g. 30 seconds) and to be cancellable, and the user should be able to see which operation is currently waiting. This would also help our own debugging of Enterprise customers' issues.
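
The expected behavior above could be sketched roughly as follows, assuming each backend call returns a `CompletableFuture` and Java 9+ is available (for `orTimeout`). Everything here is hypothetical: `AgentCallTimeout`, `withTimeout`, the operation name, and the `reportToUser` callback are illustrative and are not the plugin's actual `withAgent` API.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Consumer;
import java.util.function.Supplier;

public class AgentCallTimeout {
    /**
     * Hypothetical wrapper: starts a named backend call, fails it with a
     * TimeoutException if it does not complete in time, and reports
     * timeouts and failures by operation name.
     */
    static <T> CompletableFuture<T> withTimeout(
            String operationName,
            Supplier<CompletableFuture<T>> call,
            long timeout,
            TimeUnit unit,
            Consumer<String> reportToUser) {
        // orTimeout completes this same future exceptionally once the deadline passes.
        CompletableFuture<T> pending = call.get().orTimeout(timeout, unit);
        pending.whenComplete((value, error) -> {
            if (error instanceof TimeoutException) {
                reportToUser.accept(operationName + " timed out after "
                        + timeout + " " + unit.name().toLowerCase());
            } else if (error != null && !(error instanceof CancellationException)) {
                reportToUser.accept(operationName + " failed: " + error);
            }
            // A user-initiated cancel is not reported as an error.
        });
        // The caller can cancel this future; note that cancelling a
        // CompletableFuture does not by itself stop work already running on the backend.
        return pending;
    }
}
```

A call site might look like `withTimeout("fetchAccountTier", () -> backendCall(), 30, TimeUnit.SECONDS, showNotification)`, where `backendCall` and `showNotification` are placeholders. In a real plugin the report callback would presumably need to hop to the UI thread before touching any UI.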

Additional context

No response

dominiccooney commented 7 months ago

Like it or not, both the extension and the JVM side are stateful (for example, a streaming chat completion on the extension side relates to rendering the stop generation button on the JVM side), so we at least need a signal to resync when the Agent is restarted.
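
One possible shape for such a resync signal, as a rough sketch only: a restart counter plus listeners, so that JVM-side state can tell whether it was created against an older Agent process. The class `AgentRestartSignal` and all of its methods are hypothetical and do not exist in the plugin.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLong;

public class AgentRestartSignal {
    private final AtomicLong generation = new AtomicLong();
    private final List<Runnable> listeners = new CopyOnWriteArrayList<>();

    /** Registers a callback that resets or resyncs some JVM-side state. */
    public void onRestart(Runnable listener) {
        listeners.add(listener);
    }

    /** Generation of the currently running Agent; capture it when starting stateful work. */
    public long currentGeneration() {
        return generation.get();
    }

    /** True if work tagged with the given generation predates the latest restart. */
    public boolean isStale(long capturedGeneration) {
        return capturedGeneration != generation.get();
    }

    /** To be called by whatever restarts the Agent; bumps the generation and notifies listeners. */
    public long notifyRestarted() {
        long next = generation.incrementAndGet();
        for (Runnable listener : listeners) {
            listener.run();
        }
        return next;
    }
}
```

Under this assumption, a streaming chat panel could capture `currentGeneration()` when a completion starts and hide its stop generation button once `isStale` returns true.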

dominiccooney commented 7 months ago

I'm looking at the IgnoreOracle hanging the UI thread now.

pkukielka commented 6 months ago

I'm not sure it is a P0. It's a known, major problem with the architecture, but it is not breaking anything specific; it's more of a general improvement we could and should make. Fixing it won't be straightforward and will require making sure all the features we have work correctly after the changes. I would not expect us to decide to do it before GA (and especially not right before GA).