eclipse-theia / theia

Eclipse Theia is a cloud & desktop IDE framework implemented in TypeScript.
http://theia-ide.org
Eclipse Public License 2.0

Runaway ripgrep (rg) processes consume all available CPU #13712

Closed danepowell closed 4 months ago

danepowell commented 4 months ago

Bug Description:

Soon after starting Theia, several rg processes start running. Within just a few minutes and no further interaction, these balloon to dozens or hundreds of rg processes that eventually consume all available resources and make Theia unresponsive.

The processes in question are all in a sleeping state, which suggests they are blocked on I/O. Running strace on a stuck rg process produces nonsensical output (it appears to write roughly 140 terabytes to a negative file descriptor):

root@74ef8d51a3ef:/# strace -p 4101
strace: Process 4101 attached
write(-18875552, 0x1, 140737469479776)  = 0
write(-18875552, NULL, 0)               = 0
strace: [ Process PID=4101 runs in x32 mode. ]
syscall_0x7fffffac82a8(0x7fffffe77ee7, 0x7fffffe763b4, 0x7fffffe89af8, 0, 0, 0xd7) = 0
syscall_0xeffff7dfaaa0(0xeffff7dfaaa4, 0x10204, 0, 0, 0xeffff7cc9000, 0xde) = 0xf000109a0890
strace: [ Process PID=4101 runs in 64 bit mode. ]
write(0, "", 0)                         = ?
+++ exited with 0 +++

This only started to occur in Theia 1.48.0, and rolling back to 1.47.1 fixes it. I suspect this pull request is the root cause, based only on its description: https://github.com/eclipse-theia/theia/pull/13498

It's possible this is related to the number of files in the workspace. We've mostly seen it in workspaces with ~100k files.

Steps to Reproduce:

  1. Run Theia 1.48.0
  2. Clone a large project (~100k files)
  3. Wait for rg to ruin your day
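
To watch the rg process count grow while waiting, something like the following could help. It is only a sketch: `countByState` is a hypothetical helper, not part of Theia, and it assumes the input is text in the format printed by `ps -eo pid,stat,comm` on Linux (the sample below is made-up data, not output from the affected machine).

```typescript
// Hypothetical helper: count processes named `command` per state letter,
// given text in the format of `ps -eo pid,stat,comm` (header line included).
function countByState(psOutput: string, command: string): Map<string, number> {
    const counts = new Map<string, number>();
    // Skip the header line, then split each row on runs of whitespace.
    for (const line of psOutput.split("\n").slice(1)) {
        const [pid, stat, comm] = line.trim().split(/\s+/);
        if (!pid || !stat || comm !== command) {
            continue;
        }
        // First letter of STAT is the state: "S" sleeping, "R" running, and so on.
        const state = stat.charAt(0);
        counts.set(state, (counts.get(state) ?? 0) + 1);
    }
    return counts;
}

// Made-up sample input for illustration only.
const samplePsOutput = [
    "  PID STAT COMMAND",
    " 4101 Sl   rg",
    " 4102 S    rg",
    " 4103 R    rg",
    "    1 Ss   node",
].join("\n");

const rgStates = countByState(samplePsOutput, "rg");
console.log(rgStates); // two sleeping ("S") and one running ("R") rg process
```

Running this against real `ps` output every few seconds would show whether the number of sleeping rg processes keeps climbing.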

msujew commented 4 months ago

cc @AlexandraBuzila @planger

AlexandraBuzila commented 4 months ago

Hi @danepowell

Thank you for the report. I tried to reproduce the issue on Ubuntu 22.04.4 with Theia 1.48.0 (Electron app started from sources), the latest Theia master, and version 1.49.101 started from the AppImage, but I could not reproduce it.

I can see the rg process being spawned, but for me it always terminates shortly after being started.

Steps I took to test:

Are you doing something different when the problem occurs? Does this always happen for one workspace on your end? I'm wondering if some of the file names might play a role.

Thanks!

msujew commented 4 months ago

@AlexandraBuzila with https://github.com/eclipse-theia/theia/pull/13498 in mind, this might only happen when a lot of text is printed to the terminal. I'm not sure though, I'm only speculating based on the changes I've seen.

AlexandraBuzila commented 4 months ago

Thanks @msujew, you're absolutely right. In such a large workspace, the terminal becomes quite slow. I don't see any stuck rg processes, but there can be quite a number of them depending on the terminal contents. I will have a look.

profbbrown commented 4 months ago

This happened to me, too (see #13576). There was no obvious cause of the behavior.

AlexandraBuzila commented 4 months ago

I changed the code in https://github.com/eclipse-theia/theia/pull/13735. The workspace search service (which in turn starts rg) should now only be triggered when a link is actually clicked. It was a bad idea to use it each time something was hovered in the terminal.
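
Roughly, the idea looks like the sketch below. This is not the code from the PR: `LazyLinkProvider`, `SearchFn`, and `provideLinks` are hypothetical names, not Theia APIs, and the regular expression is a simplified stand-in for real path detection. Hovering only does cheap pattern matching, and the expensive search (the part that would spawn rg) runs only when a link is activated.

```typescript
// Hypothetical stand-in for a workspace search that would spawn rg.
type SearchFn = (pattern: string) => Promise<string[]>;

interface TerminalLink {
    text: string;
    // Runs the search lazily, only when the link is clicked.
    activate(): Promise<string[]>;
}

class LazyLinkProvider {
    // Number of searches actually started, for illustration.
    searchCount = 0;
    private readonly search: SearchFn;

    constructor(search: SearchFn) {
        this.search = search;
    }

    // Called on hover: regex matching only, no search is started here.
    provideLinks(line: string): TerminalLink[] {
        // Simplified assumption: anything shaped like "some/path.ext" is a candidate.
        const pathLike = /[\w.\/-]+\.\w+/g;
        const matches = line.match(pathLike) ?? [];
        return matches.map(text => ({
            text,
            activate: () => {
                this.searchCount++;
                return this.search(text);
            }
        }));
    }
}
```

With this shape, hovering over a terminal full of path-like text creates link objects but starts no searches; each click starts exactly one.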

@msujew would you have time for a review? Thank you!

danepowell commented 4 months ago

I'd love to test this; if I'm building my app based on the Theia blueprint, is there a way to test specific PRs? I'm not sure how to build anything other than a semver release in package.json.

msujew commented 4 months ago

@danepowell You can check out the branch associated with the PR, then build and run Theia. You can see the instructions to build & run here. To consume it in a downstream app, you could publish it to a local npm registry such as verdaccio, but that is likely a bit of effort.