onivim / oni

Oni: Modern Modal Editing - powered by Neovim
https://www.onivim.io
MIT License
11.35k stars 301 forks source link

High CPU Usage #1822

Open hkupty opened 6 years ago

hkupty commented 6 years ago

I've observed for the last release (0.3.1) a big CPU usage on when redrawing the screen. Idle usage keeps using around 2% CPU (which is strange TBH), but when I do something (say, move the cursor or split the screen, CPU usage for the same process can go up to 80% and then drop again.

Also, it is consistently observed that oni spawns and keeps 50+ processes open at the same time with very few actually doing something.

This can make code editing experience annoying and frustrating.

oni-bot[bot] commented 6 years ago

Hello and welcome to the Oni repository! Thanks for opening your first issue here. To help us out, please make sure to include as much detail as possible - including screenshots and logs, if possible.

akinsho commented 6 years ago

@bryphe in relation to oni and its processes I've been looking at execa which is a library which is a drop in replacement for child_process with some really great added advantages one of which is relevant here aka automatically kills child processes if the parent dies which cp doesnt not do

bryphe commented 6 years ago

Thanks for the report @hkupty - there's a few pieces of additional info that would be useful:

It would also be very helpful to have grab and save a performance profile - https://github.com/onivim/oni/wiki/Performance-Profiling - this can help us determine whats pegging the CPU.

One potential culprit could be the editor.textMateHighlighting.enabled flag - this was turned on by default in 0.3.1, and it can incur some extra CPU for determining the syntax highlighting. Would be worth trying to set that to false. (Animations can also cause the idle CPU usage you mentioned - you may want to try setting ui.animations.enabled to false)

Also, it is consistently observed that oni spawns and keeps 50+ processes open at the same time with very few actually doing something.

That amount of processes spawned is strange and seems like a bug... and could be related (spawning a process incurs CPU usage). Do you see which processes are being spawned? Are they language servers / etc? Are they being spawned during certain operations? We can get that information from the logs as well if you can attach it.

execa which is a library which is a drop in replacement for child_process with some really great added advantages one of which is relevant here aka automatically kills child processes if the parent dies which cp doesnt not do

Yes, we definitely need to improve our 'clean up' hygiene when closing Oni. @Akin909 - Do you suspect that some of the hanging processes are child processes of a parent process that was spawned by Oni, and they aren't going away when the parent dies? Curious if you've seen this for a particular process.

akinsho commented 6 years ago

apologies @hkupty for co-opting/not properly addressing your issue, @bryphe I think we previously discussed hanging process unfortunately I have no idea which issue that was, I primarily see it with the lsp where lang server hang around in the activity monitor, also a few unnamed node processes (whilst not running anything in my terminal). Also I'm not entirely clear on whether electron does all requisite clean up when oni is closed

hkupty commented 6 years ago

What OS are you using?

I'm using arch, with both nvim and oni built from aur.

What kind of file are you editing when you see the spike? Does it happen in any sort of file?

It happens on any file, even when opening splits on empty buffers.

Do you have any customizations to config.tsx or init.vim?

Although I have customizations, I've tested against no init.vim and produces the same result.

What plugins do you use with Oni / vim?

I've several plugins, but even without any plugin it behaves the same.

I'll try profiling and giving a better clue on that.

hkupty commented 6 years ago

I've found two things happening during profile: One is that it throws several exceptions as this one:

bundle.js:1 Error: No content available.
    at TypeScriptServerHost._parseResponse (TypeScriptServerHost.ts:339)
    at Interface.<anonymous> (TypeScriptServerHost.ts:84)
    at emitOne (events.js:115)
    at Interface.emit (events.js:210)
    at Interface._onLine (readline.js:278)
    at Interface._normalWrite (readline.js:420)
    at Socket.ondata (readline.js:138)
    at emitOne (events.js:115)
    at Socket.emit (events.js:210)
    at addChunk (_stream_readable.js:252)

The second thing is that even though I explicitly disabled quickinfo, it still shows up.

I've pasted the oni config I'm using here. There is no init.vim on the default directory.

justinmk commented 6 years ago

Idle usage keeps using around 2% CPU

For me this is isolated to the "Oni Helper" process (there's a separate "Oni" process at 0% idle).

Also I'm not entirely clear on whether electron does all requisite clean up when oni is closed

Nvim core dealt with a similar issue recently: https://github.com/neovim/neovim/pull/8107/commits/8d90171f8be6b92d7186ca84c42a0e07c2c71908

Is oni spawning processes using nodejs, or Nvim jobstart()/system()?

bryphe commented 6 years ago

Thanks for the extra info and for doing some profiling!

It turns out that we have multiple processes listed as Oni Helper - one is our renderer process (which is the actual window that runs the core of Oni - shell, canvas, neovim interaction, etc). Also, processes that we spin up during the course of execution are listed as Oni Helper, too.

One thing that can help significantly is figure out whether the high-CPU usage is due to our renderer process, or due to one of the processes we spawn.

If you open devtools (Control+Shift+P or Meta+Shift+P on Mac -> Open DevTools), you can get the renderer process by running: process.pid.

You can find the list of spawned processes by running Oni.process._spawnedProcessIds. Knowing whether it is the renderer process or one of our spawned processes is a good first step. For the spawned processes, it'd be helpful to get the command line used for it (that's potentially something we could/should log, but I think we can get it from the terminal too if we know the PID)

I've pasted the oni config I'm using here. There is no init.vim on the default directory.

@hkupty - That's a solid config for performance. I believe the issue may be related to what you mentioned earlier - that lots of processes were being spawned.

In addition, based on what you mentioned, if no files are open - the TypeScriptServerHost shouldn't even be active. This seems like a bug, that it is even starting up - definitely shouldn't be spun up for empty splits and things.

Is oni spawning processes using nodejs, or Nvim jobstart()/system()?

@justinmk - We use nodejs - spawn and exec (depending on whether it's a long-lived process or we just are running it for output). The code is here: https://github.com/onivim/oni/blob/5678f045df06acdb80a3ff6e26c1f8a9c082c648/browser/src/Plugins/Api/Process.ts#L90

Thanks for sharing the issue - I need to look and see how node handles this, it could be we're not configuring the spawned process correctly to be associated with the process group.

akinsho commented 6 years ago

@brype been trying to get snapshots of points of high memory usage as recently my Quad Core 2.8ghz (specs here just to point out that hopefully it isnt my machines limitations) mac has been experiencing really high cpu usages that result in system crashes, tbh not entirely sure whether this is down to only oni but I have some screens of high cpu usage related to oni although tbh shortly after it starts my laptop crashes, so hard to get associated PIDs. Keep in mind I very often have 2-3 oni windows open

screen shot 2018-04-05 at 11 16 43 screen shot 2018-04-04 at 17 17 52

In development:

screen shot 2018-04-04 at 20 55 20

Node process: (building oni)

screen shot 2018-04-04 at 20 18 15

Will keep attempting to profile and get the pids for the process

bryphe commented 6 years ago

Thanks for the help troubleshooting, @Akin909 !

@brype been trying to get snapshots of points of high memory usage as recently my Quad Core 2.8ghz (specs here just to point out that hopefully it isnt my machines limitations)

Yes, I think there is one question still remaining to be answered - whether the high CPU usage is due to the renderer process or a spawned process.

image

Would be great if we could find out if that PID 402 is the renderer process, or if it is something else (like the TypeScript language service, etc). We could add additional bookkeeping to https://github.com/onivim/oni/blob/master/browser/src/Plugins/Api/Process.ts - it might be helpful to have a command like getSpawnedProcessInfo that returns the PIDs and the command used to start them - that way we could grab that info real quick from the console like Oni.process.getSpawnedProcessInfo(), and it could tell us right away the PIDs + commands Oni is responsible for spawning (it would also help in the case reported above where lots of processes are started).

Another approach to tackling this is to start disabling features in Oni until the problem goes away (or even rollback to previous commits), and then bring them back. Removing plugins on both the Vim/Oni side could be helpful as well as part of this. Alternatively, doing a binary search (finding the 'last known good build') can help us track down a potential culprit - although its tough if the repro isn't consistent.

A couple potential culprits I'd suspect:

For development it's expected the memory usage and CPU will be higher, because there is much more verbose logging, and the dev tools for redux stashes a bunch of extra data (old states) - so that's a possible culprit there. But for production, it should be much better.

Thanks for the help troubleshooting, @Akin909 ! 👍

justinmk commented 6 years ago

Is there a way to "break" on the current instruction in the dev tools debugger? I tried hitting the "pause" button but it doesn't show any source code.

Or take a snapshot of the stacktrace somehow (I assume nodejs has a tool to do this).

During a hot path, breaking a few times in a debugger will reveal the problematic code after a few stacktrace snapshots.

bryphe commented 6 years ago

Is there a way to "break" on the current instruction in the dev tools debugger? I tried hitting the "pause" button but it doesn't show any source code.

Definitely, the pause button is the way to do it, but unfortunately it might not catch all cases (like, sometimes the bottleneck is in non-javascript code, like layout - you don't get a callstack for that).

The best bet for profiling in the renderer process is to use the performance tab while performing a heavy action: image

Those tools are actually really great now - they've come a loooong way from when I first had to profile JS - showing the frametime, a timeline of code execution, input / animation events, etc. You can see where 'hot paths' of the code are, and importantly determine where the bottleneck is (script code / layout / rendering / painting). I have some more info on a wiki page here: Performance Profiling

It works well if the bottleneck is in the renderer process, and there is a way to initiate a slow action, since you can start a profile and put it under the microscope. However, if the bottleneck is another process (like a language server), this won't be as helpful, since it won't show up in the renderer process's profile.

Once we are confident in which process is slow / experiencing problems, we could attach a profiler to that process as a next step (or try other things - for example, if a language server is running hot, it's worth looking at how we interact with it - even if it doesn't show up on our profile, it might be we're too chatty or sending something incorrectly that's causing it to throw/catch repeatedly, etc..)

akinsho commented 6 years ago

I'll have a closer look as some of this is sporadic and tbh v. recent haven't had any issue till a week or 2 ago, I'll try attaching the commands to the PID since I think longterm that'd be useful functionality to have and I'm often able to get the debugger open quick enough to get the list of PIDs but they aren't attached to anything.

Also this is entirely anecdotal but I think the problem seems to be worse when I have multiple oni instances open in ts projects.

Will have a go with the PID and commands and report back re the origin of the processes

bryphe commented 6 years ago

I'll try attaching the commands to the PID since I think longterm that'd be useful functionality to have and I'm often able to get the debugger open quick enough to get the list of PIDs but they aren't attached to anything.

Awesome, thanks! I agree, I think this would be useful to have in general 👍 Certainly would make it easier to troubleshoot these issues down the road.

Also this is entirely anecdotal but I think the problem seems to be worse when I have multiple oni instances open in ts projects.

Definitely possible this is related. I've seen cases before where the typescript language server can have problems in large projects / large # of node_modules... Are you still experimenting with theia's language server, or using the default one?

akinsho commented 6 years ago

I'm using the local one atm theia works quite will with diagnostics and is generally faster but an ongoing issue I have is that something I believe oni or theia does takes the lsp down repeatedly (which is a blocker for having theia on all the time) I havent determined a pattern and with tss server issues I've since discovered that they get reported on the typescript repo but dont seem to ever have details included of why or where the reporters are noting them.

hkupty commented 6 years ago

Is Language Server support toggleable? I've disabled auto completion and quickinfo (which still shows up though). I know that other Language Server features are reactive, but I think this is the only feature of oni I didn't disable to test the performance.

hkupty commented 6 years ago

I tested the latest version and is much more performant. Still, there is a big number of processes and I think there is still room for improvement. I'll keep this issue open.

akinsho commented 6 years ago

@hkupty in answer to

Is Language Server support toggleable? I've disabled auto completion and quickinfo (which still shows up though). I know that other Language Server features are reactive, but I think this is the only feature of oni I didn't disable to test the performance.

it isn't quite toggleable but if in a non-typescript file you set the lsp argument to an empty string the lsp won't start up although oni will continually check for an lsp so its not quite turned off.

whilst currently looking to get another lsp setup I've captured on a few circumstances where it seemed like the ts-lsp was requiring fairly high cpu usage, but I haven't had a chance to investigate properly

josemarluedke commented 6 years ago

I'm also seeing High CPU usage. I have seen it a lot when working with TS projects. Note, seem this with the latest version as well with explorer auto refresh disabled.

jeffbdavenport commented 6 years ago

Hey there guys, I've recently just started using ONI, and I am very excited about its goals, and I want to help pitch in when I can, but I have to say, I am monumentally frustrated with how unusable it is compared to using something like gvim due to the high CPU usage

If I open up a large file, even just a few hundred lines of Ruby code, even just holding down the down key to scroll to the bottom of the file will cause the entire computer to freeze up for 10s of seconds before continuing to scroll down.

I have a quad-core CPU, which isn't terrible and should be able to handle a text editor while it being the only thing it's running.

This is my CPU: Intel(R) Core(TM) i5-2540M CPU @ 2.60GHz

I have a minimalist setup, disabled everything I can think of with ONI

Here is my configuration:

"use strict";
exports.__esModule = true;
exports.activate = function (oni) {
    console.log("config activated");
    // Input
    //
    // Add input bindings here:
    //
    oni.input.bind("<c-enter>", function () { return console.log("Control+Enter was pressed"); });
    //
    // Or remove the default bindings here by uncommenting the below line:
    //
    // oni.input.unbind("<c-p>")
};
exports.deactivate = function (oni) {
    console.log("config deactivated");
};
exports.configuration = {
    //add custom config here, such as
    "ui.colorscheme": "none",
    //"oni.useDefaultConfig": true,
    //"oni.bookmarks": ["~/Documents"],
    //"oni.loadInitVim": false,
    //"editor.fontSize": "12px",
    //"editor.fontFamily": "Monaco",
    // UI customizations
    "editor.completions.enabled": false,
    "ui.animations.enabled": false,
    "ui.fontSmoothing": "none",
    "autoClosingPairs.default": [
        {
            "open": "\"",
            "close": "\""
        },
        {
            "open": "'",
            "close": "'"
        },
        {
            "open": "{",
            "close": "}"
        },
        {
            "open": "[",
            "close": "]"
        },
        {
            "open": "(",
            "close": ")"
        }
    ],
    "workspace.defaultWorkspace": "/home/user/project",
    "workspace.autoDetectWorkspace": "always",
    "oni.useDefaultConfig": false,
    "commandline.mode": false,
    "learning.enabled": false,
    "achievements.enabled": false,
    "editor.textMateHighlighting.enabled": false,
    "editor.typingPrediction": false,
};

I figured this must be due to excessive javascript running, so I ran a performance test in the console and got what I think is interesting results

All of this happened within just scrolling down a couple hundred lines of code:

image

image

image

image

image

image

It seems that nearly 70% of the processing time is spent in this handleKey function. From what I'm looking at, it seems that key events get fired off to do additional processing in the background, but the key events keep getting fired off to do the same thing hundreds of times when it's already being processed.

Almost none of that time is spent rendering.

It seems that there could be some logic here to figure out if what needs to be processed is already being done, but I'm not sure if I understand the inner workings enough to solve the problem myself.

ADDITIONAL INFO

public handleKey is on line 152 of https://github.com/onivim/oni/blob/master/browser/src/Services/InputManager.ts

It seems that on line 167 it is going through a list of potential keys on each and every one of these events

bryphe commented 6 years ago

Hey @jeffreydvp ,

Thanks a lot for the detailed and thoughtful investigation! Those perf traces are extremely helpful in tracking down the issue. Appreciate you taking the time to dive in and take a look 💯

Indeed those 'stacked' keydown events not what we want to see - that means that the keydown processing is taking so long that it extends into the next keydown event, and that will kill the responsiveness of the app.

In this particular case, in that handleKey method, we go over recent key presses to see if any of them matched chorded key bindings. However, we weren't putting a limit on that - so when you hold a key down, that array could grow very large which would cause the map and cloning to be come expensive. The array gets bigger and bigger as the keydown events stack up, too, until the CPU is pegged.

The fix is pretty simple - we just limit the amount of recent keys we check. With that change, I don't see the handleKey method show on the profile. Note that this is specific to this case of the key being held down. I'll send out a PR for that.

For the 'high-performance' config, I'd recommend using the WebGL renderer, too - it wouldn't necessarily help this issue as rendering isn't the bottleneck, but in general its significantly faster than our canvas renderer strategy:

"editor.renderer": "webgl",
jeffbdavenport commented 6 years ago

@bryphe Thanks for looking into this so quickly! I tried figuring out a way to bypass the method altogether in my own build, which fixed the problem completely, however, I was unable to do things such as using Ctrl+P to open a file, so I am glad you found the real problem!

bryphe commented 6 years ago

@jeffreydvp - teamwork! thanks a lot for the detailed traces 👍 Let me know if #2208 works for you.

jeffbdavenport commented 6 years ago

@bryphe It worked great!, So I am running into High CPU usage again, except this time it is simply just sitting there idling.

I did another performance scan

image

According to this it looks like 70% of the scripting that's being done is spent inside this oni-git plugin that appears to be updating the git branch indicator. How often does this happen? This definitely should not be happening often enough to cause this much CPU usage

bryphe commented 6 years ago

@jeffreydvp - thanks for this next round of performance investigation! Again, really appreciate you diving in and grabbing the trace 💯

This definitely is not expected behavior. It actually looks like it could be related to what @hkupty mentioned when opening this issue, and definitely not what we want to see.

From the trace, iIt's specific to the oni-plugin-git updating the git status (which it does by spinning up a shell process, running git rev-parse --abrev-ref HEAD, and parsing the output). I'll cc @Akin909 - he knows a bit more about the git integration than I do, and has been doing work on it (#1310), so he can correct me if I'm wrong on any points.

The plugin does this update in these cases:

        Oni.editors.activeEditor.onBufferEnter.subscribe(
            async evt => {
                console.log("DEBUG - onBufferEnter");
                await updateBranchIndicator(evt);
            })
        Oni.workspace.onFocusGained.subscribe(async buffer => {
            console.log("DEBUG - onFocusGained");
            await updateBranchIndicator(buffer);
        })

onBufferEnter should only be dispatched if we're moving to a new buffer, and onFocusGained should only be dispatched if the Oni window gains focus. It'd be great if we could narrow down which event is causing this.

So a couple of clarifying questions:

I'm first of all curious if there is a case where we are triggered these events aggressively, because aside from git, it could impact performance of other dependencies (like language services - we don't want to spam them with buffer enter events). Would be great if we could figure out why one of those events is being triggered so frequently when it's idle!

Secondly, _we need a more deterministic way to figure out if we should re-query git status - it'd be better if we could move from a 'polling' model to respond to some event. Some git plugins use a file system watcher on the .git folder, etc. In addition, we should not be querying for git status for non-git workspaces, (which is happening today). We should consider these in the work for #1310 - the interface specified in #2210 has a canHandleWorkspace method which helps with this latter issue.

A short-term fix, for now, would be to expose a configuration setting that would allow us to disable this plugin (I also want a way to disable plugins from the sidebar / configuration as part of our plugin management story - #186 - but we don't have that today, yet), but we also will need to fix those above two points.

@Akin909 , you've done a lot of work here too - any other ideas?

jeffbdavenport commented 6 years ago

@bryphe @Akin909, Yes I am using a workspace and it is a fairly large repo, about 300MB in total size. It seems to happen even if I am just working with a single file, but I will see if I can dig further

When I ran this performance scan, I only had 5 buffers open, and I wasn't switching between them or anything, I literally let the scan run for 60 seconds, while I was doing nothing

What was even more bizzare is the excessive CPU Usage was even happening while I was spending a significant amount of time with Oni in the background. (Not the active window)

akinsho commented 6 years ago

@jeffreydvp @bryphe i've been spending of a lot of time on the rework of the git functionality and am currently looking to stop it checking outside of a git repo which has been the case for a while unfortunately although I didnt realise it was so expensive it ought only to check during the events @bryphe pointed out aka onBufferEnter on onFocusGained, one thing to note is that we use exec but the process(es) aren't killed on exit.

As to why its running so often I wonder if as @bryphe mentioned theres a scenario where the events are being triggered v. frequently and we end up with a bunch of promises and a bunch of processes.

I'm currently looking into the how and when the plugin should work as I'm adding some initial sidebar functionality, and am looking to make sure the plugin only runs in a repo and is inactive otherwise. I've also changed the way we execute (using simple-git which handles the processes better than we currently do) so hopefully that will resolve its current behaviour