Open AlexDaniel opened 4 months ago
Note that previously I had CPU usage issues related to Vulkan, but now they're resolved. See https://github.com/zed-industries/zed/issues/13154 for context. Right now, Zed behaves correctly for me, and is fast, just not blazing fast :)
So did the input also get faster when the CPU issues got resolved?
@mrnugget yes, significantly! However, I did not measure/see the framerate previously (I can check if you tell me how), it is possible that with the compatibility renderer it was simply taking half a second or whatever to render the next frame, which is why it was so slow.
You can try ZED_MEASUREMENTS=1 zed --foreground
.
I'm wondering: should we close the ticket then? I've been digging into input latency a lot lately and I don't see any obvious things to make it faster. I suspect that XIM might make it a bit slower, but trying that out requires building from source.
Okay, with zed running correctly with vulkan, it's like this:
frame duration: 9.389796ms
frame duration: 5.361027ms
frame duration: 9.417434ms
frame duration: 5.335109ms
frame duration: 9.525249ms
frame duration: 14.980214ms β here's where I pressed the Right button
frame duration: 14.559212ms
frame duration: 15.588432ms
frame duration: 15.081285ms
frame duration: 6.376729ms
frame duration: 3.066679ms
frame duration: 4.300767ms
I'm wondering: should we close the ticket then? I've been digging into input latency a lot lately and I don't see any obvious things to make it faster. I suspect that XIM might make it a bit slower, but trying that out requires building from source.
Close why? If alacritty can output frames faster (at least to keyboard responses), then why not zed?
What's the refresh rate of your display? If it's 60Hz, then we aren't skipping frames and you're seeing the input in the next frame, right?
@mrnugget let's take a step back to the original report and reiterate.
Zed, unlike other editors and apps, is skipping 2 frames before rendering the result of a cursor movement.
This could indeed be due to some library, or event loop lag, or slow rendering, or whatever, but it's clearly something that can be improved considering that other editors are not having this issue.
Here are my measurements of other editors. In some of these I pressed the β¬ οΈ Left arrow instead.
Let me know if there's anything else that you'd be interested in testing.
Thanks @AlexDaniel! I'll take a look. I'll get back to you if I need more information. Would you be willing to compile & run Zed from source? (You'll need to compile in release mode -- cargo build --release
-- which is quite the CPU killer).
Also: how do you record these? I'm curious :)
@mrnugget I'm running from source already, no problem, just let me know what to checkout. I didn't know there was any other option at this point π
As for recording: I use OBS to record the screen at 60 fps. I put glxgears in the same frame to be 100% sure that OBS didn't skip any frames (it never does, but being able to see it instills confidence). Then I use mpv
to inspect the result frame by frame (.
,
to move by frames, s
to save the frame). Then there's just some imagemagick trickery to add numbers to frames and make a gif out of it. :)
I looked into this a little bit. One thing that seems to be a problem is that we send key-presses via the XIM protocol to handle compose keys, etc. On my machine that can take anywhere from 2-20ms. In a release build it's more often on the 2 end of the spectrum, but on a debug build it's frequently at the 20ms end (and this is on much more modern hardware).
It's not yet clear if this is how long it's taking for XIM to respond, or whether we're extending the time because we're contending on the event loop.
If you change this line from if xim_connected {
to if false {
does this cause any noticeable change for you?
@ConradIrwin I have changed that line as you suggested, rebuilt zed (wow that took a while) and tested it again. I don't see any difference. zed is skipping 1-2 frames just like before. I wish there was some way to measure end-to-end delay apart from the way I'm measuring frames, as that'd be able to show if this change was an improvement (even if not big enough).
Just as a thought, how is double buffering implemented in the linux version? Could it be that the keystrokes are handled quickly (which is likely the case), and the frames are rendered fast (as shown by frame durations), but it's just that zed is writing into a buffer that's going to be shown to the user after the current (already stale) one? That'd explain being off by at least 1 frame at all times.
I can't get OBS to reliably record at 60fps. (Even tweaked the nvidia driver settings). Or at least that's my explanation for the following.
But look at this video:
Screencast from 2024-06-20 12-18-35.webm
That's me stepping through a recording, frame by frame. See the time at the bottom left.
You can see that the first cursor movement takes 1 frame in the recording, but "4 frames" in the console output of Zed.
Then later in the recording, there's other keystrokes that are immediately rendered in the next frame of Zed. You can see the keypress event being printed and the keypress being shown in the next render call.
Another data point: I logged when the scene is marked as dirty and needs to be drawn (and not just presented).
You can see here that after the keypress the immediately next frame is marked as dirty and draw & present
is printed. the frame after that is only present
.
All of this makes me wonder whether the problem is that we do render at 60fps and do not miss a frame, but that we're not aligned with the recordings, etc.?
@mrnugget So, to summarize, on your recording, Zed is almost always skipping 1 frame except for the last case at the very end of the video. Is that correct?
Following the same description above:
I think the confusing part is that you see something in the terminal and then one might think that if Zed renders anything after seeing it in the terminal, then that's acceptable. However, it took your terminal at least 1 frame to show the result, and if at that frame Zed didn't render anything in the editor, then it's late.
Let me know if I got it wrong because that video is a bit confusing to me. π«£
I'm not sure, because it's hard to reliably record on which frame something shows up. According to the logs it's this:
But in the OBS recording, it looks like:
<recording frame starts>
<recording frame starts>
Here's a video of me jumping forward/backward a single frame in the recording (which should be 60fps!):
https://github.com/zed-industries/zed/assets/1185253/539ae5f9-c682-4dfb-9097-9073d11e772b
So, seems like my recording is not 60fps, more like 30fps, and not aligned with Zed's render loop.
Or, in other words: the logs show me that we don't skip a frame. We receive an input event, we handle it, we mark the window as dirty, we render.
The problem is that if X11 blasts us with events, then "handling input" might take too long, and we miss our next 16ms frame and are delayed.
But I can't see that it always takes 2 frames to render an input event.
This should help a lot with input latency: https://github.com/zed-industries/zed/pull/13355
Would be great if you can try it out!
I pulled from main and right now have this error when trying to start zed:
Zed failed to open a window: X11 error X11Error { error_kind: Value, error_code: 2, sequence: 115, bad_value: 0, minor_opcode: 0, major_opcode: 1, extension_name: None, request_name: Some("CreateWindow") }
Hmmmmmmmmm, that should be unrelated.
As far as I can tell, the error (error_kind: Value
) is this one:
BadValue | Some numeric value falls outside the range of values accepted by the request.
But we didn't change any values when opening a window.
Either way, I'm not sure how to test it now. π
Can you go on latest main
, revert the PR I merged (git revert f69c8ca74eb934f37d29cbe2f6ee31f94c7aaf63
) locally, then do cargo run
and see if that truly is the issue?
Yeah, same error, it's unrelated to your changes.
Interesting. Sorry for saying it, but: does a restart fix it?
(Also: I pushed some more debug information on main
for that error)
Yeah, I have tried restart already, it didn't help. Now with more information:
Zed failed to open a window: CreateWindow request to X server failed. depth: 24, x_window: 39845890, visual_set.root: 751, bounds.origin.x.0: 2, bounds.origin.y.0: 0, bounds.size.width.0: 0, bounds.size.height.0: 0
Caused by:
X11 error X11Error { error_kind: Value, error_code: 2, sequence: 115, bad_value: 0, minor_opcode: 0, major_opcode: 1, extension_name: None, request_name: Some("CreateWindow") }
@mrnugget OK, I see the issue.
I have replaced:
bounds.size.width.0 as u16,
bounds.size.height.0 as u16,
With:
200 as u16,
200 as u16,
And it started working. It seems that it was trying to open a window with 0 width and 0 height, which is not allowed.
@mrnugget OK, now that I'm able to run Zed, I was able to test it. Unfortuntaley, I see no improvement.
Moreover, I immediately see 100% CPU usage issue that's reported in this ticket: https://github.com/zed-industries/zed/issues/13409
I just want to pile onto this to say I am seeing a similar behavior. When trying Zed, the slow cursor was the first thing I noticed. I am running a MacBook Pro with M1 Pro, and VS Code's cursor feels noticeably faster when clicking around in some code. The difference is slight, but very much noticeable in a direct comparison. Would love to see this fixed!
Just leaving this here because it was discussed whether the issue should be closed, and I really think it shouldn't!
I just want to pile onto this to say I am seeing a similar behavior. When trying Zed, the slow cursor was the first thing I noticed. I am running a MacBook Pro with M1 Pro, and VS Code's cursor feels noticeably faster when clicking around in some code. The difference is slight, but very much noticeable in a direct comparison. Would love to see this fixed!
So are you talking about the macOS version or the Linux version?
And it started working. It seems that it was trying to open a window with 0 width and 0 height, which is not allowed.
Looking into that.
@mrnugget OK, now that I'm able to run Zed, I was able to test it. Unfortuntaley, I see no improvement.
That's very interesting, since we changed the way the loop works significantly and now very eagerly render.
Moreover, I immediately see 100% CPU usage issue that's reported in this ticket: #13409
Looking into it. Might have to revert the PR if I can reproduce.
I just want to pile onto this to say I am seeing a similar behavior. When trying Zed, the slow cursor was the first thing I noticed. I am running a MacBook Pro with M1 Pro, and VS Code's cursor feels noticeably faster when clicking around in some code. The difference is slight, but very much noticeable in a direct comparison. Would love to see this fixed!
So are you talking about the macOS version or the Linux version?
Apologies, I am running the ARM build of Zed on macOS on version 0.140.5. Let me know if I can help narrow this down somehow.
Apologies, I am running the ARM build of Zed on macOS on version 0.140.5. Let me know if I can help narrow this down somehow.
It'd be great if you could create a separate ticket for this, since macOS, Linux/Wayland, Linux/X11, and Windows have different implementations to render Zed and to receive input. So, while there is a possibility that there's an underlying problem shared by all platform implementations, for now I'd try to keep things separate.
As for narrowing it down: a video recording in which VS Code and Zed are compared wrt to input latency would be good.
@kvark sorry to ping you! But I'm wondering whether you have any thoughts on the whole topic of input latency and whether it is/isn't/could be/could not be related to how blade talks to the GPU.
I pulled the latest changes from main and rebuilt zed. Now, Zed starts up without crashing, and there's no 100% CPU problem anymore. But the cursor movement is still delayed by 1-2 frames.
Does "delayed by 1-2 frames" mean that it's similar to Alacritty/Emacs then and sometimes does an additional frame? Or do you mean it's 1+ frame behind them?
In any case: I'm not sure how to reproduce the video recording and/or fix this yet. I know that we mark the window as dirty and render the change on the next frame.
@mrnugget alacritty and emacs are the benchmark. If it was possible to have them all side by side, then:
Frame | Description | Alacritty/Emacs/Pluma | VS Code | Zed |
---|---|---|---|---|
Frame 0 | You press the keyboard button | |||
Frame 1 | You should see the result now | The result is visible | But not in VS Code | And not in Zed |
Frame 2 | Here it's late by 1 frame | VS Code renders the result here consistently | Sometimes, Zed renders here | |
Frame 3 | Here it's late by 2 frame | If not, then Zed renders here |
The distribution between being late by 1 and 2 frames is roughly speaking 50/50.
Thanks for summoning me! I agree it's an important matter, and Zed should be as good as the best examples out there in principle.
I checked the frame publishing, and it seems reasonable:
@AlexDaniel as with any graphics issues, could you share a log file from your machine made with RUST_LOG=blade_graphics=info
environment? I want to double-check if your machine is even using MAILBOX
[2024-06-27T12:35:38+03:00 INFO blade_graphics::hal::init] Enabling Vulkan Portability
[2024-06-27T12:35:38+03:00 INFO blade_graphics::hal::init] Enabling color space support
MESA-INTEL: warning: Ivy Bridge Vulkan support is incomplete
[2024-06-27T12:35:38+03:00 INFO blade_graphics::hal::init] Adapter "Intel(R) HD Graphics 4000 (IVB GT2)"
[2024-06-27T12:35:38+03:00 INFO blade_graphics::hal::init] No ray tracing extensions are supported
[2024-06-27T12:35:38+03:00 INFO blade_graphics::hal::init] Using surface present mode MAILBOX
[2024-06-27T12:35:38+03:00 WARN blade_graphics::hal::init] Unable to forbid exclusive full screen
[2024-06-27T12:35:38+03:00 INFO blade_graphics::hal::descriptor] Creating a descriptor pool for at most 16 sets
[2024-06-27T12:35:38+03:00 INFO blade_graphics::hal::descriptor] Creating a descriptor pool for at most 16 sets
[2024-06-27T12:35:38+03:00 INFO blade_graphics::hal::resource] Creating texture 0x56305c1898a0 of size 1024x1024x1 and format R8Unorm, name 'atlas', handle 0
[2024-06-27T12:35:38+03:00 INFO blade_graphics::hal::resource] Creating buffer 0x56305c18a510 of size 65536, name 'chunk-0', handle 1
[2024-06-27T12:35:38+03:00 INFO blade_graphics::hal::resource] Creating buffer 0x56305c260280 of size 4096, name 'chunk-0', handle 2
[2024-06-27T12:35:38+03:00 INFO blade_graphics::hal::init] Using surface present mode MAILBOX
[2024-06-27T12:35:38+03:00 WARN blade_graphics::hal::init] Unable to forbid exclusive full screen
[2024-06-27T12:35:38+03:00 INFO blade_graphics::hal::resource] Creating buffer 0x56305bf13a30 of size 19488, name 'chunk-1', handle 3
[2024-06-27T12:35:38+03:00 INFO blade_graphics::hal::resource] Creating buffer 0x56305c78c230 of size 4096, name 'chunk-2', handle 4
[2024-06-27T12:35:38+03:00 INFO blade_graphics::hal::resource] Creating buffer 0x56305c180530 of size 19488, name 'chunk-3', handle 5
[2024-06-27T12:35:38+03:00 INFO blade_graphics::hal::resource] Creating buffer 0x56305c16ba40 of size 23184, name 'chunk-4', handle 6
[2024-06-27T12:35:38+03:00 INFO blade_graphics::hal::resource] Creating buffer 0x56305c6b2550 of size 58464, name 'chunk-5', handle 7
[2024-06-27T12:35:38+03:00 INFO blade_graphics::hal::resource] Creating buffer 0x56305ccce670 of size 4096, name 'chunk-6', handle 8
[2024-06-27T12:35:39+03:00 INFO blade_graphics::hal::resource] Creating buffer 0x56305cbb4800 of size 58464, name 'chunk-7', handle 9
[2024-06-27T12:35:39+03:00 INFO blade_graphics::hal::resource] Creating buffer 0x56305c4c61d0 of size 65536, name 'chunk-1', handle 10
[2024-06-27T12:35:39+03:00 INFO blade_graphics::hal::resource] Creating buffer 0x56305bec45c0 of size 39760, name 'chunk-8', handle 11
[2024-06-27T12:35:39+03:00 INFO blade_graphics::hal::resource] Creating buffer 0x56305ccce040 of size 58464, name 'chunk-9', handle 12
[2024-06-27T12:35:39+03:00 INFO blade_graphics::hal::resource] Creating buffer 0x56305c183500 of size 17920, name 'chunk-10', handle 13
I looked at this a bit, tried to see the sequence of events in the code that goes from they key press to screen. You can see the changes in https://github.com/kvark/ Attaching example output: input-log.txt zed/commit/81be4900dfb80fe20d4487a2e85fee9778d80e8e
The hypothesis is that, if we have latency introduced by the logic, we'd see something like "key event, present, draw, present", i.e. some frames presented before they are built/drawn with the updated content (that is based on the key). Here is what I see in the output:
?? physical press
?? calloop getting signal
[2024-07-02T00:39:31.086610691-07:00 WARN gpui::platform::linux::wayland::client] event(key pressed)
[2024-07-02T00:39:31.086660636-07:00 WARN gpui::window] dispatch_event(keyboard)
// there is a pause around 14.5ms here, what is it triggered by?
[2024-07-02T00:39:31.101265109-07:00 WARN gpui::window] cx.draw+pesent
[2024-07-02T00:39:31.101355591-07:00 WARN gpui::window] draw
// 20ms spent building a frame (in debug mode)
[2024-07-02T00:39:31.121502790-07:00 WARN gpui::platform::blade::blade_renderer] draw
?? actual presentation by Vulkan
// cursor blink after 130ms?
[2024-07-02T00:39:31.134650619-07:00 WARN gpui::window] cx.draw+pesent
[2024-07-02T00:39:31.134733917-07:00 WARN gpui::window] draw
[2024-07-02T00:39:31.152328695-07:00 WARN gpui::platform::blade::blade_renderer] draw
[2024-07-02T00:39:31.157895026-07:00 WARN gpui::window] dispatch_event(keyboard)
[2024-07-02T00:39:31.166715483-07:00 WARN gpui::window] cs.present
I don't see any skipped frames in this part of the stack. I see a pause of 14.5 ms before we get to the refresh callback. On X11 that would be OK-ish, since the refresh timer is set at 60fps (most likely), so this delay would be around 8ms in average and 16ms at most. On Wayland I didn't dig into the event loop as much, yet. Any help/insight would be welcome!
There could also be gaps before these logs (i.e. raw input to when WL client gets it) and after (Vulkan submission to actual presentation). For the second part, I'm going to add tracking - there are ways to get the exact timing of presentation events out of the driver. I'm thinking of just exposing this as profiling::event
and then looking at a trace to see where things landed.
@AlexDaniel as for your log, it looks normal. The only suspicious part is "MESA-INTEL: warning: Ivy Bridge Vulkan support is incomplete", which could mean, for example, that their MAILBOX
support isn't working properly. But I don't know for sure, and I have a feeling it's mostly complete.
Just FYI, after the recent switch to mio and so on, there is no improvement. The cursor movement has the same delay as before.
However, perceptually, it feels like scrolling is now worse. I'll double checkβ¦
@mrnugget It seems there might be an issue with the recent versions of Zed, possibly related to the run loop.
In the gifs below I show that sometimes there's a huge delay between clicking on a file and it getting opened. And it's not just that, all interactions with zed feel delayed a bit (hovering over buttons, scrolling, etc.). It's the most visible when opening files, but I think everything is affected.
Old run loop (eb845ee20) | New run loop (79f364632) |
---|---|
@AlexDaniel I cannot reproduce this on my end. Opening files is immediate.
Am I seeing this correctly, that clicks are rendered right away (the blue border showing up), but then it takes a while until the file is loaded?
In the range you provided there's eb845ee..79f3646 there's 60 commits. Can you try out two specific commits?
@AlexDaniel I added some trace-level logging here: https://github.com/zed-industries/zed/pull/13928 -- If you run Zed with this commit included (don't forget --release
) and with RUST_LOG=gpui=trace
-- does anything stand out when you click on a button in the project panel?
Yes, you noticed correctly, the clicks are immediate but the file appears significantly later. If you look even closer, you'll see that the hover highlighting of buttons is also delayed (the mouse moves over the button, but it gets highlighted later). I'll test the commits a bit later today.
@AlexDaniel I've reverted most of the changes from the mio branch for now. It turned out that under high system load, we were falling behind on the runnable queue, leading to this kind of behaviour
We're now mostly back to where we were before, but with some extra load shedding if we're getting keyrepeat events faster than we can keep up.
Still more work to do, but I think we need to do a more holistic render performance pass
@mrnugget, @ConradIrwin I have tested it on 64755a7ae, and I can't reproduce it there. Meanwhile, I can reliably reproduce the issue on 79f364632535ef85c190fcf1be9834bbffc19c95. So the issue appears after the mio loop change (and not because of it). I can bisect it further, please let me know if that's important.
However, there's no problem on the latest zed (d32e9f759c289a), so I guess the issue is irrelevant now, unless you want to know the real cause.
I looked at this a bit, tried to see the sequence of events in the code that goes from they key press to screen. You can see the changes in https://github.com/kvark/ Attaching example output: input-log.txt zed/commit/81be4900dfb80fe20d4487a2e85fee9778d80e8e
The hypothesis is that, if we have latency introduced by the logic, we'd see something like "key event, present, draw, present", i.e. some frames presented before they are built/drawn with the updated content (that is based on the key). Here is what I see in the output:
?? physical press ?? calloop getting signal [2024-07-02T00:39:31.086610691-07:00 WARN gpui::platform::linux::wayland::client] event(key pressed) [2024-07-02T00:39:31.086660636-07:00 WARN gpui::window] dispatch_event(keyboard) // there is a pause around 14.5ms here, what is it triggered by? [2024-07-02T00:39:31.101265109-07:00 WARN gpui::window] cx.draw+pesent [2024-07-02T00:39:31.101355591-07:00 WARN gpui::window] draw // 20ms spent building a frame (in debug mode) [2024-07-02T00:39:31.121502790-07:00 WARN gpui::platform::blade::blade_renderer] draw ?? actual presentation by Vulkan // cursor blink after 130ms? [2024-07-02T00:39:31.134650619-07:00 WARN gpui::window] cx.draw+pesent [2024-07-02T00:39:31.134733917-07:00 WARN gpui::window] draw [2024-07-02T00:39:31.152328695-07:00 WARN gpui::platform::blade::blade_renderer] draw [2024-07-02T00:39:31.157895026-07:00 WARN gpui::window] dispatch_event(keyboard) [2024-07-02T00:39:31.166715483-07:00 WARN gpui::window] cs.present
I don't see any skipped frames in this part of the stack. I see a pause of 14.5 ms before we get to the refresh callback. On X11 that would be OK-ish, since the refresh timer is set at 60fps (most likely), so this delay would be around 8ms in average and 16ms at most. On Wayland I didn't dig into the event loop as much, yet. Any help/insight would be welcome!
There could also be gaps before these logs (i.e. raw input to when WL client gets it) and after (Vulkan submission to actual presentation). For the second part, I'm going to add tracking - there are ways to get the exact timing of presentation events out of the driver. I'm thinking of just exposing this as
profiling::event
and then looking at a trace to see where things landed.
I believe the Wayland implementation only initiates rendering when it receives a frame callback from the server, which for most Wayland compositor impls is after vblank (some compositors like hyprland try to be smarter and send it a bit later to reduce latency). That could explain the around ~14ms pause between key press->draw. Do other platform impls of Zed do this differently and start rendering as soon as a key press is received?
Do other platform impls of Zed do this differently and start rendering as soon as a key press is received?
No, on X11 we send Expose requests to the server on a fixed interval (60fps) and then render/present when receiving the expose response back (or whenever else server sends it).
On macOS we use CVDisplayLink and render when the OS tells us to so we align with vsync.
On macOS we use CVDisplayLink and render when the OS tells us to so we align with vsync.
This doesn't actually make you render the screen in sync with vsync. All we are doing is starting the rendering when CVDisplayLink
fires. Then we spend N milliseconds recording a frame, then M milliseconds executing it on GPU, and only then at some point there is a display refresh that picks up our rendered frame. I.e. if you are on a 60Hz display (a typical case), then the difference between CVDisplayLink
-driven rendering and just a good 60fps timer is likely to be imperceptible.
What you'd ideally do is to live-estimate how much time (N+M) the app will take to present a frame, and then somehow wake up / start rendering ahead of this time before the actual present, plus some extra time. But that's not what CVDisplayLink
is doing.
Here's the description of what happens, frame by frame:
xinput test 9
β 9 is the id of my keyboard that I found withxinput list
). Note that I'm usingalacritty
.It seems that
zed
is not immediate enough (at least 2 frames/33ms slower than it could be). Personally, I can feel that difference when comparing to e.g.alacritty
, and I thinkzed
can do better, especially considering its goal to become the faster editor.Just FYI, in some cases Zed responds on Frame 2, but never on Frame 1.
Here's the GIF with these 4 frames:
Just in case, all 4 frames separately:
Environment
This laptop is relatively old (Thinkpad x230 from 2012 or so). It is possible that on newer systems the experience is different. I can probably test the same in other editors, if you're interested, but I think that stock emacs is immediate (though I haven't measured yet, going by feel).