Closed sandangel closed 2 years ago
This is a bug in the GPU driver.
Hi @kovidgoyal,
Unfortunately I'm running into the same issue. You mention that it's a bug in the GPU driver (vmwgfx) and I'd like to find out what I can do to get it resolved. I guess that @sandangel and I are basing our setup on the same example, set by Mitchell Hashimoto: https://github.com/mitchellh/nixos-config
I've tried different terminals (rxvt-unicode, wezterm, alacritty) and they all seem to work fine with showing the visual mouse selection. Could you help me point to the code in Kitty where the interaction with the GPU driver might fail? Is Kitty using a particular method that the other editors are not (as far as you know of course).
Any help in pointing me in the right direction for further investigation is highly appreciated and just wanted to say that I really enjoy using Kitty and appreciate the effort that you continue to show in developing and supporting Kitty.
There is nothing special in how visual selections are displayed. They are done by simply changing the background and foreground colors in the selected cells. If you want to trace the code look in shaders.c and cell_vertex.glsl
I had a debugging session with @SpexGuy and we tracked down the issue to the fact that the driver doesn't support single byte vertex attributes. Changing is_selected
to be a GL_UNSIGNED_INT
makes everything work again.
If I understand correctly this is not something unique to this driver (Martin knew to look for this issue because it's also the case with other GPUs), but it's clearly not super common if hasn't been reported in other issues (searched briefly, couldn't find any reference).
Here's a patch to fix the problem (also includes a couple changes to shell.nix
that I needed on NixOS 22.11). While the patch does indeed fix the problem it does also increase memory consumption, which is a complete waste for machines that do support single byte vertex attributes. If @kovidgoyal is interested in looking for a solution that works for everybody (be it some build flag or reworking the approach to rendering selected cells), great, otherwise I guess this is just just a patch for those of us who are affected by the bug.
In this second case maybe it would make sense for @mitchellh to consider tweaking mitchellh/nixos-config
to make use of it (only the C diff, the build script used by nix packages is already correct). A few people would certainly be appreciative of that :^)
From 06e3b236cd59b4ca1d4f29618fca1f01dcd9a991 Mon Sep 17 00:00:00 2001
From: Loris Cro <kappaloris@gmail.com>
Date: Tue, 17 Jan 2023 13:03:42 +0100
Subject: [PATCH] vmware 3d-accel compat
---
kitty/screen.c | 4 ++--
kitty/shaders.c | 4 ++--
shell.nix | 2 ++
3 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/kitty/screen.c b/kitty/screen.c
index e52dd98f..7e7d0a1f 100644
--- a/kitty/screen.c
+++ b/kitty/screen.c
@@ -2391,12 +2391,12 @@ iteration_data_is_empty(const Screen *self, const IterationData *idata) {
}
static void
-apply_selection(Screen *self, uint8_t *data, Selection *s, uint8_t set_mask) {
+apply_selection(Screen *self, uint32_t *data, Selection *s, uint8_t set_mask) {
iteration_data(self, s, &s->last_rendered, -self->historybuf->count, true);
for (int y = MAX(0, s->last_rendered.y); y < s->last_rendered.y_limit && y < (int)self->lines; y++) {
Line *line = visual_line_(self, y);
- uint8_t *line_start = data + self->columns * y;
+ uint32_t *line_start = data + self->columns * y;
XRange xr = xrange_for_iteration(&s->last_rendered, y, line);
for (index_type x = xr.x; x < xr.x_limit; x++) line_start[x] |= set_mask;
}
diff --git a/kitty/shaders.c b/kitty/shaders.c
index a4ae4e90..ca6634df 100644
--- a/kitty/shaders.c
+++ b/kitty/shaders.c
@@ -233,7 +233,7 @@ create_cell_vao() {
A1(colors, 3, GL_UNSIGNED_INT, fg);
add_buffer_to_vao(vao_idx, GL_ARRAY_BUFFER);
- A(is_selected, 1, GL_UNSIGNED_BYTE, NULL, 0);
+ A(is_selected, 1, GL_UNSIGNED_INT, NULL, 0);
size_t bufnum = add_buffer_to_vao(vao_idx, GL_UNIFORM_BUFFER);
alloc_vao_buffer(vao_idx, cell_program_layouts[CELL_PROGRAM].render_data.size, bufnum, GL_STREAM_DRAW);
@@ -399,7 +399,7 @@ cell_prepare_to_render(ssize_t vao_idx, ssize_t gvao_idx, Screen *screen, GLfloa
}
if (screen->reload_all_gpu_data || screen_resized || screen_is_selection_dirty(screen)) {
- sz = (size_t)screen->lines * screen->columns;
+ sz = (size_t)screen->lines * screen->columns * sizeof(uint32_t);
address = alloc_and_map_vao_buffer(vao_idx, sz, selection_buffer, GL_STREAM_DRAW, GL_WRITE_ONLY);
screen_apply_selection(screen, address, sz);
unmap_vao_buffer(vao_idx, selection_buffer); address = NULL;
diff --git a/shell.nix b/shell.nix
index 4e5744fd..5c731ae2 100644
--- a/shell.nix
+++ b/shell.nix
@@ -14,6 +14,7 @@ mkShell rec {
ncurses
lcms2
librsync
+ openssl.dev
] ++ optionals stdenv.isDarwin [
Cocoa
CoreGraphics
@@ -57,6 +58,7 @@ mkShell rec {
shellHook = if stdenv.isDarwin then ''
export KITTY_NO_LTO=
'' else ''
+ export KITTY_FONTCONFIG_LIBRARY='${fontconfig.lib}/lib/libfontconfig.so'
export KITTY_EGL_LIBRARY='${lib.getLib libGL}/lib/libEGL.so.1'
export KITTY_STARTUP_NOTIFICATION_LIBRARY='${libstartup_notification}/lib/libstartup-notification-1.so'
export KITTY_CANBERRA_LIBRARY='${libcanberra}/lib/libcanberra.so'
--
2.38.1
I would be pretty reluctant to increase resource consumption (both memory and CPU -> GPU bandwidth for everybody to workaround an issue with a single GPU driver (havent heard of this affecting any other GPU driver).
I would consider doing this at runtime based on GPU detection, since it is a pretty simple patch. If you can come up with some reasonably robust code that identifies when we are running on an affected system, I can easily adapt your patch to be conditioned on a global set at startup.
hi @kristoff-it , could you share with us your nix-config to apply your patch to fix this issue? I really appreciate it.
I just wanted to note that I reported this issue to someone with access to the VMware graphics driver team, so hopefully this can get on their radar.
Thanks for the ping on this @mitchellh We've (VMware) started investigating... The team notes that it would 'probably be really easy to fix' if we had an apitrace of the app that captures the problem (i.e. just record a trace on a system where this issue is happening). We could then add that apitrace file to our testing and make sure we never regress
Follow up, but we'll actually also take an apitrace of a working scenario... it'd be the same investigation for us... what are the calls being made. We'd then put that against our stack and see what fails.
Here's a trace from a working system. kitty.trace.zip
Hi @mikeroySoft , may I ask if there is any update on this issue?
This was fixed upstream by my colleague Charmaine Lee on https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22568
The issue was in a CPU fallback, used to expand the GL_UNSIGNED_BYTE
array to GL_UNSIGNED_INT
. Mesa uses JIT generated assembly code on x86 for this, but plain C code on Arm, and the C code had a bug.
There's also an upcoming follow up change that will avoid the CPU fallback altogether.
@jrfonseca does it mean if we update mesa this issue will be fixed right? Coool. Not sure if it's released yet and which version this fix is included.
@sandangel, the fix was cherry-picked into staging/23.1
branch, so I expect it will be included on Mesa release 23.1, and subsequent releases.
Thanks a lot for sharing. Really appreciate it.
I would be pretty reluctant to increase resource consumption (both memory and CPU -> GPU bandwidth for everybody to workaround an issue with a single GPU driver (havent heard of this affecting any other GPU driver).
@kovidgoyal, for the record, there are a few other Mesa drivers which are essentially expanding the bytes to uints behind the scenes. Pretty much every driver which reports PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY==1
, namely:
d3d12
(Microsoft OpenGL driver for WSL)etnaviv
(Vivante)freedreno
(Adreno A2xx)nv30
r300
r600
Some of these are very old. And nobody sees the bug on x86 because Mesa uses a JIT assembly to do this on x86, which didn't had the bug.
Our fix applies to all the above. And we believe we can avoid the CPU fallback on svga
driver (Intel or Arm), but keep in mind some drivers will continue to do the byte to uint expansion behind the scenes.
OK, thanks.
Describe the bug The selection works (copy on select), but there is no visual effect.
To Reproduce Steps to reproduce the behavior:
Screenshots If applicable, add screenshots to help explain your problem.
Screen Recording 2022-08-20 at 13.58.28.mov
Environment details
Additional context Try to reproduce the problem with
kitty --config NONE
if you cannot then post a minimal kitty.conf that reproduces the problem. If the problem involves interaction with some other terminal program post a minimal config for that program to reproduce the problem as well.