Open pravinxor opened 1 month ago
Hi there. Are you certain about this bit:
Please confirm this issue does not happen with the proprietary driver (of the same version). This issue tracker is only for bugs specific to the open kernel driver. [x] I confirm that this does not happen with the proprietary driver package.
I know it's easier to just tick that box than to report the bug to linux-bugs@nvidia.com or on the forums, but what you are effectively saying is that the bug is in the kernel modules (plausible) and that it is in the delta between Open and Proprietary. That delta in 555.xx is very very tiny, so I find it extremely unlikely. Please double-check, otherwise kernel engineers who monitor this tracker (which is for kernel module issues only) could waste time looking in the wrong place.
PS, it seems like in your testing you installed the old kernel module, but still kept the new userspace. This can cause all sorts of issues, so best get that fixed:
May 21 23:39:19 zephyrus kernel: NVRM: API mismatch: the client has the version 555.42.02, but
NVRM: this kernel module has the version 550.78. Please
NVRM: make sure that this kernel module and all NVIDIA driver
NVRM: components have the same version.
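For anyone checking their own logs for the same userspace/kernel-module mismatch, the message above can be matched mechanically. The following is only a minimal sketch: the function name `find_api_mismatch` and the `SAMPLE_LOG` constant are hypothetical names introduced here for illustration, and the pattern assumes the message is worded exactly as in the excerpt quoted above.

```python
import re

# Matches the NVRM "API mismatch" message as worded in the excerpt above.
# DOTALL lets ".*?" span the line break (and any per-line log prefix)
# between the client version and the kernel module version.
MISMATCH_RE = re.compile(
    r"the client has the version (\d+(?:\.\d+)+), but"
    r".*?this kernel module has the version (\d+(?:\.\d+)+)",
    re.DOTALL,
)

# The log excerpt quoted in this thread, used as sample input.
SAMPLE_LOG = """\
May 21 23:39:19 zephyrus kernel: NVRM: API mismatch: the client has the version 555.42.02, but
NVRM: this kernel module has the version 550.78. Please
NVRM: make sure that this kernel module and all NVIDIA driver
NVRM: components have the same version.
"""


def find_api_mismatch(log_text):
    """Return (client_version, module_version) for the first mismatch
    message found in log_text, or None if there is none."""
    match = MISMATCH_RE.search(log_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    result = find_api_mismatch(SAMPLE_LOG)
    if result is not None:
        client, module = result
        print(f"userspace: {client}, kernel module: {module}")
```

The text to scan could come from a saved kernel log or from the log inside an nvidia bug report archive; how it is obtained is left out of the sketch on purpose.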
Thanks for getting back to me, and sorry about the mismatch between the userspace and kernel drivers. I've sorted that out so that both are on the same version, but the error still occurs. As for whether this is specific to the open kernel modules, I can confirm that the proprietary driver does work correctly. I've attached two sets of log files (open and proprietary kernel modules). Each set includes an nvidia bug report log, as well as a report from Chromium. I'm happy to provide other information or do further debugging if you believe it could help.
about-gpu-open.txt about-gpu-proprietary.txt nvidia-bug-report-open.log.gz nvidia-bug-report-proprietary.log.gz
Thanks for double-checking. That is very surprising to me; I don't see anything in the logs suggesting any meaningful difference (except maybe some external monitor unplugging: were both tests run with the same monitors attached?).
We'll try to repro this internally. It's very concerning that there's a functional difference here. Thanks!
Between the two tests I most recently posted, the display configuration was exactly the same. However, between those two tests and the first test I posted, one of the attached displays was different. I don't believe this is a significant factor, though, since the issue occurs regardless of the display configuration.
I just wanted to update this thread with a small change that has happened between then and now. The log messages from EGL appear a little different.
about-gpu-2024-06-26T18-24-54-455Z.txt nvidia-bug-report.log.gz
NVIDIA Open GPU Kernel Modules Version
555.42.02
Please confirm this issue does not happen with the proprietary driver (of the same version). This issue tracker is only for bugs specific to the open kernel driver.
Operating System and Version
Arch Linux
Kernel Release
Linux 6.9.1-hardened1-1-hardened #1 SMP PREEMPT_DYNAMIC Mon, 20 May 2024 12:54:08 +0000 x86_64 GNU/Linux
Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.
Hardware: GPU
GPU 0: NVIDIA GeForce RTX 4060 Laptop GPU (UUID: GPU-57e1b957-4845-a325-50fb-12cb069295cd)
Describe the bug
When starting Chromium (or any Chromium-based program) with the --ozone-platform=wayland flag, the GPU process for Chromium cannot start, which makes hardware acceleration completely unavailable, even if the browser is not tasked with performing the hardware acceleration on the Nvidia GPU.
Relevant parts of the Chromium event log:
To Reproduce
Start Chromium with the --ozone-platform=wayland flag, so that Chromium is running as a native Wayland app (and not via Xwayland).
Note: hardware acceleration is active and performs correctly when Chromium is running via Xwayland.
Bug Incidence
Always
nvidia-bug-report.log.gz
More Info
No response