Open morrownr opened 1 year ago
@p7x404
I have recently seen 3 other reports with other drivers that involve the nvidia driver. I don't have an nvidia card but don't need one to install the driver and test this problem so if you could kindly give me a url to where I can download the nvidia driver, I would appreciate it.
Also, can you give me a link to the rtl8852bu, 1.15.11 driver that you had installed?
I have got to see what is going on and rework the code to account for this problem. It seems to me that something the nvidia driver installation is doing is interfering with what I am doing.
Thanks
The 8852bu driver is the one from brostrend (installed by following the procedure here : https://linux.brostrend.com/)
The nvidia driver is installed using a regular deb from the ubuntu repository. I should be able to provide you with more details after having a look on my machine...
The machine where I use your driver is running Ubuntu 20.04 LTS, with hwe kernel.
@p7x404
The machine where I use your driver is running Ubuntu 20.04 LTS, with hwe kernel.
Which hwe kernel?
$ uname -r
The nvidia driver is installed using a regular deb from the ubuntu repository.
Yes, please provide more details when able. I looked around using synaptic and there seem to be hundreds of nvidia packages. I'd prefer to have the exact deb you are using.
Something is wrong, and I appreciate your help.
Kernel: 5.15.0-78-generic
The base nvidia package is nvidia-driver-535:
nvidia-driver-535 535.86.05-0ubuntu0.20.04.2 amd64 NVIDIA driver metapackage
It is downloaded from the ubuntu/focal-updates / ubuntu/focal-security repositories (in restricted).
There are several dependencies (notably the nvidia-dkms-535 package). Here is the complete list of this little nightmare...
ii libnvidia-cfg1-535:amd64 535.86.05-0ubuntu0.20.04.2 amd64 NVIDIA binary OpenGL/GLX configuration library
ii libnvidia-common-535 535.86.05-0ubuntu0.20.04.2 all Shared files used by the NVIDIA libraries
ii libnvidia-compute-535:amd64 535.86.05-0ubuntu0.20.04.2 amd64 NVIDIA libcompute package
ii libnvidia-decode-535:amd64 535.86.05-0ubuntu0.20.04.2 amd64 NVIDIA Video Decoding runtime libraries
ii libnvidia-encode-535:amd64 535.86.05-0ubuntu0.20.04.2 amd64 NVENC Video Encoding runtime library
ii libnvidia-extra-535:amd64 535.86.05-0ubuntu0.20.04.2 amd64 Extra libraries for the NVIDIA driver
ii libnvidia-fbc1-535:amd64 535.86.05-0ubuntu0.20.04.2 amd64 NVIDIA OpenGL-based Framebuffer Capture runtime library
ii libnvidia-gl-535:amd64 535.86.05-0ubuntu0.20.04.2 amd64 NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii linux-objects-nvidia-535-5.15.0-78-generic 5.15.0-78.85~20.04.1+1 amd64 Linux kernel nvidia modules for version 5.15.0-78 (objects)
ii linux-signatures-nvidia-5.15.0-78-generic 5.15.0-78.85~20.04.1+1 amd64 Linux kernel signatures for nvidia modules for version 5.15.0-78-generic
ii nvidia-compute-utils-535 535.86.05-0ubuntu0.20.04.2 amd64 NVIDIA compute utilities
ii nvidia-dkms-535 535.86.05-0ubuntu0.20.04.2 amd64 NVIDIA DKMS package
ii nvidia-driver-535 535.86.05-0ubuntu0.20.04.2 amd64 NVIDIA driver metapackage
ii nvidia-firmware-535-535.86.05 535.86.05-0ubuntu0.20.04.2 amd64 Firmware files used by the kernel module
ii nvidia-kernel-common-535 535.86.05-0ubuntu0.20.04.2 amd64 Shared files used with the kernel module
ii nvidia-kernel-source-535 535.86.05-0ubuntu0.20.04.2 amd64 NVIDIA kernel source package
ii nvidia-prime 0.8.16~0.20.04.2 all Tools to enable NVIDIA's Prime
ii nvidia-settings 470.57.01-0ubuntu0.20.04.3 amd64 Tool for configuring the NVIDIA graphics driver
ii nvidia-utils-535 535.86.05-0ubuntu0.20.04.2 amd64 NVIDIA driver support binaries
ii screen-resolution-extra 0.18build1 all Extension for the nvidia-settings control panel
ii xserver-xorg-video-nvidia-535 535.86.05-0ubuntu0.20.04.2 amd64 NVIDIA binary Xorg driver
Okay. Got it.
nvidia-driver-535 (metapackage)
Now I have all of the possible guilty parties. It is a matter of finding the time to work this. Some projects take a lot of time and this may be one but it needs to be done.
I think I have a way to speed this up. I installed the nvidia driver and will now just continue on with the projects on my to-do list and will see if the problem makes itself obvious.
Test results after installing nvidia-driver-535 ...
~$ dkms status
nvidia/535.86.05, 6.5.0-060500rc5-generic, x86_64: installed
rtl8852bu/1.19.3, 6.5.0-060500rc5-generic, x86_64: installed
$ sudo sh install-driver.sh
[sudo] password for morrow:
: ---------------------------
: install-driver.sh v20230718
: x86_64 (system architecture)
: x86_64 (gcc architecture)
: 4/4 (in-use/total processing units)
: 12199716 (total system memory)
: 6.5.0-060500rc5-generic (kernel version)
: gcc (Ubuntu 13.1.0-2ubuntu2~23.04) 13.1.0
: dkms-3.0.10
: SecureBoot disabled
: ---------------------------
Checking for previously installed drivers.
Module rtl8852bu-1.19.3 for kernel 6.5.0-060500rc5-generic (x86_64).
Before uninstall, this module version was ACTIVE on this kernel.
8852bu.ko.zst:
Starting installation.
Installing 8852bu.conf to /etc/modprobe.d
The dkms installation routines are in use.
Copying source files to /usr/src/rtl8852bu-1.19.3
Creating symlink /var/lib/dkms/rtl8852bu/1.19.3/source -> /usr/src/rtl8852bu-1.19.3
The driver was added to dkms successfully.
: ---------------------------
Sign command: /usr/bin/kmodsign
Signing key: /var/lib/shim-signed/mok/MOK.priv
Public certificate (MOK): /var/lib/shim-signed/mok/MOK.der
Building module:
Cleaning build area...
./dkms-make.sh.....................................
Signing module /var/lib/dkms/rtl8852bu/1.19.3/build/8852bu.ko
Cleaning build area...
Compile time: 370.83 seconds
The driver was built by dkms successfully.
: ---------------------------
8852bu.ko.zst: Running module version sanity check.
Info: Update this driver with the following commands as needed:
$ git pull
$ sudo sh install-driver.sh
Note: Updates to this driver SHOULD be performed before distro upgrades such as Ubuntu 23.10 to 24.04.
Note: Updates can be performed as often as you like. It is recommended to update at least every 2 months.
Note: Work on this driver, like the Linux kernel, is continuous.
Enjoy!
Do you want to edit the driver options file now? (recommended) [Y/n]
Do you want to apply the new options by rebooting now? (recommended) [Y/n]
Operating as it should. Need to step back and think...
Looking back at the original bug report showing the errors:
Checking for previously installed drivers.
Error! Could not locate dkms.conf file.
File: /var/lib/dkms/nvidia/430.64/source/dkms.conf does not exist.
Why would my script care about the nvidia dkms.conf not existing? It has to be dkms posting the errors, so what is telling dkms to use the nvidia dkms.conf? Why?
$ sudo dkms status -c dkms.conf
nvidia, 430.64: added
nvidia, 535.86.05: added
rtl8852bu, 1.15.11, 5.15.0-76-generic, x86_64: installed
rtl8852bu, 1.19.3, 5.15.0-78-generic, x86_64: installed
Error! Could not locate dkms.conf file.
File: /var/lib/dkms/rtl8852bu/1.19.3/source/dkms.conf does not exist.
Why would the nvidia drivers only be added and not installed? Why are there two nvidia drivers added? Why does dkms think the dkms.conf for rtl8852bu does not exist?
It appears there is an external cause of the missing dkms.conf files? The rtl8852bu dkms.conf file is not missing, so what makes it appear to be missing? Is there a variable wrongly set somewhere?
Many questions, few answers.
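One thing that might help narrow this down: the install log above shows that /var/lib/dkms/<module>/<version>/source is a symlink into /usr/src, so if the target directory is gone, dkms has no dkms.conf to read for that entry. Below is a minimal sketch that lists such entries. It assumes the default tree location /var/lib/dkms, and find_missing_dkms_conf is a hypothetical helper name, not part of dkms or of the driver scripts.

```shell
#!/bin/sh
# Hypothetical helper (not part of dkms): print "module/version" for every
# entry in a dkms tree whose source/dkms.conf cannot be found.
# $1: root of the dkms tree (assumed default: /var/lib/dkms).
find_missing_dkms_conf() {
    tree="${1:-/var/lib/dkms}"
    for verdir in "$tree"/*/*; do
        # Keep only real version directories; skip plain files and the
        # kernel-* symlinks that dkms places next to them.
        [ -d "$verdir" ] && [ ! -L "$verdir" ] || continue
        # A dangling "source" symlink fails this test as well.
        if [ ! -f "$verdir/source/dkms.conf" ]; then
            module="${verdir%/*}"
            module="${module##*/}"
            version="${verdir##*/}"
            echo "$module/$version"
        fi
    done
}
```

Running `find_missing_dkms_conf /var/lib/dkms` on an affected machine should print something like nvidia/430.64 if that entry's source directory was removed outside of dkms.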
@p7x404 @alkisg
If either of you comes up with an idea of what to look for, please let me know. What happened to cause the original bug report should not happen, but it happened. It may not exist anymore because of the cleanup.
Maybe look around with the following? (please post the contents)
$ printenv
$ export -p
Could the previously installed rtl8852bu 1.15.* driver have done something that is causing this? I have not installed it yet as I need a deb of it.
Is this current issue reproducible, or did it just happen once and now it's gone?
This command shows if any .deb packages are still installed:
dpkg -l | grep 'rtl8.*-dkms'
If there are, you can uninstall them with e.g.:
apt purge rtl8852bu-dkms
But in general, I've only seen the "missing dkms.conf" messages when users manually removed their whole directories, instead of using packages or installers...
OK, I dug a bit on my side.
The problem resides in the remove-driver.sh script. I fixed it by forcing which dkms.conf is used (the one coming from the repository) by adding the -c option:
if command -v dkms >/dev/null 2>&1; then
	echo "Removing a dkms installation."
	# split each status line into module name and version; ignore the rest
	dkms status -c dkms.conf | while IFS=" ,:/" read -r modname modver _dummy; do
		case "$modname" in *${MODULE_NAME})
			dkms remove -m "$modname" -v "$modver" -c dkms.conf --all
			;;
		esac
	done
fi
With this modification, the dkms driver is uninstalled and I can successfully launch install-driver afterwards.
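For anyone reading along, the parsing step in that loop can be tried in isolation without touching dkms. This is only a sketch: list_dkms_versions is a hypothetical helper name, and the sample lines are the status lines posted earlier in this thread. Setting IFS to " ,:/" makes read split both the older "module, version, ..." format and the newer "module/version, ..." format into the same first two fields.

```shell
#!/bin/sh
# Hypothetical helper: read `dkms status` text on stdin and print
# "module version" for every module whose name ends with the suffix in $1.
list_dkms_versions() {
    suffix="$1"
    # Space, comma, colon and slash all act as field separators, so the
    # first two fields are always the module name and the module version.
    while IFS=" ,:/" read -r modname modver _rest; do
        case "$modname" in
            *"$suffix") echo "$modname $modver" ;;
        esac
    done
}

# Example with status lines copied from this thread (no dkms needed):
printf '%s\n' \
    'nvidia, 430.64: added' \
    'rtl8852bu, 1.15.11, 5.15.0-76-generic, x86_64: installed' \
    'rtl8852bu/1.19.3, 6.5.0-060500rc5-generic, x86_64: installed' |
    list_dkms_versions 8852bu
```

The nvidia line is skipped because its module name does not end in 8852bu, which is the same filtering the case statement does in remove-driver.sh.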
This is probably a quick and dirty fix, but it works. The error with the old nvidia driver is probably like alkisg said a package that didn't uninstall correctly (but I didn't do it manually, that's for sure, been using linux long enough to know that messing with files managed by packages is a big no-no :-) ).
Is this current issue reproducible, or did it just happen once and now it's gone?
I haven't been able to reproduce it. I'm going to take a slow stroll down the path of relooking at install-driver.sh line by line as I have time.
But in general, I've only seen the "missing dkms.conf" messages when users manually removed their whole directories, instead of using packages or installers...
Same here. I have had a hard time fully wrapping my mind around what happened in this case in that we saw the nvidia driver and my driver both complaining.
With this modification...
You have given me an idea to harden the script a little. Thanks.
Since we don't seem to have any emergencies, I will take my time and step through the scripts to see if there is anything that might have triggered this.
Appreciate the help.
@p7x404
I think I just merged the final part of the fix for this problem.
@morrownr :
Thanks, I successfully tested it.
Take care
@p7x404 said: I had to manually remove the driver from DKMS to be able to install the update.
$ sudo dkms remove -c dkms.conf -m rtl8852bu/1.19.3 --all