pop-os / nvidia-graphics-drivers

Pop!_OS NVIDIA Graphics Drivers

Cannot login to system after upgrading Nvidia drivers to 510.54 #137

Open dlsniper opened 2 years ago

dlsniper commented 2 years ago

After upgrading to the latest NVIDIA drivers, I can no longer see the login prompt. I had to revert to the nouveau driver to get the system working again.

$ uname -a
Linux monzi10 5.15.23-76051523-generic #202202110435~1644952300~21.10~96763f1 SMP Tue Feb 15 19:52:40 U x86_64 x86_64 x86_64 GNU/Linux

$ cat /etc/os-release
NAME="Pop!_OS"
VERSION="21.10"
ID=pop
ID_LIKE="ubuntu debian"
PRETTY_NAME="Pop!_OS 21.10"
VERSION_ID="21.10"
HOME_URL="https://pop.system76.com"
SUPPORT_URL="https://support.system76.com"
BUG_REPORT_URL="https://github.com/pop-os/pop/issues"
PRIVACY_POLICY_URL="https://system76.com/privacy"
VERSION_CODENAME=impish
UBUNTU_CODENAME=impish
LOGO=distributor-logo-pop-os

$ lspci -v
01:00.0 VGA compatible controller: NVIDIA Corporation GP104BM [GeForce GTX 1070 Mobile] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: Micro-Star International Co., Ltd. [MSI] GP104BM [GeForce GTX 1070 Mobile]
        Flags: bus master, fast devsel, latency 0, IRQ 147
        Memory at de000000 (32-bit, non-prefetchable) [size=16M]
        Memory at c0000000 (64-bit, prefetchable) [size=256M]
        Memory at d0000000 (64-bit, prefetchable) [size=32M]
        I/O ports at e000 [size=128]
        Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Capabilities: [100] Virtual Channel
        Capabilities: [250] Latency Tolerance Reporting
        Capabilities: [258] L1 PM Substates
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [420] Advanced Error Reporting
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Capabilities: [900] Secondary PCI Express
        Kernel driver in use: nouveau
        Kernel modules: nvidiafb, nouveau
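
For anyone checking the same thing on their own machine, here is a minimal sketch that pulls the active kernel driver out of output like the above. The helper name `driver_in_use` is made up for illustration, and it only reads text from standard input, so it assumes the `Kernel driver in use:` line looks like the one shown here.

```shell
# Print the value of every "Kernel driver in use:" line found on stdin.
# Hypothetical helper; expects lspci -k / lspci -v style output.
driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use:[[:space:]]*//p'
}

# Typical use (needs pciutils; 01:00.0 is the slot from the report above):
#   lspci -k -s 01:00.0 | driver_in_use
```

If this prints `nouveau`, the proprietary module is not the one bound to the GPU.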

Trying to install nvidia-driver-470, the previous driver, results in the system trying to install nvidia-driver-510 instead. Please let me know if you need any other details.

leviport commented 2 years ago

470 will be installable again soon

dlsniper commented 2 years ago

Thanks for the update. I think not everything has been reverted yet(?), as I can still see some 510 packages coming in after holding 470. Here's what a fresh install on the system gives me after running apt-mark hold nvidia-driver-470:

The following NEW packages will be installed:
  libnvidia-common-510 linux-headers-5.15.23-76051523 linux-headers-5.15.23-76051523-generic
  linux-image-5.15.23-76051523-generic linux-modules-5.15.23-76051523-generic nvidia-kernel-common-510
The following packages have been kept back:
  libnvidia-cfg1-470 libnvidia-compute-470 libnvidia-compute-470:i386 libnvidia-decode-470
  libnvidia-decode-470:i386 libnvidia-encode-470 libnvidia-encode-470:i386 libnvidia-extra-470
  libnvidia-fbc1-470 libnvidia-fbc1-470:i386 libnvidia-gl-470 libnvidia-gl-470:i386 libnvidia-ifr1-470
  libnvidia-ifr1-470:i386 nvidia-compute-utils-470 nvidia-dkms-470 nvidia-driver-470 nvidia-kernel-source-470
  nvidia-utils-470 xserver-xorg-video-nvidia-470

leviport commented 2 years ago

They were just released and should be available within an hour or so.
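
As a side note on the `apt-mark hold nvidia-driver-470` step mentioned earlier: that command holds only the one package named. A rough sketch of holding every installed package from the same series follows. The function name `packages_for_series` is hypothetical, and this is not confirmed to stop the 510 common packages shown above from being pulled in.

```shell
# Keep only package names on stdin that belong to one driver series,
# e.g. "nvidia-driver-470" or "libnvidia-gl-470:i386".
# Hypothetical helper; $1 is the series number.
packages_for_series() {
    series="$1"
    grep -E -- "-${series}(:[a-z0-9]+)?\$"
}

# Possible use (needs root; xargs -r is the GNU option to skip empty input):
#   dpkg-query -W -f='${binary:Package}\n' \
#     | packages_for_series 470 \
#     | xargs -r sudo apt-mark hold
#   apt-mark showhold    # list what is currently held
```

Holds can be removed later with `apt-mark unhold` once a working driver is available.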

ashim-mahara commented 2 years ago

I ran a dist-upgrade after the revert and now I don't see the login screen. The screen flickers, and I can't use Ctrl+Alt+F2 to switch to another terminal: it vanishes and the screen goes back to flickering. I should have been more careful; the Pop upgrades broke my system. I'll probably chroot into it later.

dlsniper commented 2 years ago

@leviport I can confirm that it's all back to normal now, using 470.86:

$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.86       Driver Version: 470.86       CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
| N/A   58C    P0    38W /  N/A |   2976MiB /  8116MiB |     14%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2616      G   /usr/lib/xorg/Xorg               2130MiB |
|    0   N/A  N/A      2927      G   /usr/bin/gnome-shell              113MiB |
|    0   N/A  N/A      7784      G   ...646390034963820860,131072      284MiB |
|    0   N/A  N/A      9204      G   /usr/lib/firefox/firefox          220MiB |
|    0   N/A  N/A      9588      G   /usr/lib/firefox/firefox            1MiB |
|    0   N/A  N/A      9590      G   /usr/lib/firefox/firefox            1MiB |
+-----------------------------------------------------------------------------+
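
For scripted checks, the driver version can be read from the banner line of output like the above. This is a small sketch with a hypothetical helper name, `driver_version_from_banner`, and it assumes the banner keeps the `Driver Version: X.Y` wording shown here.

```shell
# Extract the number after "Driver Version:" from nvidia-smi banner text on stdin.
# Hypothetical helper; prints nothing if no such line is present.
driver_version_from_banner() {
    sed -n 's/.*Driver Version:[[:space:]]*\([0-9.]*\).*/\1/p'
}

# Possible use:
#   nvidia-smi | driver_version_from_banner
# nvidia-smi can also print the value directly:
#   nvidia-smi --query-gpu=driver_version --format=csv,noheader
```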

Let me know if I can help debug what's happening with 510 and why. From what I can see on NVIDIA's website, 510.54 should still be compatible with my GPU.

I should mention that deja-vu helped me back up and restore the system from that weird state after reinstalling Pop!_OS. Maybe it's worth including it out of the box?

Feel free to close the issue if you don't think it's worth pursuing. Thanks!

leviport commented 2 years ago

What GPU do you have?

dlsniper commented 2 years ago

I have a 1070M; see my original report for more details.

NVIDIA Corporation GP104BM [GeForce GTX 1070 Mobile] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: Micro-Star International Co., Ltd. [MSI] GP104BM [GeForce GTX 1070 Mobile]

johnnynunez commented 2 years ago

The problem is the NVIDIA drivers. This version works with kernel 5.16.10+ and 5.17: https://www.nvidia.es/Download/driverResults.aspx/187174/es

510.60.02