snapcore / nvidia-assemble


Version mismatch #7

Open mpena2099 opened 1 month ago

mpena2099 commented 1 month ago

Hello everyone.

What determines which version of the nvidia driver will be installed?

In my case, "nvidia-assemble" installs version 515, but "nvidia-core22" expects version 535, which means it doesn't work (Failed to initialize NVML: Driver/library version mismatch).

jocado commented 1 month ago

Hi @mpena2099

nvidia-assemble doesn't really install anything. It just assembles components that are provided in the pc-kernel snap [ plus a bit of other housekeeping ], so the pc-kernel snap is where the kernel parts of the nvidia driver come from. The userspace components come from the relevant nvidia-core* snap.

What channel and revision of the pc-kernel snap are you using? It sounds like it could be out of date. Also, please confirm what version of Ubuntu Core you are running.

mpena2099 commented 1 month ago

Got it, it makes perfect sense. I'm new to Ubuntu Core and I'm still trying to figure out some things. I installed the following version of Ubuntu Core (https://cdimage.ubuntu.com/ubuntu-core/22/stable/current/), that is, pc-kernel 22/stable, to do some tests and try to run an app of mine, packaged with Snap, that uses GPU.

Probably the latest available version of Ubuntu Core, 24, contains the correct version of the driver compatible with nvidia-core22.

I managed to get my application to work partially by getting an older version of nvidia-core22 ($ snap refresh nvidia-core22 --revision=14), but it seems that something is still not OK, probably because this revision does not have nvidia-smi.

But... OK, all this is not an issue in fact, just problems in the system configuration on my part. So I will close this issue. Thank you very much, @jocado !

jocado commented 1 month ago

No problem. Glad it's coming together @mpena2099 :)

Just a note that Ubuntu Core 24 does not really have nvidia support yet; indeed, there is no nvidia-core24 snap yet. I believe support is coming, but it may be slightly different [ perhaps the nvidia-core* snaps will not be needed or will be different, but I don't know for sure ], so I encourage you to stay with Ubuntu Core 22 for now, until more is known about what's coming.

pc-kernel 22/stable and nvidia-core22 should work together, but it looks like right now you need to use latest/beta for nvidia-core22.

I have the latest/stable pc-kernel snap. If you check the nvidia version:

# modinfo nvidia |grep -E "^version:"
version:        535.183.01

..and you need the matching version from the nvidia-core22 snap:

# snap info nvidia-core22
name:      nvidia-core22
summary:   NVIDIA and Mesa libraries for core22 snaps
publisher: Canonical✓
~~~ 8< ~~~
channels:
  latest/stable:    535.161.08+mesa23.2.1 2024-05-17 (40) 430MB -
  latest/candidate: ↑                                           
  latest/beta:      535.183.01+mesa23.2.1 2024-07-15 (42) 430MB -
  latest/edge:      ↑                                           

Hope that makes sense, and helps a bit further.
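
The comparison above can be scripted. The following is only a rough sketch: the function names `kernel_driver_version` and `matching_channels` are made up for illustration, and it assumes the `modinfo` and `snap info` output looks like the listings in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read `modinfo nvidia` output on stdin and print
# the value of the "version:" field.
kernel_driver_version() {
    awk '$1 == "version:" { print $2; exit }'
}

# Hypothetical helper: read `snap info nvidia-core22` output on stdin and
# print every channel whose version starts with the driver version in $1
# (the snap version has the form "<driver>+mesa<version>").
matching_channels() {
    awk -v want="$1" '
        $1 ~ /\/(stable|candidate|beta|edge):$/ && index($2, want "+") == 1 {
            sub(/:$/, "", $1)
            print $1
        }
    '
}

# Usage on a real system:
#   want=$(modinfo nvidia | kernel_driver_version)
#   snap info nvidia-core22 | matching_channels "$want"
```

Channels shown with "↑" are not printed, because they carry no version of their own in the listing; they follow the channel above them.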

jocado commented 1 month ago

@mpena2099 I think you could still argue there is a bug to be filed here, as the correct version should be bumped to stable in sync with the pc-kernel snap release to stable. There is always a slight delay, but the kernel was bumped at least a week ago.

I can't help fix that, however; the snap release management is up to Canonical.

mpena2099 commented 1 month ago

OK, so installing Ubuntu Core 24 doesn't seem to be the solution to my problem. In any case, my system at the moment:

$ cat /etc/os-release 
NAME="Ubuntu Core"
VERSION="22"
ID=ubuntu-core
PRETTY_NAME="Ubuntu Core 22"
VERSION_ID="22"
HOME_URL="https://snapcraft.io/"
BUG_REPORT_URL="https://bugs.launchpad.net/snappy/"
$ modinfo nvidia |grep -E "^version:"
version:        515.105.01
$ snap info nvidia-core22
name:      nvidia-core22
summary:   NVIDIA and Mesa libraries for core22 snaps
publisher: Canonical✓
store-url: https://snapcraft.io/nvidia-core22
license:   Proprietary
description: |
  Content snap that provides NVIDIA and Mesa libraries for `base:core22` snaps.
commands:
  - nvidia-core22.smi
snap-id:      4kq2HqQwTNu0dg7j16zDGwve5YY6pUy9
tracking:     latest/beta
refresh-date: today at 17:10 UTC
channels:
  latest/stable:    535.161.08+mesa23.2.1 2024-05-17 (40) 430MB -
  latest/candidate: ↑                                           
  latest/beta:      535.183.01+mesa23.2.1 2024-07-15 (42) 430MB -
  latest/edge:      ↑                                           
installed:          535.183.01+mesa23.2.1            (42) 430MB -
$ snap info pc-kernel 
name:      pc-kernel
summary:   generic linux kernel
publisher: Canonical✓
store-url: https://snapcraft.io/pc-kernel
contact:   snaps@canonical.com
license:   unset
description: |
  The generic Ubuntu kernel package as a snap
type:         kernel
snap-id:      pYVQrBcKmBa0mZ4CCN7ExT6jH8rY1hza
tracking:     22/stable
refresh-date: today at 19:19 UTC
channels:
  (...)
installed:          5.15.0-76.83.1                 (1321) 318MB kernel

jocado commented 1 month ago

@mpena2099 As I mentioned, UC 24 won't work right now, so it is not worth pursuing for nvidia use cases yet, but it should be in the future.

From the above info, it looks like your kernel snap is out of date. The current stable version and revision are:

  22/stable:        5.15.0-116.126.1    2024-07-17 (1926) 351MB -

So if you just refresh the kernel snap, then it should be ok.

There is still the issue that you have to use the beta channel for the nvidia-core22 snap - but that's it.
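
A quick way to spot an outdated kernel snap is to compare the installed revision with the revision published in the tracked channel. This is a minimal sketch that assumes the `snap info` line format shown in this thread; `snap_info_revision` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `snap info <snap>` output on stdin and print
# the revision (the number in parentheses) from the line whose first
# field equals $1, e.g. "installed:" or "22/stable:".
snap_info_revision() {
    awk -v key="$1" '$1 == key {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^\([0-9]+\)$/) {
                gsub(/[()]/, "", $i)
                print $i
                exit
            }
    }'
}

# Usage on a real system:
#   info=$(snap info pc-kernel)
#   inst=$(printf '%s\n' "$info" | snap_info_revision installed:)
#   chan=$(printf '%s\n' "$info" | snap_info_revision 22/stable:)
#   [ "$inst" = "$chan" ] || echo "pc-kernel $inst differs from 22/stable $chan"
```

`snap refresh --list` also reports snaps that have a pending update, without needing any parsing.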

mpena2099 commented 1 month ago

$ sudo snap refresh pc-kernel
error: cannot perform the following tasks:
- Update assets from kernel "pc-kernel" (1926) (could not map volume pc from gadget.yaml to any physical disk: cannot find physical disk laid out to map with volume pc)

:sweat:

jocado commented 1 month ago

Is there any chance you have resized or otherwise changed the partition layout on the disk after booting the image for the first time? snapd expects to manage the whole disk, with nothing else present.
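
One rough way to look for extra partitions is to list the filesystem labels on the disk and flag anything unfamiliar. This sketch assumes the Ubuntu Core partitions carry the labels ubuntu-seed, ubuntu-boot, ubuntu-save and ubuntu-data; the helper name `unexpected_labels` and the device path in the usage comment are hypothetical. It would not catch a partition that was only resized.

```shell
#!/bin/sh
# Hypothetical helper: read one filesystem label per line on stdin and
# print every non-empty label that is not in the assumed Ubuntu Core set
# (ubuntu-seed, ubuntu-boot, ubuntu-save, ubuntu-data).
unexpected_labels() {
    awk 'NF && $0 !~ /^ubuntu-(seed|boot|save|data)$/'
}

# Usage on a real system (replace /dev/sda with the actual disk):
#   lsblk -rno LABEL /dev/sda | unexpected_labels
```

Any label it prints is worth a closer look before retrying the kernel refresh.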

mpena2099 commented 1 month ago

Your guess is absolutely right, @jocado.

Initially, for my migration tests from Ubuntu Desktop to Ubuntu Core, I intended to keep both operating systems on the same PC. First I installed Core, then I resized the partition and created a new one to install Desktop. It didn't work because I couldn't use Grub with Core, so I abandoned the idea, but the partition is still there.

Later I'll delete this partition and go back to how it was before and, if that's not enough, reinstall Ubuntu Core. Thanks again.

mpena2099 commented 1 month ago

Ok, it seems to have worked now. What I did was:

  1. Removed the partition I had created and resized the existing Ubuntu Core partition.

  2. Updated the "pc-kernel": $ sudo snap refresh pc-kernel

  3. Removed "nvidia-assemble" and "nvidia-core22", and rebooted the system.

  4. Reinstalled "nvidia-assemble" and "nvidia-core22 (latest/beta)":

     $ snap install nvidia-assemble --channel 22/stable
     $ sudo snap refresh nvidia-core22 --channel latest/beta

Now, "$ nvidia-core22.smi" is working! \o/