Closed frenchwr closed 1 month ago
At first glance it looks like the automated build may be failing due to an issue with our self-hosted runners or with GitHub itself. I will try re-running in the morning.
[ 22%] Performing download step (download, verify and extract) for 'node-api-headers-populate'
:: -- Downloading...
:: dst='/root/parts/npu-driver/build/third_party/vpux_plugin/build-cid/_deps/node-api-headers-subbuild/node-api-headers-populate-prefix/src/v1.1.0.tar.gz'
:: timeout='none'
:: inactivity timeout='none'
:: -- Using src='https://github.com/nodejs/node-api-headers/archive/refs/tags/v1.1.0.tar.gz'
:: CMake Error at node-api-headers-subbuild/node-api-headers-populate-prefix/src/node-api-headers-populate-stamp/download-node-api-headers-populate.cmake:170 (message):
:: Each download failed!
::
:: error: downloading 'https://github.com/nodejs/node-api-headers/archive/refs/tags/v1.1.0.tar.gz' failed
:: status_code: 22
:: status_string: "HTTP response code said error"
:: log:
:: --- LOG BEGIN ---
:: Host github.com:443 was resolved.
@frenchwr Awesome! I'm not sure about the technical feasibility here, but I'm wondering if there is a way to load the kernel module as part of the hook called upon the `intel-npu-fw` interface connection, as before.
Something like this in the snapcraft.yaml:
hooks:
  connect-plug-intel-npu-fw:
    plugs:
      - intel-npu-kmod
It would make sense to me to tie the kernel module load to this interface connection, and also to load the kernel module on actual use (instead of having a daemon that continuously tries and fails when users have not yet connected the appropriate interfaces, because they do not want to use the NPU even if the snap is installed on the system).
What do you think?
I originally pursued this path but there were problems:
- The custom firmware path in `/sys/module/firmware_class/parameters/path` does not survive reboots, and the hook-based mechanism does not trigger on reboot. Snap interface connections (even manual ones) persist on reboot, so a plug's `connect-plug*` hook does not run on reboot. I don't love the daemon solution - I don't think it's perfect - but it does solve the reboot issue as it's run as a systemd service on boot. A more permanent solution would be writing a new snapd interface that allows the snap to write the firmware binary blobs to a conventional path on the host (e.g. `/usr/lib/firmware/intel`).
- You need access to two interfaces from a hook that runs when only one of these interfaces connects - so which interface's `connect-plug*` hook should be used? Whichever you choose, you need to ensure that the other interface has connected first. If the snapstore team approves autoconnecting both interfaces, is there a mechanism for controlling the order in which they connect? As you can see, it gets messy.
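For readers following along, the daemon approach described above can be sketched roughly as follows. This is a minimal sketch and not the snap's actual script: the function names, the `CHECK_CMD` override, and the optional parameter-file argument are hypothetical, added so the logic can be exercised without root. Only the sysfs path and the `snapctl is-connected` command come from the kernel and snapd.

```shell
#!/bin/sh
# Sketch only: helper names and the CHECK_CMD override are illustrative.

# Write a custom firmware search directory into the firmware_class parameter.
# $1: directory containing the firmware tree
# $2: parameter file (defaults to the real sysfs location; override for testing)
set_firmware_search_path() {
    fw_dir=$1
    param_file=${2:-/sys/module/firmware_class/parameters/path}
    if [ ! -d "$fw_dir" ]; then
        echo "no such directory: $fw_dir" >&2
        return 1
    fi
    # No trailing newline, so the stored value is exactly the directory.
    printf '%s' "$fw_dir" > "$param_file"
}

# Succeed only if every named plug is connected.
# CHECK_CMD defaults to "snapctl is-connected", which exits 0 when connected.
all_plugs_connected() {
    for plug in "$@"; do
        ${CHECK_CMD:-snapctl is-connected} "$plug" || return 1
    done
    return 0
}
```

A service started at boot could call `all_plugs_connected intel-npu-fw intel-npu-kmod` (plug names taken from the snippet earlier in the thread) and only then call `set_firmware_search_path` and re-load the module. Checking both plugs in one place sidesteps the question of which `connect-plug*` hook fires first.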
I did know that the manual connections stay on reboot and hooks are not invoked again.
A more permanent solution would be writing a new snapd interface that allows the snap to write the firmware binary blobs to a conventional path on the host (e.g. `/usr/lib/firmware/intel`).
Did you also consider modifying the kernel boot command line (via GRUB) to pass the firmware path to the vpu module? This change would also persist across reboots.
For your second bullet, I see. Let me think about it, but you probably have a point.
Did you also consider modifying the kernel boot command line (via GRUB) to pass the firmware path to the vpu module? This change would also persist across reboots.
Is that possible? Does the `intel_vpu` kernel module accept the path as a param? I don't see anything like that in the source. If it is supported then I agree it's definitely worth consideration.
I believe so; I already commented about that in the first MR. Here is the source code: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/accel/ivpu/ivpu_fw.c?h=linux-6.9.y#n48
Awesome - now I remember. Let me think about it and test.
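One quick way to check whether a loaded module exposes a given parameter is to look under sysfs. The sketch below assumes the parameter is exported with non-zero permissions, so that it appears under `/sys/module/<module>/parameters/`. The helper name `module_param_value` and its optional sysfs-root argument are hypothetical and exist only to make the check testable.

```shell
#!/bin/sh
# Print the current value of a module parameter, or fail if it is not exposed.
# $1: module name (e.g. intel_vpu)
# $2: parameter name (e.g. firmware)
# $3: sysfs root (defaults to /sys; override for testing)
module_param_value() {
    param_file="${3:-/sys}/module/$1/parameters/$2"
    if [ ! -r "$param_file" ]; then
        echo "parameter $2 not exposed by module $1" >&2
        return 1
    fi
    cat "$param_file"
}
```

For a module that is not currently loaded, `modinfo -p intel_vpu` lists the declared parameters with their descriptions instead.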
I've given this some thought and researched a bit. My overall thought is that updating a system's kernel `cmdline` feels too obtrusive and risky. This snap is designed for laptop and desktop usage, so I think we should strongly prefer a design that limits changes to the whole system.
The custom firmware path at `/sys/module/firmware_class/parameters/path`, while less permanent, does offer good flexibility. If for some reason the binary blobs are not in the custom path, the kernel module will next search in the conventional system locations. It's not as clear to me (from the source code) how the kernel module would behave if we passed the path as a param but the firmware was not found on that path.
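The fallback behaviour mentioned above can be illustrated with a small sketch of the search order. It assumes the documented lookup order of the kernel's firmware loader: the custom path first, then `/lib/firmware/updates/<release>`, `/lib/firmware/updates`, `/lib/firmware/<release>`, and `/lib/firmware`. It ignores compressed variants and the user-space fallback. `firmware_candidates` is a hypothetical helper name, not something shipped by the snap.

```shell
#!/bin/sh
# List, in order, the file paths the firmware loader would try for a name.
# $1: firmware name as requested by the driver (relative to a search dir)
# $2: custom search dir, i.e. the firmware_class "path" parameter (may be empty)
# $3: kernel release (defaults to `uname -r`)
firmware_candidates() {
    name=$1
    custom=$2
    release=${3:-$(uname -r)}
    # An unset custom path is skipped entirely.
    if [ -n "$custom" ]; then
        printf '%s/%s\n' "$custom" "$name"
    fi
    # The format string is reused for each (directory, name) pair.
    printf '%s/%s\n' \
        "/lib/firmware/updates/$release" "$name" \
        "/lib/firmware/updates" "$name" \
        "/lib/firmware/$release" "$name" \
        "/lib/firmware" "$name"
}
```

Because the requested name is always appended to a search directory, the name is expected to be relative to one of these locations rather than a full filesystem path.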
There's also the issue of kernel upgrades. I guess we would need some mechanism for also updating the kernel boot command any time a new kernel is installed.
Gadget snaps do give you the ability to customize the kernel boot command, but these are intended for IoT devices. I do not see any other type of grub interface support within snapd.
I believe the kernel command line persists through kernel upgrades. However, I agree that making a system-wide configuration change is not ideal.
But I was wondering whether we could give the custom firmware path to the vpu driver directly (through a driver parameter) instead of using the `path` parameter of `firmware_class`.
That way we could remove the need for `kernel-firmware-control`, and the contents of the `load-npu-firmware` script would look like:
...
# Load NPU firmware from custom path
rmmod intel_vpu
modprobe intel_vpu ivpu_firmware=xxx
Of course, this approach depends on the vpu driver's ability to take a custom firmware path, and it will no longer work if the vpu driver removes support for this feature in the future.
I've been testing and just noticed from the `intel_vpu` source you referenced:
MODULE_PARM_DESC(firmware, "NPU firmware binary in /lib/firmware/..");
Note the `/lib/firmware` as the base location, as though that is required. I've attempted multiple variations of the path to the firmware shipped with the snap and nothing has worked yet, but I'll look again tomorrow.
[90402.526319] intel_vpu 0000:00:0b.0: Direct firmware load for /var/snap/intel-npu-driver/current/ failed with error -2
[90657.399847] intel_vpu 0000:00:0b.0: Direct firmware load for /var/snap/intel-npu-driver/current/intel/vpu/vpu_37xx_v0.0.bin failed with error -2
[90683.971593] intel_vpu 0000:00:0b.0: Direct firmware load for ../../var/snap/intel-npu-driver/current/intel/vpu/vpu_37xx_v0.0.bin failed with error -2
[90720.210621] intel_vpu 0000:00:0b.0: Direct firmware load for ../var/snap/intel-npu-driver/current/intel/vpu/vpu_37xx_v0.0.bin failed with error -2
modprobe intel_vpu ivpu_firmware=
@frenchwr I'm sorry for probably wasting your time for nothing.
I took a closer look at the source code and it seems that the `firmware` parameter of the `vpu` driver is not a path but only the firmware file name.
No worries! I think this is a good time for us to make sure we're considering all possibilities.
Internal Jira card: PEK-1255
This addresses issues raised in https://github.com/canonical/intel-npu-driver-snap/issues/2 by:
`intel_vpu` kernel module using the kernel-module-control snap interface
Testing
Test daemon updates path and re-loads kernel module
The daemon initially fails because the snap interfaces are not connected. Once connected, the daemon runs successfully and updates the search path. Ultimately this will (pending approval from the snapstore team) be automated with autoconnections.
Test search path updated following reboot
Note: there is a ~7 second period at boot where the device has the OS-provided firmware loaded instead of the one shipped by the snap. This is in the very early stages of the boot sequence so I think should be acceptable but would love to hear others' thoughts.
I also ran `intel-npu-driver.vpu-emd-test --config=basic.yaml` (as described in the README) and verified that the expected tests passed following a reboot.
Test OS-provided firmware loaded on snap removal
The last command shows the sequence of steps run by snapd when a snap is removed. An important detail is that the `remove` hook is the second step, and runs before the snap interfaces are disconnected. This is important because the `remove` hook needs permissions provided by these interfaces to unset the firmware path and to re-load the kernel module with the OS-provided firmware.