NixOS / nixpkgs

Package request: linux-npu-driver #348739

Open vunnyso opened 1 month ago

vunnyso commented 1 month ago

Project description

Intel® NPU (Neural Processing Unit) Driver

ferrine commented 2 weeks ago

Hey there, I managed to make it work on a fresh Intel Core Ultra 7 155H machine (ThinkPad X1 Gen 12).

Here is the package. It still uses some vendored static dependencies; the others are linked against the ones from nixpkgs.

{ pkgs, ... }:
let
  # Upstream maintainers already seem to be working on the level-zero update;
  # this is the patch that makes the driver compile in the meantime.
  update-npu-ext-patch = pkgs.fetchpatch {
    name = "update-npu-ext-patch";
    url = "https://github.com/intel/level-zero-npu-extensions/commit/110f48ee8eda22d8b40daeeecdbbed0fc3b08f8b.patch";
    hash = "sha256-Wx1Qy3ZSN37pFq4hOeiVthVXn9TTkJXwEEU9gqTz1qo=";
    stripLen = 1;
    extraPrefix = "third_party/level-zero-npu-extensions/";
  };
in
pkgs.stdenv.mkDerivation {
  pname = "linux-npu-driver";
  version = "1.10.0";

  src = pkgs.fetchFromGitHub {
    fetchSubmodules = true;
    fetchLFS = true;
    owner = "intel";
    repo = "linux-npu-driver";
    rev = "v1.10.0";
    hash = "sha256-/WVj7k6v52Kp1mNU8n2mrql27fo9jVoEYja3zBowITk=";
  };
  outputs = [ "out" "firmware" ];
  patches = [ update-npu-ext-patch ];
  postPatch = ''
    rm -rf third_party/level-zero
    rm third_party/cmake/level-zero.cmake
    rm third_party/cmake/FindLevelZero.cmake
    substituteInPlace third_party/CMakeLists.txt \
      --replace-fail "include(cmake/level-zero.cmake)" ""
    substituteInPlace firmware/CMakeLists.txt \
      --replace-fail "DESTINATION /lib/firmware/updates/intel/vpu/" \
      "DESTINATION $firmware/lib/firmware/intel/vpu/"
    substituteInPlace third_party/level-zero-npu-extensions/ze_graph_ext.h \
      --replace-fail "#include \"ze_api.h\"" "#include <level_zero/ze_api.h>"
    substituteInPlace umd/level_zero_driver/core/source/cmdlist/cmdlist.cpp \
      --replace-fail "ZE_STRUCTURE_TYPE_MUTABLE_GRAPH_ARGUMENT_EXP_DESC" "ZE_STRUCTURE_TYPE_MUTABLE_GRAPH_ARGUMENT_EXP_DESC_DEPRECATED"
  '';
  nativeBuildInputs = [ pkgs.cmake ];
  buildInputs = with pkgs; [
    udev
    boost
    openssl
    level-zero
  ];
  # Optionally provide a meta section for metadata
  meta = {
    description = "Intel® NPU (Neural Processing Unit) Driver";
    homepage = "https://github.com/intel/linux-npu-driver";
    license = pkgs.lib.licenses.mit;
  };
}

And of course, the module to add to the NixOS configuration:

  boot.extraModulePackages = [
    intel-npu-driver
  ];
  hardware.firmware = [
    intel-npu-driver.firmware
  ];

dmesg then reads:

[  +0.000032] intel_vpu 0000:00:0b.0: enabling device (0000 -> 0002)
[  +0.001685] intel_vpu 0000:00:0b.0: [drm] Firmware: intel/vpu/vpu_37xx_v0.0.bin, version: 20241025*MTL_CLIENT_SILICON-release*1830*ci_tag_ud202444_vpu_rc_20241025_1830*ae072b315bc
[  +0.014404] [drm] Initialized intel_vpu 1.0.0 for 0000:00:0b.0 on minor 0

After applying the patches I ran the tests from the repo. All tests pass except for a few:

/build/source/validation/umd-test/test_graph.cpp:70: Failure
Expected: (graph) != (nullptr), actual: (nullptr) vs (nullptr)

[  FAILED  ] GraphApi.GetNativeBinaryUsingMemcpy (0 ms)
[ RUN      ] GraphApi.GetNativeBinaryWithoutMemcpy
NPU_LOG: [DEVICE][vpu_device_context.cpp:34] VPUDeviceContext is created
NPU_LOG: *WARNING* [graph.cpp:291] Failed to get compiler properties!
NPU_LOG: [GRAPH][graph.cpp:394] ze_graph_desc_2_t = format: 0x2, pInput: 0x1285edb0, inputSize: 1808, flags: 0, pBuildFlags: --inputs_precisions="A:fp16 B:fp16 C:fp16" --inputs_layouts="A:C B:C C:C" --outputs_precisions="Y:fp16" --outputs_layouts="Y:C"
NPU_LOG: [CACHE][disk_cache.cpp:121] Cache missed using 9558e4b36b70c7c6a28ca2c6df5da254fb5bd6b0 key
NPU_LOG: *ERROR* [graph.cpp:418] Failed to get compiled blob!
/build/source/validation/umd-test/test_graph.cpp:85: Failure
Expected: (graph) != (nullptr), actual: (nullptr) vs (nullptr)

[  FAILED  ] GraphApi.GetNativeBinaryWithoutMemcpy (0 ms)
[ RUN      ] GraphApi.AppendGraphInitAndExecuteReturnsCorrectError
NPU_LOG: [DEVICE][vpu_device_context.cpp:34] VPUDeviceContext is created
NPU_LOG: *WARNING* [graph.cpp:291] Failed to get compiler properties!
NPU_LOG: [GRAPH][graph.cpp:394] ze_graph_desc_2_t = format: 0x2, pInput: 0x1285edb0, inputSize: 1808, flags: 0, pBuildFlags: --inputs_precisions="A:fp16 B:fp16 C:fp16" --inputs_layouts="A:C B:C C:C" --outputs_precisions="Y:fp16" --outputs_layouts="Y:C"
NPU_LOG: [CACHE][disk_cache.cpp:121] Cache missed using 9558e4b36b70c7c6a28ca2c6df5da254fb5bd6b0 key
NPU_LOG: *ERROR* [graph.cpp:418] Failed to get compiled blob!
/build/source/validation/umd-test/test_graph.cpp:97: Failure
Expected: (graph) != (nullptr), actual: (nullptr) vs (nullptr)

[  FAILED  ] GraphApi.AppendGraphInitAndExecuteReturnsCorrectError (0 ms)
[ RUN      ] GraphApi.SetArgumentPropertiesReturnsCorrectError
NPU_LOG: [DEVICE][vpu_device_context.cpp:34] VPUDeviceContext is created
NPU_LOG: *WARNING* [graph.cpp:291] Failed to get compiler properties!
NPU_LOG: [GRAPH][graph.cpp:394] ze_graph_desc_2_t = format: 0x2, pInput: 0x1285edb0, inputSize: 1808, flags: 0, pBuildFlags: --inputs_precisions="A:fp16 B:fp16 C:fp16" --inputs_layouts="A:C B:C C:C" --outputs_precisions="Y:fp16" --outputs_layouts="Y:C"
NPU_LOG: [CACHE][disk_cache.cpp:121] Cache missed using 9558e4b36b70c7c6a28ca2c6df5da254fb5bd6b0 key
NPU_LOG: *ERROR* [graph.cpp:418] Failed to get compiled blob!
/build/source/validation/umd-test/test_graph.cpp:113: Failure
Expected: (graph) != (nullptr), actual: (nullptr) vs (nullptr)

[  FAILED  ] GraphApi.SetArgumentPropertiesReturnsCorrectError (0 ms)
[ RUN      ] GraphApi.GetArgumentPropertiesReturnsCorrectProperties
NPU_LOG: [DEVICE][vpu_device_context.cpp:34] VPUDeviceContext is created
NPU_LOG: *WARNING* [graph.cpp:291] Failed to get compiler properties!
NPU_LOG: [GRAPH][graph.cpp:394] ze_graph_desc_2_t = format: 0x2, pInput: 0x1285edb0, inputSize: 1808, flags: 0, pBuildFlags: --inputs_precisions="A:fp16 B:fp16 C:fp16" --inputs_layouts="A:C B:C C:C" --outputs_precisions="Y:fp16" --outputs_layouts="Y:C"
NPU_LOG: [CACHE][disk_cache.cpp:121] Cache missed using 9558e4b36b70c7c6a28ca2c6df5da254fb5bd6b0 key
NPU_LOG: *ERROR* [graph.cpp:418] Failed to get compiled blob!
/build/source/validation/umd-test/test_graph.cpp:131: Failure
Expected: (graph) != (nullptr), actual: (nullptr) vs (nullptr)

[  FAILED  ] GraphApi.GetArgumentPropertiesReturnsCorrectProperties (0 ms)
[ RUN      ] GraphApi.GetProperties2
NPU_LOG: [DEVICE][vpu_device_context.cpp:34] VPUDeviceContext is created
NPU_LOG: *WARNING* [graph.cpp:291] Failed to get compiler properties!
NPU_LOG: [GRAPH][graph.cpp:394] ze_graph_desc_2_t = format: 0x2, pInput: 0x1285edb0, inputSize: 1808, flags: 0, pBuildFlags: --inputs_precisions="A:fp16 B:fp16 C:fp16" --inputs_layouts="A:C B:C C:C" --outputs_precisions="Y:fp16" --outputs_layouts="Y:C"
NPU_LOG: [CACHE][disk_cache.cpp:121] Cache missed using 9558e4b36b70c7c6a28ca2c6df5da254fb5bd6b0 key
NPU_LOG: *ERROR* [graph.cpp:418] Failed to get compiled blob!
/build/source/validation/umd-test/test_graph.cpp:177: Failure
Expected: (graph) != (nullptr), actual: (nullptr) vs (nullptr)

[  FAILED  ] GraphApi.GetProperties2 (0 ms)
[----------] 8 tests from GraphApi (2 ms total)

[----------] 5 tests from CommandGraphLongThreaded
[ RUN      ] CommandGraphLongThreaded.RunInferenceUseCommandQueueSynchronize/add_abc_2_Threads
NPU_LOG: [DEVICE][vpu_device_context.cpp:34] VPUDeviceContext is created
NPU_LOG: [CMDQUEUE][cmdqueue.cpp:87] CommandQueue created - 0x12822630
NPU_LOG: [CMDLIST][cmdlist.cpp:84] CommandList created - 0x128384b0
NPU_LOG: *WARNING* [graph.cpp:291] Failed to get compiler properties!
NPU_LOG: [GRAPH][graph.cpp:394] ze_graph_desc_2_t = format: 0x2, pInput: 0x7f6368001720, inputSize: 1808, flags: 0, pBuildFlags: --inputs_precisions="A:fp16 B:fp16 C:fp16" --inputs_layouts="A:C B:C C:C" --outputs_precisions="Y:fp16" --outputs_layouts="Y:C"
NPU_LOG: *WARNING* [graph.cpp:291] Failed to get compiler properties!
NPU_LOG: [GRAPH][graph.cpp:394] ze_graph_desc_2_t = format: 0x2, pInput: 0x7f6370001720, inputSize: 1808, flags: 0, pBuildFlags: --inputs_precisions="A:fp16 B:fp16 C:fp16" --inputs_layouts="A:C B:C C:C" --outputs_precisions="Y:fp16" --outputs_layouts="Y:C"
NPU_LOG: [CACHE][disk_cache.cpp:121] Cache missed using 9558e4b36b70c7c6a28ca2c6df5da254fb5bd6b0 key
NPU_LOG: *ERROR* [graph.cpp:418] Failed to get compiled blob!
NPU_LOG: [CACHE][disk_cache.cpp:121] Cache missed using 9558e4b36b70c7c6a28ca2c6df5da254fb5bd6b0 key
NPU_LOG: *ERROR* [graph.cpp:418] Failed to get compiled blob!
NPU_LOG: [CMDQUEUE][cmdqueue.cpp:87] CommandQueue created - 0x7f6368000cf0
NPU_LOG: [CMDLIST][cmdlist.cpp:84] CommandList created - 0x7f6368001370
NPU_LOG: [CMDQUEUE][cmdqueue.cpp:87] CommandQueue created - 0x7f6370000cf0
NPU_LOG: [CMDLIST][cmdlist.cpp:84] CommandList created - 0x7f6370001370

The failing tests seem to rely on the compiler, which I have not built yet.

it also implies building
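
For long runs like the one above, a small script can condense the googletest output into lists of passed and failed tests. This is only a minimal sketch: the function name `summarize_gtest` is made up for illustration, and it assumes the per-test result lines keep the `[  FAILED  ] Name (N ms)` shape shown in the log.

```python
import re

# Matches per-test googletest result lines such as
#   "[  FAILED  ] GraphApi.GetProperties2 (0 ms)"
#   "[       OK ] SomeSuite.SomeTest (3 ms)"
# Lines without a "(N ms)" suffix (the end-of-run recap) are skipped on purpose
# so that each test is counted once.
RESULT_LINE = re.compile(r"^\[\s*(OK|FAILED)\s*\]\s+(\S+)\s+\(\d+ ms\)\s*$")


def summarize_gtest(log_text):
    """Return (passed, failed) lists of test names found in a gtest log."""
    passed, failed = [], []
    for line in log_text.splitlines():
        match = RESULT_LINE.match(line)
        if match is None:
            continue
        status, name = match.groups()
        (passed if status == "OK" else failed).append(name)
    return passed, failed


# Two lines copied from the log in this comment.
SAMPLE = """[ RUN      ] GraphApi.GetProperties2
[  FAILED  ] GraphApi.GetProperties2 (0 ms)
"""
passed, failed = summarize_gtest(SAMPLE)
print(f"{len(passed)} passed, {len(failed)} failed: {failed}")
```

Feeding it the full test output should make it easier to check whether every failure belongs to a suite that needs the compiler.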

ferrine commented 1 week ago

Ok, I managed to build everything. However, I see the issue coming from an unexpected place:

>>> core.get_property("NPU", props.device.full_name)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: Exception from src/inference/src/cpp/core.cpp:214:
Cannot load library '/nix/store/9fgniggdbmqfz1i849fbvfc7a02j7qih-openvino-2024.4.1/runtime/lib/intel64/libopenvino_intel_npu_plugin.so': /nix/store/zrf9waqkpibfp8rlcc92zi2md8isprpx-gcc-12.4.0-lib/lib/libstdc++.so.6: version `GLIBCXX_3.4.32' not found (required by /nix/store/in99fi35k65maz0n521sinwqgbqqbkrl-level-zero-1.18.3/lib/libze_loader.so.1)

[nix-shell:~/lore]# ldd /nix/store/in99fi35k65maz0n521sinwqgbqqbkrl-level-zero-1.18.3/lib/libze_loader.so.1
        linux-vdso.so.1 (0x00007f48e5476000)
        libdl.so.2 => /nix/store/3bvxjkkmwlymr0fssczhgi39c3aj1l7i-glibc-2.40-36/lib/libdl.so.2 (0x00007f48e535f000)
        libstdc++.so.6 => /nix/store/s94fwp43xhzkvw8l8nqslskib99yifzi-gcc-13.3.0-lib/lib/libstdc++.so.6 (0x00007f48e5000000)
        libm.so.6 => /nix/store/3bvxjkkmwlymr0fssczhgi39c3aj1l7i-glibc-2.40-36/lib/libm.so.6 (0x00007f48e5278000)
        libgcc_s.so.1 => /nix/store/s94fwp43xhzkvw8l8nqslskib99yifzi-gcc-13.3.0-lib/lib/libgcc_s.so.1 (0x00007f48e4fdb000)
        libc.so.6 => /nix/store/3bvxjkkmwlymr0fssczhgi39c3aj1l7i-glibc-2.40-36/lib/libc.so.6 (0x00007f48e4de2000)
        /nix/store/3bvxjkkmwlymr0fssczhgi39c3aj1l7i-glibc-2.40-36/lib64/ld-linux-x86-64.so.2 (0x00007f48e5478000)
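
The error says that the libstdc++ already in use does not provide `GLIBCXX_3.4.32`. One way to check which version tags a given libstdc++.so.6 carries is to scan the file for them, roughly what `strings libstdc++.so.6 | grep GLIBCXX` would do. The sketch below is a heuristic byte scan, not a real ELF parser, and the function names are made up for illustration.

```python
import re

# Version tags are stored as NUL-terminated strings such as b"GLIBCXX_3.4.32\x00".
# Requiring digits after the underscore skips unrelated names like
# GLIBCXX_DEBUG_MESSAGE_LENGTH.
GLIBCXX_TAG = re.compile(rb"GLIBCXX_(\d+(?:\.\d+)*)\x00")


def glibcxx_versions(data):
    """Return the sorted GLIBCXX version tuples found in raw library bytes."""
    found = set()
    for match in GLIBCXX_TAG.finditer(data):
        found.add(tuple(int(part) for part in match.group(1).split(b".")))
    return sorted(found)


def glibcxx_versions_in_file(path):
    """Read a shared object (for example a libstdc++.so.6) and scan it."""
    with open(path, "rb") as handle:
        return glibcxx_versions(handle.read())
```

Calling `glibcxx_versions_in_file` on the two libstdc++.so.6 store paths from the output above and checking whether `(3, 4, 32)` is in each result would show which of them can satisfy libze_loader.so.1.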

ferrine commented 1 week ago

If anyone is knowledgeable about this kind of issue, help would be much appreciated.

I had to create two patches for openvino to build the NPU compiler and the plugin. It took some time, but it is not impossible.

ferrine commented 1 week ago

CC @ziguana, who is among the level-zero maintainers; maybe there is something relevant to know.

ferrine commented 6 days ago

After reading https://github.com/NixOS/nixpkgs/issues/287764, I think the issue is in a different place. CC @SomeoneSerge.

I have a minimal NixOS setup with nothing but the driver, openvino, and level-zero.

Here is the ldd output for openvino and level-zero. They point to different libstdc++.so.6 builds (gcc 12.4.0 vs. gcc 13.3.0):

root@nixos:~/lore/ > ldd /nix/store/9fgniggdbmqfz1i849fbvfc7a02j7qih-openvino-2024.4.1/runtime/lib/intel64/libopenvino_intel_npu_plugin.so
/nix/store/9fgniggdbmqfz1i849fbvfc7a02j7qih-openvino-2024.4.1/runtime/lib/intel64/libopenvino_intel_npu_plugin.so: /nix/store/zrf9waqkpibfp8rlcc92zi2md8isprpx-gcc-12.4.0-lib/lib/libstdc++.so.6: version `GLIBCXX_3.4.32' not found (required by /nix/store/in99fi35k65maz0n521sinwqgbqqbkrl-level-zero-1.18.3/lib/libze_loader.so.1)
        linux-vdso.so.1 (0x00007fc872726000)
        libopenvino.so.2441 => /nix/store/9fgniggdbmqfz1i849fbvfc7a02j7qih-openvino-2024.4.1/runtime/lib/intel64/libopenvino.so.2441 (0x00007fc871400000)
        libdl.so.2 => /nix/store/3bvxjkkmwlymr0fssczhgi39c3aj1l7i-glibc-2.40-36/lib/libdl.so.2 (0x00007fc87271b000)
        libtbb.so.12 => /nix/store/2v7ljf8fqxknlnmr1l09f61q1vm272fb-tbb-2021.5.0/lib/libtbb.so.12 (0x00007fc8726c4000)
        libze_loader.so.1 => /nix/store/in99fi35k65maz0n521sinwqgbqqbkrl-level-zero-1.18.3/lib/libze_loader.so.1 (0x00007fc8712f4000)
        libstdc++.so.6 => /nix/store/zrf9waqkpibfp8rlcc92zi2md8isprpx-gcc-12.4.0-lib/lib/libstdc++.so.6 (0x00007fc871000000)
        libm.so.6 => /nix/store/3bvxjkkmwlymr0fssczhgi39c3aj1l7i-glibc-2.40-36/lib/libm.so.6 (0x00007fc870f19000)
        libgcc_s.so.1 => /nix/store/zrf9waqkpibfp8rlcc92zi2md8isprpx-gcc-12.4.0-lib/lib/libgcc_s.so.1 (0x00007fc8723df000)
        libc.so.6 => /nix/store/3bvxjkkmwlymr0fssczhgi39c3aj1l7i-glibc-2.40-36/lib/libc.so.6 (0x00007fc870d20000)
        /nix/store/3bvxjkkmwlymr0fssczhgi39c3aj1l7i-glibc-2.40-36/lib64/ld-linux-x86-64.so.2 (0x00007fc872728000)
root@nixos:~/lore/ > ldd /nix/store/in99fi35k65maz0n521sinwqgbqqbkrl-level-zero-1.18.3/lib/libze_loader.so.1
        linux-vdso.so.1 (0x00007fbf0ba99000)
        libdl.so.2 => /nix/store/3bvxjkkmwlymr0fssczhgi39c3aj1l7i-glibc-2.40-36/lib/libdl.so.2 (0x00007fbf0b982000)
        libstdc++.so.6 => /nix/store/s94fwp43xhzkvw8l8nqslskib99yifzi-gcc-13.3.0-lib/lib/libstdc++.so.6 (0x00007fbf0b600000)
        libm.so.6 => /nix/store/3bvxjkkmwlymr0fssczhgi39c3aj1l7i-glibc-2.40-36/lib/libm.so.6 (0x00007fbf0b89b000)
        libgcc_s.so.1 => /nix/store/s94fwp43xhzkvw8l8nqslskib99yifzi-gcc-13.3.0-lib/lib/libgcc_s.so.1 (0x00007fbf0b876000)
        libc.so.6 => /nix/store/3bvxjkkmwlymr0fssczhgi39c3aj1l7i-glibc-2.40-36/lib/libc.so.6 (0x00007fbf0b407000)
        /nix/store/3bvxjkkmwlymr0fssczhgi39c3aj1l7i-glibc-2.40-36/lib64/ld-linux-x86-64.so.2 (0x00007fbf0ba9b000)

https://github.com/NixOS/nixpkgs/blame/d211936a6614051aa950cf92290a500a350872e8/pkgs/by-name/op/openvino/package.nix#L39
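
The comparison of the two ldd outputs above can also be done mechanically. The following is a rough sketch with hypothetical helper names (`resolved_libs`, `conflicting_libs`); it assumes the usual `name => path (0xaddress)` line format and uses two lines from each output above as sample input.

```python
import re

# An ldd line looks like:
#   libstdc++.so.6 => /nix/store/<hash>-gcc-13.3.0-lib/lib/libstdc++.so.6 (0x...)
LDD_LINE = re.compile(r"^\s*(\S+) => (\S+) \(0x[0-9a-f]+\)\s*$")


def resolved_libs(ldd_output):
    """Map each library name in ldd output to the path it resolved to."""
    libs = {}
    for line in ldd_output.splitlines():
        match = LDD_LINE.match(line)
        if match:
            libs[match.group(1)] = match.group(2)
    return libs


def conflicting_libs(first, second):
    """Return names that the two ldd outputs resolve to different paths."""
    a, b = resolved_libs(first), resolved_libs(second)
    return {
        name: (a[name], b[name])
        for name in sorted(a.keys() & b.keys())
        if a[name] != b[name]
    }


# Lines copied from the ldd output of the NPU plugin and of libze_loader.so.1.
PLUGIN_LDD = """
        libstdc++.so.6 => /nix/store/zrf9waqkpibfp8rlcc92zi2md8isprpx-gcc-12.4.0-lib/lib/libstdc++.so.6 (0x00007fc871000000)
        libc.so.6 => /nix/store/3bvxjkkmwlymr0fssczhgi39c3aj1l7i-glibc-2.40-36/lib/libc.so.6 (0x00007fc870d20000)
"""
LOADER_LDD = """
        libstdc++.so.6 => /nix/store/s94fwp43xhzkvw8l8nqslskib99yifzi-gcc-13.3.0-lib/lib/libstdc++.so.6 (0x00007fbf0b600000)
        libc.so.6 => /nix/store/3bvxjkkmwlymr0fssczhgi39c3aj1l7i-glibc-2.40-36/lib/libc.so.6 (0x00007fbf0b407000)
"""

for name, (plugin_path, loader_path) in conflicting_libs(PLUGIN_LDD, LOADER_LDD).items():
    print(f"{name}:\n  plugin: {plugin_path}\n  loader: {loader_path}")
```

A process normally ends up with a single libstdc++.so.6 loaded, so when the gcc 12.4.0 copy is picked up first, a library built against the gcc 13.3.0 copy can fail with a missing `GLIBCXX_*` version, which would be consistent with the error shown above.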

ferrine commented 6 days ago

Finally, I fixed the odd openvino stdenv, and the rest of the puzzle started working:

root@nixos:~/ > LD_LIBRARY_PATH=/nix/var/nix/profiles/system/sw/lib/ python -c 'import openvino as ov;import openvino.properties as props;core = ov.Core();print(core.get_property("NPU", props.device.full_name))'
Intel(R) AI Boost
AVAILABLE_DEVICES               : ['3720']
CACHE_DIR                       :
COMPILATION_NUM_THREADS         : 22
DEVICE_ARCHITECTURE             : 3720
DEVICE_GOPS                     : {<Type: 'float16'>: 0.0, <Type: 'float32'>: 0.0, <Type: 'int8_t'>: 0.0, <Type: 'uint8_t'>: 0.0}
DEVICE_ID                       :
DEVICE_PCI_INFO                 : {domain: 0 bus: 0 device: 0xb function: 0}
DEVICE_TYPE                     : Type.INTEGRATED
DEVICE_UUID                     : 80d1d11eb73811eab3de0242ac130004
ENABLE_CPU_PINNING              : False
EXECUTION_DEVICES               : NPU
EXECUTION_MODE_HINT             : ExecutionMode.PERFORMANCE
FULL_DEVICE_NAME                : Intel(R) AI Boost
INFERENCE_PRECISION_HINT        : <Type: 'float16'>
LOG_LEVEL                       : Level.ERR
MODEL_PRIORITY                  : Priority.MEDIUM
NPU_BYPASS_UMD_CACHING          : False
NPU_COMPILATION_MODE_PARAMS     :
NPU_DEVICE_ALLOC_MEM_SIZE       : 0
NPU_DEVICE_TOTAL_MEM_SIZE       : 33138380800
NPU_DRIVER_VERSION              : 315532800
NPU_MAX_TILES                   : 2
NPU_TILES                       : -1
NPU_TURBO                       : False
NUM_STREAMS                     : 1
OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1
OPTIMIZATION_CAPABILITIES       : ['FP16', 'INT8', 'EXPORT_IMPORT']
PERFORMANCE_HINT                : PerformanceMode.LATENCY
PERFORMANCE_HINT_NUM_REQUESTS   : 1
PERF_COUNT                      : False
RANGE_FOR_ASYNC_INFER_REQUESTS  : (1, 10, 1)
RANGE_FOR_STREAMS               : (1, 4)
WORKLOAD_TYPE                   : WorkloadType.DEFAULT
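
The one-liner above can be expanded into a small script that reports a plugin load failure instead of ending in a traceback. This is a sketch only: it uses the same `ov.Core()` and `core.get_property("NPU", props.device.full_name)` calls as the one-liner, the helper name `query_device_name` is made up, and catching `RuntimeError` is based on the exception type seen in the earlier traceback.

```python
def query_device_name(core, name_property, device="NPU"):
    """Return the device's full name, or None if the query raises RuntimeError."""
    try:
        return core.get_property(device, name_property)
    except RuntimeError as error:
        # This is the exception type raised earlier in the thread when the
        # NPU plugin could not be loaded because of the libstdc++ mismatch.
        print(f"{device} query failed: {error}")
        return None


def main():
    # Imported here so the helper above stays usable without openvino installed.
    import openvino as ov
    import openvino.properties as props

    name = query_device_name(ov.Core(), props.device.full_name)
    print(name if name is not None else "NPU not usable")
```

Calling `main()` on the machine from this thread would be expected to print the device name shown above, provided the plugin loads.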

Yet I am still having the same issues with the umd-tests:

[ RUN      ] GraphApi.GetNativeBinaryUsingMemcpy
/build/source/validation/umd-test/test_graph.cpp:70: Failure
Expected: (graph) != (nullptr), actual: (nullptr) vs (nullptr)

[  FAILED  ] GraphApi.GetNativeBinaryUsingMemcpy (0 ms)
[ RUN      ] GraphApi.GetNativeBinaryWithoutMemcpy
/build/source/validation/umd-test/test_graph.cpp:85: Failure
Expected: (graph) != (nullptr), actual: (nullptr) vs (nullptr)

[  FAILED  ] GraphApi.GetNativeBinaryWithoutMemcpy (0 ms)
[ RUN      ] GraphApi.AppendGraphInitAndExecuteReturnsCorrectError
/build/source/validation/umd-test/test_graph.cpp:97: Failure
Expected: (graph) != (nullptr), actual: (nullptr) vs (nullptr)

[  FAILED  ] GraphApi.AppendGraphInitAndExecuteReturnsCorrectError (0 ms)
[ RUN      ] GraphApi.SetArgumentPropertiesReturnsCorrectError
/build/source/validation/umd-test/test_graph.cpp:113: Failure
Expected: (graph) != (nullptr), actual: (nullptr) vs (nullptr)

[  FAILED  ] GraphApi.SetArgumentPropertiesReturnsCorrectError (0 ms)
[ RUN      ] GraphApi.GetArgumentPropertiesReturnsCorrectProperties
/build/source/validation/umd-test/test_graph.cpp:131: Failure
Expected: (graph) != (nullptr), actual: (nullptr) vs (nullptr)

[  FAILED  ] GraphApi.GetArgumentPropertiesReturnsCorrectProperties (0 ms)
[ RUN      ] GraphApi.GetProperties2
/build/source/validation/umd-test/test_graph.cpp:177: Failure
Expected: (graph) != (nullptr), actual: (nullptr) vs (nullptr)

[  FAILED  ] GraphApi.GetProperties2 (0 ms)
[----------] 8 tests from GraphApi (2 ms total)

[----------] 5 tests from CommandGraphLongThreaded
[ RUN      ] CommandGraphLongThreaded.RunInferenceUseCommandQueueSynchronize/add_abc_2_Threads
zsh: segmentation fault (core dumped)  LD_LIBRARY_PATH=/nix/var/nix/profiles/system/sw/lib npu-umd-test

ferrine commented 2 days ago

It was late at night and I was browsing the wrong repo for issues; I referenced this issue by mistake.