microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

[Performance] It is not possible to use a discrete graphics card with DML. #19025

Open NeuralAIM opened 9 months ago

NeuralAIM commented 9 months ago

Describe the issue

Q: How can I use the NVIDIA graphics card on a laptop?

With device_id = 0, only the CPU is loaded (no console output).
With device_id = 1, the CPU and the Intel graphics card are loaded (no console output).
With device_id = 2, only the CPU is loaded, and a message says that the processor will be used.
...
Strangely: (DML device(0) == CPU | device(1) == device(0) | device(2) - Not Found)

Q: Why are the graphics card numbers different?

DXGI Adapter 0:
===============

DXGI_ADAPTER_DESC3:
-------------------
    Description = NVIDIA GeForce GTX 1050
DXGI Adapter 1:
===============

DXGI_ADAPTER_DESC3:
-------------------
    Description = Intel(R) HD Graphics 630
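
To compare adapter numbering across machines, the DXGI order can be pulled out of the `D3d12info.exe --List` text with a few lines of Python. This is only a sketch: `list_dxgi_adapters` is a hypothetical helper (not part of D3d12info or onnxruntime), and it assumes the report keeps the `DXGI Adapter N:` / `Description = ...` layout shown above. Whether the DirectML `device_id` follows this same order is not confirmed here.

```python
import re


def list_dxgi_adapters(report):
    """Return {adapter_index: description} parsed from D3d12info-style text."""
    adapters = {}
    current = None
    for line in report.splitlines():
        header = re.match(r"\s*DXGI Adapter (\d+):", line)
        if header:
            current = int(header.group(1))
            continue
        desc = re.match(r"\s*Description = (.+)", line)
        if desc and current is not None:
            # Keep only the first Description line seen for each adapter.
            adapters.setdefault(current, desc.group(1).strip())
    return adapters


# Excerpt copied from the report above.
sample = """\
DXGI Adapter 0:
===============

DXGI_ADAPTER_DESC3:
-------------------
    Description = NVIDIA GeForce GTX 1050
DXGI Adapter 1:
===============

DXGI_ADAPTER_DESC3:
-------------------
    Description = Intel(R) HD Graphics 630
"""

print(list_dxgi_adapters(sample))
```

The resulting mapping can then be set side by side with the GPU numbers shown in Task Manager.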

Task Manager screenshot: image

>D3d12info.exe --List

============================
D3D12INFO 2.2.0
Built: Dec 16 2023, 14:19:18
Configuration: Release, 64-bit
============================

General:
========
    Current date = 2024-01-05
    D3D12_SDK_VERSION = 611
    NvAPI compiled version = R535-developer
    NvAPI_GetInterfaceVersionString = NVidia Complete Version 1.10
    AMD_AGS_VERSION = 6.2.0
    agsGetVersionNumber = 6.2.0
    Intel GPU Detect compiled version = 2023-07-18

OSVERSIONINFOEX:
----------------
    dwMajorVersion = 10
    dwMinorVersion = 0
    dwBuildNumber = 19045
    dwPlatformId = 2
    szCSDVersion =
    wServicePackMajor = 0
    wServicePackMinor = 0
    wSuiteMask = 0x100
        VER_SUITE_SINGLEUSERTS
    wProductType = VER_SUITE_SMALLBUSINESS (0x1)

System memory:
--------------
    GetPhysicallyInstalledSystemMemory = 8388608 (0x800000) KB (8.00 GB)
    MEMORYSTATUSEX::ullTotalPhys = 8458940416 (0x1f8313000) (7.88 GB)
    MEMORYSTATUSEX::ullTotalPageFile = 15438262272 (0x398313000) (14.38 GB)
    MEMORYSTATUSEX::ullTotalVirtual = 140737488224256 (0x7ffffffe0000) (128.00 TB)

NvAPI_SYS_GetDriverAndBranchVersion:
------------------------------------
    pDriverVersion = 54633
    szBuildBranchString = r545_00

NvAPI_SYS_GetDisplayDriverInfo - NV_DISPLAY_DRIVER_INFO:
--------------------------------------------------------
    driverVersion = 54633
    szBuildBranch = r545_00
    bIsDCHDriver = TRUE
    bIsNVIDIAStudioPackage = FALSE
    bIsNVIDIAGameReadyPackage = TRUE
    bIsNVIDIARTXProductionBranchPackage = FALSE
    bIsNVIDIARTXNewFeatureBranchPackage = FALSE
    szBuildBaseBranch = R545

D3D12EnableExperimentalFeatures:
--------------------------------

DXGI_FEATURE_PRESENT_ALLOW_TEARING:
-----------------------------------
    allowTearing = TRUE

DXGI Adapter 0:
===============

DXGI_ADAPTER_DESC3:
-------------------
    Description = NVIDIA GeForce GTX 1050
    VendorId = NVIDIA (0x10DE)
    DeviceId = 0x1C8D
    SubSysId = 0x39D117AA
    Revision = 0xA1
    DedicatedVideoMemory = 4215275520 (0xfb400000) (3.93 GB)
    DedicatedSystemMemory = 0
    SharedSystemMemory = 4229470208 (0xfc189800) (3.94 GB)
    AdapterLuid = 00000000-0000F51B
    Flags = 0x2C
        DXGI_ADAPTER_FLAG3_ACG_COMPATIBLE
        DXGI_ADAPTER_FLAG3_SUPPORT_MONITORED_FENCES
        DXGI_ADAPTER_FLAG3_KEYED_MUTEX_CONFORMANCE
    GraphicsPreemptionGranularity = DXGI_GRAPHICS_PREEMPTION_PIXEL_BOUNDARY (0x3)
    ComputePreemptionGranularity = DXGI_COMPUTE_PREEMPTION_DISPATCH_BOUNDARY (0x1)

DXGI_QUERY_VIDEO_MEMORY_INFO[DXGI_MEMORY_SEGMENT_GROUP_LOCAL]:
--------------------------------------------------------------
    Budget = 3582984192 (0xd5900000) (3.34 GB)
    AvailableForReservation = 1896873984 (0x71100000) (1.77 GB)

DXGI_QUERY_VIDEO_MEMORY_INFO[DXGI_MEMORY_SEGMENT_GROUP_NON_LOCAL]:
------------------------------------------------------------------
    Budget = 3806523188 (0xe2e2ef34) (3.55 GB)
    AvailableForReservation = 2008998349 (0x77bee1cd) (1.87 GB)

CheckInterfaceSupport:
----------------------
    IDXGIDevice (user mode driver version) = 31.0.15.4633

NvPhysicalGpuHandle:
--------------------
    NvAPI_GPU_GetSystemType = NV_SYSTEM_TYPE_LAPTOP (0x1)
    NvAPI_GPU_GetFullName = NVIDIA GeForce GTX 1050
    NvAPI_GPU_GetPCIIdentifiers - pDeviceID = 0x1C8D10DE
    NvAPI_GPU_GetPCIIdentifiers - pSubSystemId = 0x39D117AA
    NvAPI_GPU_GetPCIIdentifiers - pRevisionId = 0xA1
    NvAPI_GPU_GetPCIIdentifiers - pExtDeviceId = 0x1C8D
    NvAPI_GPU_GetGPUType = NV_SYSTEM_TYPE_DGPU (0x2)
    NvAPI_GPU_GetBusType = NVAPI_GPU_BUS_TYPE_PCI_EXPRESS (0x3)
    NvAPI_GPU_GetVbiosRevision = 2248620544
    NvAPI_GPU_GetVbiosOEMRevision = 40
    NvAPI_GPU_GetVbiosVersionString = 86.07.3a.00.28
    NvAPI_GPU_GetPhysicalFrameBufferSize = 4194176 (0x3fff80) KB (4.00 GB)
    NvAPI_GPU_GetVirtualFrameBufferSize = 4194304 (0x400000) KB (4.00 GB)
    NvAPI_GPU_GetArchInfo - NV_GPU_ARCH_INFO::architecture_id = NV_GPU_ARCHITECTURE_GP100 (0x130)
    NvAPI_GPU_GetArchInfo - NV_GPU_ARCH_INFO::implementation_id = NV_GPU_ARCH_IMPLEMENTATION_NV47 (0x7)
    NvAPI_GPU_GetArchInfo - NV_GPU_ARCH_INFO::revision_id = 0xA1
    NvAPI_GPU_GetVRReadyData - NV_GPU_VR_READY::isVRReady = FALSE
    NvAPI_GPU_QueryIlluminationSupport(NV_GPU_IA_LOGO_BRIGHTNESS) = FALSE
    NvAPI_GPU_QueryIlluminationSupport(NV_GPU_IA_SLI_BRIGHTNESS) = FALSE
    NvAPI_GPU_QueryWorkstationFeatureSupport(NV_GPU_WORKSTATION_FEATURE_TYPE_NVIDIA_RTX_VR_READY) = NVAPI_NOT_SUPPORTED (-104)
    NvAPI_GPU_QueryWorkstationFeatureSupport(NV_GPU_WORKSTATION_FEATURE_TYPE_PROVIZ) = NVAPI_NOT_SUPPORTED (-104)
    NvAPI_GPU_GetMemoryInfoEx - NV_GPU_MEMORY_INFO_EX::dedicatedVideoMemory = 4294967296 (0x100000000) (4.00 GB)
    NvAPI_GPU_GetMemoryInfoEx - NV_GPU_MEMORY_INFO_EX::availableDedicatedVideoMemory = 4215275520 (0xfb400000) (3.93 GB)
    NvAPI_GPU_GetMemoryInfoEx - NV_GPU_MEMORY_INFO_EX::systemVideoMemory = 0
    NvAPI_GPU_GetMemoryInfoEx - NV_GPU_MEMORY_INFO_EX::sharedSystemMemory = 4229468160 (0xfc189000) (3.94 GB)
    NvAPI_GPU_GetMemoryInfoEx - NV_GPU_MEMORY_INFO_EX::curAvailableDedicatedVideoMemory = 4095070208 (0xf415d000) (3.81 GB)
    NvAPI_GPU_GetMemoryInfoEx - NV_GPU_MEMORY_INFO_EX::dedicatedVideoMemoryEvictionsSize = 0
    NvAPI_GPU_GetMemoryInfoEx - NV_GPU_MEMORY_INFO_EX::dedicatedVideoMemoryEvictionCount = 0
    NvAPI_GPU_GetMemoryInfoEx - NV_GPU_MEMORY_INFO_EX::dedicatedVideoMemoryPromotionsSize = 393216 (0x60000) (384.00 KB)
    NvAPI_GPU_GetMemoryInfoEx - NV_GPU_MEMORY_INFO_EX::dedicatedVideoMemoryPromotionCount = 3
    NvAPI_GPU_GetShaderSubPipeCount = 5
    NvAPI_GPU_GetGpuCoreCount = 640
    NvAPI_GPU_GetECCStatusInfo - NV_GPU_ECC_STATUS_INFO::isSupported = FALSE
    NvAPI_GPU_GetECCStatusInfo - NV_GPU_ECC_STATUS_INFO::configurationOptions = NV_ECC_CONFIGURATION_NOT_SUPPORTED (0x0)
    NvAPI_GPU_GetECCStatusInfo - NV_GPU_ECC_STATUS_INFO::isEnabled = FALSE
    NvAPI_GPU_GetRamBusWidth = 128

VkPhysicalDeviceProperties:
---------------------------
    apiVersion = 1.3.260
    driverVersion = 2290630656
    vendorID = 0x10DE
    deviceID = 0x1C8D
    deviceType = VK_PHYSICAL_DEVICE_TYPE_DISCRETE_GPU (0x2)
    deviceName = NVIDIA GeForce GTX 1050

VkPhysicalDeviceIDProperties:
-----------------------------
    deviceUUID = AC28F0626245231CE2D1473234064DC8
    driverUUID = C2A04F03A27E5FB3A136DD945528780D
    deviceLUID = 1BF5000000000000

VkPhysicalDeviceVulkan12Properties:
-----------------------------------
    driverID = VK_DRIVER_ID_NVIDIA_PROPRIETARY (0x4)
    driverName = NVIDIA
    driverInfo = 546.33

DXGI Adapter 1:
===============

DXGI_ADAPTER_DESC3:
-------------------
    Description = Intel(R) HD Graphics 630
    VendorId = Intel (0x8086)
    DeviceId = 0x591B
    SubSysId = 0x39D117AA
    Revision = 0x4
    DedicatedVideoMemory = 134217728 (0x8000000) (128.00 MB)
    DedicatedSystemMemory = 0
    SharedSystemMemory = 4229470208 (0xfc189800) (3.94 GB)
    AdapterLuid = 00000000-0000F0C0
    Flags = 0x2C
        DXGI_ADAPTER_FLAG3_ACG_COMPATIBLE
        DXGI_ADAPTER_FLAG3_SUPPORT_MONITORED_FENCES
        DXGI_ADAPTER_FLAG3_KEYED_MUTEX_CONFORMANCE
    GraphicsPreemptionGranularity = DXGI_GRAPHICS_PREEMPTION_TRIANGLE_BOUNDARY (0x2)
    ComputePreemptionGranularity = DXGI_COMPUTE_PREEMPTION_THREAD_BOUNDARY (0x3)

DXGI_QUERY_VIDEO_MEMORY_INFO[DXGI_MEMORY_SEGMENT_GROUP_LOCAL]:
--------------------------------------------------------------
    Budget = 3806523188 (0xe2e2ef34) (3.55 GB)
    AvailableForReservation = 2008998349 (0x77bee1cd) (1.87 GB)

DXGI_QUERY_VIDEO_MEMORY_INFO[DXGI_MEMORY_SEGMENT_GROUP_NON_LOCAL]:
------------------------------------------------------------------
    Budget = 0
    AvailableForReservation = 0

CheckInterfaceSupport:
----------------------
    IDXGIDevice (user mode driver version) = 31.0.101.2125

VkPhysicalDeviceProperties:
---------------------------
    apiVersion = 1.3.215
    driverVersion = 1656909
    vendorID = 0x8086
    deviceID = 0x591B
    deviceType = VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU (0x1)
    deviceName = Intel(R) HD Graphics 630

VkPhysicalDeviceIDProperties:
-----------------------------
    deviceUUID = 86801B59040000000000000000000000
    driverUUID = 33312E302E3130312E32313235000000
    deviceLUID = C0F0000000000000

VkPhysicalDeviceVulkan12Properties:
-----------------------------------
    driverID = VK_DRIVER_ID_INTEL_PROPRIETARY_WINDOWS (0x5)
    driverName = Intel Corporation
    driverInfo = Intel driver

DXGI Adapter 2:
===============

DXGI_ADAPTER_DESC3:
-------------------
    Description = Microsoft Basic Render Driver
    VendorId = Microsoft (0x1414)
    DeviceId = 0x8C
    SubSysId = 0x0
    Revision = 0x0
    DedicatedVideoMemory = 0
    DedicatedSystemMemory = 0
    SharedSystemMemory = 4229470208 (0xfc189800) (3.94 GB)
    AdapterLuid = 00000000-0000F4C4
    Flags = 0x2E
        DXGI_ADAPTER_FLAG_SOFTWARE
        DXGI_ADAPTER_FLAG3_ACG_COMPATIBLE
        DXGI_ADAPTER_FLAG3_SUPPORT_MONITORED_FENCES
        DXGI_ADAPTER_FLAG3_KEYED_MUTEX_CONFORMANCE
    GraphicsPreemptionGranularity = DXGI_GRAPHICS_PREEMPTION_INSTRUCTION_BOUNDARY (0x4)
    ComputePreemptionGranularity = DXGI_COMPUTE_PREEMPTION_INSTRUCTION_BOUNDARY (0x4)

DXGI_QUERY_VIDEO_MEMORY_INFO[DXGI_MEMORY_SEGMENT_GROUP_LOCAL]:
--------------------------------------------------------------
    Budget = 3806523188 (0xe2e2ef34) (3.55 GB)
    AvailableForReservation = 2008998349 (0x77bee1cd) (1.87 GB)

DXGI_QUERY_VIDEO_MEMORY_INFO[DXGI_MEMORY_SEGMENT_GROUP_NON_LOCAL]:
------------------------------------------------------------------
    Budget = 0
    AvailableForReservation = 0

CheckInterfaceSupport:
----------------------
    IDXGIDevice (user mode driver version) = 10.0.19041.3636

I tested this on other computers that have only one graphics card (3070 Ti or 3060), and everything worked as expected.

To reproduce

import onnxruntime

session = onnxruntime.InferenceSession(w)  # w: path to the ONNX model
session.set_providers(['DmlExecutionProvider'], [{'device_id': 1}])  # tried 0, 1, ...
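
A possible way to narrow this down is to request DirectML when the session is created, rather than through `set_providers`, and then check which providers were actually registered. The sketch below assumes the installed onnxruntime-directml build accepts `providers` as `(name, options)` tuples; `dml_providers` and `probe_devices` are hypothetical helper names, and `w` is the model path from the snippet above.

```python
def dml_providers(device_id):
    """Build a providers list requesting DirectML on one adapter, with CPU fallback."""
    return [("DmlExecutionProvider", {"device_id": device_id}), "CPUExecutionProvider"]


def probe_devices(model_path, device_ids=(0, 1, 2)):
    """Create one session per device id and record which providers were registered."""
    import onnxruntime  # imported lazily so dml_providers works without the package

    results = {}
    for device_id in device_ids:
        try:
            session = onnxruntime.InferenceSession(
                model_path, providers=dml_providers(device_id)
            )
            results[device_id] = session.get_providers()
        except Exception as exc:  # e.g. the requested adapter index does not exist
            results[device_id] = repr(exc)
    return results
```

Calling `print(probe_devices(w))` on the laptop would show, for each `device_id`, whether `DmlExecutionProvider` was registered or session creation failed. It does not show which physical adapter DirectML selected, so Task Manager is still needed for that.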

Platform

Windows

OS Version

Windows 10 22H2 and Windows 11 23H2

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.16.3

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU, DirectML

Execution Provider Library Version

onnxruntime-directml 1.16.3

Is this a quantized model?

No

NeuralAIM commented 9 months ago

The output with onnxruntime.set_default_logger_severity(0) is the same on computers with one video card and on the laptop with two cards. Likewise, choosing GPU:0 or GPU:1 on the laptop gives the same console output.