jiangcuo / Proxmox-Port

Proxmox VE arm64 riscv64 loongarch64
GNU Affero General Public License v3.0

Is it possible to use the RK3588 NPU inside a virtual machine? #130

Closed fevenor closed 2 weeks ago

fevenor commented 2 weeks ago

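I have read Resource_PassThrough and understand that PCIe passthrough is currently not possible. Reading further in VFIO details, I learned that QEMU on ARM platforms can pass through a VFIO_PLATFORM device with the following parameter: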

-device vfio-platform,sysfsdev=/sys/bus/platform/devices/ee300000.sata

Proxmox's qm command, however, has no such parameter. Is it still possible to pass the NPU device of a Rockchip platform directly through to a VM in Proxmox?

jiangcuo commented 2 weeks ago

Put it in args.
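
For readers following along, a minimal sketch of what this could look like. Proxmox VM configurations accept an `args` option whose value is passed on to QEMU, and it can be set with `qm set`. The VM ID `100` is a hypothetical placeholder, and the device name is simply the example from the QEMU parameter quoted in the question, not the NPU's actual platform device name. The script only prints the command so it can be reviewed before being run as root.

```shell
#!/bin/sh
# Hypothetical values: replace with your own VM ID and platform device name.
VMID=100
DEVICE=ee300000.sata   # example device taken from the question, not the NPU

# Build the extra QEMU argument for a vfio-platform device.
vfio_platform_arg() {
    printf '%s%s' '-device vfio-platform,sysfsdev=/sys/bus/platform/devices/' "$1"
}

# Print the command instead of executing it.
echo "qm set $VMID --args '$(vfio_platform_arg "$DEVICE")'"
```

The same value can also be written as an `args:` line in the VM's configuration file.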

fevenor commented 2 weeks ago

Put it in args.

Passing the NPU through to the VM seems to cause a kernel crash, so I have given up on this idea.

[  740.669811] Internal error: synchronous external abort: 0000000096000010 [#1] SMP
[  740.678053] Modules linked in: nf_conntrack_netlink xt_nat xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype nft_compat nf_tables overlay veth rpcsec_gss_krb5 ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables sctp ip6_udp_tunnel udp_tunnel iptable_filter bridge stp llc bonding tls zstd zram zsmalloc nfnetlink_log nfnetlink binfmt_misc rk805_pwrkey pwm_fan nvmem_rockchip_otp panfrost drm_shmem_helper rockchip_cpuinfo gpu_sched uio_pdrv_genirq uio vhost_net tun vhost vhost_iotlb tap fuse dm_mod ip_tables ipv6
[  740.735539] CPU: 3 PID: 2482 Comm: bash Not tainted 6.1.75-vendor-rk35xx #1
[  740.743147] Hardware name: Turing Machines RK1 (DT)
[  740.748481] pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  740.756091] pc : readl+0x4/0x20
[  740.759550] lr : rk_iommu_is_stall_active+0x60/0x68
[  740.764895] sp : ffff80000cbc3890
[  740.768528] x29: ffff80000cbc3890 x28: ffff000105e2bd00 x27: 0000000000000000
[  740.776335] x26: 00000000fffffff0 x25: ffff8000080d45b0 x24: 0000000000000000
[  740.784142] x23: ffff0001001260f4 x22: ffff000104552840 x21: 0000000000000001
[  740.791948] x20: ffff000103c81e80 x19: 0000000000000001 x18: 0000000000000000
[  740.799754] x17: 0000000000000000 x16: 0000000000000000 x15: 0000aaaaddde7f30
[  740.807544] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[  740.815300] x11: 0000000000000000 x10: 0000000000000000 x9 : ffff8000087d8554
[  740.823055] x8 : 0000000000000000 x7 : ffff80000a4e1520 x6 : 000000000002d0c4
[  740.830811] x5 : 0000000423882d79 x4 : 0000000000000000 x3 : ffff000105e2bd00
[  740.838566] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff80000b1f5004
[  740.846324] Call trace:
[  740.848990]  readl+0x4/0x20
[  740.852038]  rk_iommu_enable_stall+0x11c/0x138
[  740.856873]  rk_iommu_enable+0x48/0x230
[  740.861048]  rk_iommu_resume+0x50/0x64
[  740.865125]  pm_generic_runtime_resume+0x30/0x44
[  740.870149]  __rpm_callback+0x4c/0x12c
[  740.874227]  rpm_callback+0x78/0x7c
[  740.878022]  rpm_resume+0x3b0/0x44c
[  740.881818]  __pm_runtime_resume+0x74/0x9c
[  740.886272]  rpm_get_suppliers+0x50/0xc0
[  740.890542]  __rpm_callback+0xa4/0x12c
[  740.894619]  rpm_callback+0x78/0x7c
[  740.898413]  rpm_resume+0x3b0/0x44c
[  740.902208]  __pm_runtime_resume+0x74/0x9c
[  740.906662]  pm_runtime_get_sync.isra.0+0x14/0x20
[  740.911783]  device_release_driver_internal+0x4c/0x150
[  740.917379]  device_driver_detach+0x20/0x2c
[  740.921933]  unbind_store+0x60/0x90
[  740.925729]  drv_attr_store+0x30/0x44
[  740.929719]  sysfs_kf_write+0x44/0x58
[  740.933714]  kernfs_fop_write_iter+0xc0/0x178
[  740.938453]  vfs_write+0x154/0x1b8
[  740.942161]  ksys_write+0x78/0xe4
[  740.945772]  __arm64_sys_write+0x20/0x2c
[  740.950043]  invoke_syscall+0x8c/0x128
[  740.954124]  el0_svc_common.constprop.0+0xd8/0x128
[  740.959338]  do_el0_svc+0xac/0xbc
[  740.962947]  el0_svc+0x2c/0x54
[  740.966279]  el0t_64_sync_handler+0xac/0x13c
[  740.970931]  el0t_64_sync+0x19c/0x1a0
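
The call trace above passes through `unbind_store` and `rk_iommu_resume` with `Comm: bash`, which suggests the abort happened on the host while a driver was being unbound through sysfs, not inside the guest. For context, the sketch below prints the sysfs writes commonly used to hand a platform device over to `vfio-platform`. The device name is a hypothetical placeholder, and the exact commands that were run before the crash are not stated in this thread.

```shell
#!/bin/sh
# Print (do not run) the sysfs writes commonly used to rebind a platform
# device to vfio-platform. The device name passed in is a placeholder.
rebind_cmds() {
    dev="/sys/bus/platform/devices/$1"
    echo "echo vfio-platform > $dev/driver_override"
    echo "echo $1 > $dev/driver/unbind"
    echo "echo $1 > /sys/bus/platform/drivers_probe"
}

# DEV is an assumption; set it to the real platform device name.
rebind_cmds "${DEV:-example.device}"
```

On this vendor kernel the unbind step is where the trace ends, so the later steps may never be reached.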

fevenor commented 2 weeks ago

A workable alternative: use an LXC container. In the container's configuration file, add the NPU device renderD129 and its associated card1:

lxc.apparmor.profile: unconfined
lxc.cgroup.devices.allow: a
lxc.cap.drop: 
lxc.cgroup2.devices.allow: c 226:1 rwm
lxc.cgroup2.devices.allow: c 226:129 rwm
lxc.mount.entry: /dev/dri/card1 dev/dri/card1 none bind,optional,create=file
lxc.mount.entry: /dev/dri/renderD129 dev/dri/renderD129 none bind,optional,create=file
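
The `226:1` and `226:129` values above are the major and minor numbers of the two character devices, and they may differ on other boards. As a minimal sketch, the helper below (a hypothetical name, not part of LXC or Proxmox) converts the hexadecimal major and minor printed by GNU `stat -c '%t %T'` into the matching `lxc.cgroup2.devices.allow` line.

```shell
#!/bin/sh
# Convert hexadecimal major/minor numbers (as printed by `stat -c '%t %T'`)
# into an lxc.cgroup2.devices.allow line for a character device.
cgroup_allow_line() {
    printf 'lxc.cgroup2.devices.allow: c %d:%d rwm\n' "0x$1" "0x$2"
}

# On a real host: cgroup_allow_line $(stat -c '%t %T' /dev/dri/renderD129)
# Here the hex values corresponding to 226:129 are passed directly.
cgroup_allow_line e2 81
```

Running `ls -l /dev/dri` on the host shows the same numbers in decimal, which is a quick way to cross-check the result.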