-
Both `UR_DEVICE_INFO_UUID` and `UR_DEVICE_INFO_PCI_ADDRESS` return [`char[]`](https://oneapi-src.github.io/unified-runtime/core/api.html#ur-device-info-t). However, for PCI address, it is a pretty-pri…
-
Every time I start the program, I get this error until a "Result accepted by the pool" message (eventually) pops up. I have tried everything I can think of. I've changed the timeout to 15 seconds up…
-
Many microbenchmarks are very sensitive to CPU migrations, especially multithreaded and memory-intensive ones, and even more so on NUMA systems. I am currently pinning the threads myself, but I think i…
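
For illustration, manual pinning of benchmark threads can look roughly like the sketch below. It assumes Linux and Python's `os.sched_setaffinity`, which is not available on every platform. The helper names `pin_current_thread` and `run_pinned` are hypothetical and do not come from any benchmark library.

```python
import os
import threading


def pin_current_thread(cpu):
    """Pin the calling thread to a single CPU and return its new affinity set."""
    # On Linux, pid 0 refers to the calling thread, so other threads keep
    # whatever affinity they already had.
    os.sched_setaffinity(0, {cpu})
    return os.sched_getaffinity(0)


def run_pinned(worker, cpus):
    """Run worker() once per CPU in cpus, each call on a thread pinned to that CPU.

    Returns a dict mapping cpu -> (affinity set seen by the thread, worker result).
    """
    results = {}

    def target(cpu):
        affinity = pin_current_thread(cpu)
        results[cpu] = (affinity, worker())

    threads = [threading.Thread(target=target, args=(cpu,)) for cpu in cpus]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

A caller would pick the CPUs from `os.sched_getaffinity(0)` so that only CPUs the process is allowed to use are requested. On NUMA systems, which CPUs share a node is not visible from this API and would have to come from another source.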
-
On many mobile devices, the L1 caches reside on specific CPUs, and L2 caches reside on the set of CPUs that share a clock.
Keeping the UI and raster threads on fast CPUs will help to maximize L2 cache hit…
-
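*********************************************
Friendly reminder: according to informal community statistics, asking questions with the template speeds up replies and problem resolution.
*********************************************
## Environment
- 【FastDeploy version】: fastdeploy-gpu-1.0.0
- 【Build command】
- 【System platform…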
-
I'm running NCCL on two GCP `a3-megagpu-8g` instances with 8 NICs attached, but NCCL is only using one of them. Do you know what I might be doing wrong / how I can troubleshoot this?
NCCL's topo fil…
-
### Describe the issue:
With the default pip NumPy install (libopenblas), a simple `np.convolve` test runs 45x slower on Win11 than the identical code on the same machine in a WSL2 Debian ins…