-
### 🚀 The feature, motivation and pitch
libtorch 1.12.0 is currently released against MKL version 2020.0.
Please make torch compatible with the latest version of MKL, as it is much faster. I note that the `cblas…
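For reference, the MKL version a given PyTorch/libtorch build links against can be checked from Python; a minimal sketch (the exact strings in the output vary by build):

```
import torch

# Print the build configuration; it includes the MKL version the binary was
# compiled against (e.g. "Intel(R) Math Kernel Library Version 2020.0 ...").
print(torch.__config__.show())

# Confirm that MKL support is compiled in at all.
print("MKL available:", torch.backends.mkl.is_available())
```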
-
# Summary
Request to implement lapack::gesv for the lapack_mklcpu, lapack_mklgpu, and lapack_cusolver libraries.
# Problem statement
I use lapack::gesv to solve a linear matrix equation, or system of l…
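For context, `gesv` is the LAPACK driver that LU-factorizes A with partial pivoting and solves A·X = B. Below is a small sketch of the equivalent operation through SciPy's existing LAPACK wrappers, purely to illustrate the semantics being requested; this is the standard LAPACK `dgesv`, not the oneMKL interface asked for here:

```
import numpy as np
from scipy.linalg.lapack import dgesv

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 4))
b = rng.standard_normal((4, 2))

# dgesv returns the LU factors, the pivot indices, the solution X, and an info flag.
lu, piv, x, info = dgesv(a, b)
assert info == 0
assert np.allclose(a @ x, b)
```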
-
# Summary
I'm trying to use `gemm` on PVC, but it keeps throwing an exception. Please let me know where I'm going wrong.
I am attempting to use `gemm` and execute it on a 4-OAM PVC system on ORTCE. …
-
# Description
We recently had a customer request to provide APIs in Intel(R) oneMKL that report the version of the oneAPI spec the product is compliant with.
During this investigation, I f…
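For comparison, the product build/version is already queryable at runtime, for example through the mkl-service Python bindings sketched below (assuming the `mkl` package is installed); the request here is for an analogous query of the oneAPI spec version rather than the product version:

```
import mkl  # mkl-service bindings

# Reports the version string of the MKL runtime that is loaded,
# e.g. "Intel(R) oneAPI Math Kernel Library Version 202x ...".
print(mkl.get_version_string())
```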
-
Hello,
I created a new virtual env and installed all the packages; however, when I run the code, it does not run on the GPU (XPU).
Here is the log:
ipex-llm\python\llm\dev\benchmark\all-in-one>python r…
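Before digging into the benchmark script itself, it may help to confirm that the XPU device is visible to the PyTorch/IPEX stack in that environment. A minimal check, assuming an XPU build of intel_extension_for_pytorch is installed in the same virtual env:

```
import torch
import intel_extension_for_pytorch as ipex  # registers the XPU backend

print("XPU available:", torch.xpu.is_available())
if torch.xpu.is_available():
    print("Device count:", torch.xpu.device_count())
    print("Device 0:", torch.xpu.get_device_name(0))
```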
-
Unfortunately, building faiss has been a major challenge.
There are several problems:
- CUDA 12.2.2 is not used
- setup.py does not include the Python callbacks shared object in the generated wheel
…
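As a quick post-install sanity check (not a fix for the packaging issues above), the following sketch verifies that a locally built wheel actually exposes the GPU code paths; `StandardGpuResources` and `get_num_gpus` are only present in GPU-enabled builds:

```
import faiss

# A GPU-enabled build exposes the GPU classes; a CPU-only wheel does not.
if hasattr(faiss, "StandardGpuResources"):
    print("GPUs visible to faiss:", faiss.get_num_gpus())
else:
    print("CPU-only build: GPU symbols are missing")
```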
-
Transpose operations that were earlier applied to all MatMul/Conv operations have been fused with those ops and covered by the BLAS implementation. Maybe it is possible to convert other common … in a similar way
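For illustration, the reason the Transpose can be absorbed is that BLAS GEMM already takes transpose flags, so an explicit Transpose node in front of a MatMul can be folded into the GEMM call instead of being executed as a separate op. A small sketch via SciPy's BLAS wrappers:

```
import numpy as np
from scipy.linalg.blas import dgemm

rng = np.random.default_rng(0)
a = rng.standard_normal((3, 5))   # used as a.T below
b = rng.standard_normal((3, 4))

# Explicit transpose followed by matmul ...
explicit = a.T @ b
# ... versus folding the transpose into the GEMM call via trans_a.
fused = dgemm(1.0, a, b, trans_a=1)

assert np.allclose(explicit, fused)
```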
-
It seems like iterative refinement is the default.
https://github.com/JuliaSparse/Pardiso.jl/blob/81c8391e86cebaf71e936363664d878b22b1fd9a/src/Pardiso.jl#L242
Discussion on iterative refinement fo…
-
dpctl can detect cuda devices:
```
In [2]: import dpctl
In [3]: dpctl.get_devices()
Out[3]: []
```
but `dpctl.program` can't create kernels, [only `level_zero` and `opencl` are supported](…
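To make the backend situation explicit, the devices dpctl enumerates can be grouped by backend. A short sketch, assuming a CUDA-enabled SYCL runtime and that `get_devices` accepts a backend filter as in recent dpctl releases:

```
import dpctl

# List which devices each backend exposes; the original issue is that only
# level_zero and opencl devices can be used with dpctl.program.
for backend in ("cuda", "level_zero", "opencl"):
    for d in dpctl.get_devices(backend=backend):
        print(backend, "->", d.name)
```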
-
Hi all!
Is there a straightforward way to wrap the Threadripper/EPYC-optimized AMD FFTW library with pyFFTW? https://developer.amd.com/amd-aocl/fftw/
We are having big threading problems with MK…
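A hedged sketch of what the Python side could look like: since AOCL-FFTW advertises the standard FFTW3 API, the assumption here is that a pyFFTW build linked against the AMD libraries keeps the usual pyFFTW interface, so the threading knobs below stay the same regardless of which FFTW3-compatible library is underneath:

```
import numpy as np
import pyfftw

# Plan a threaded 2D transform; the threads argument maps to FFTW's own
# threading, independent of the FFTW3-compatible library pyFFTW links against.
a = pyfftw.empty_aligned((2048, 2048), dtype="complex128")
a[:] = np.random.standard_normal(a.shape) + 1j * np.random.standard_normal(a.shape)

fft = pyfftw.builders.fft2(a, threads=8, planner_effort="FFTW_MEASURE")
result = fft()
```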