-
OpenBLAS DGEMM is highly efficient on Graviton3E with 1 thread, reaching over 90% of peak performance, but efficiency drops to about 73% when running DGEMM with 64 threads.
As is known, …
-
I built OpenBLAS on Graviton3E with `make USE_OPENMP=1 NUM_THREADS=256 TARGET=NEOVERSEV1`.
MKL was built on an Ice Lake machine.
I called OpenBLAS sgemm as
`cblas_sgemm(CblasRowMajor, CblasNoTr…
-
Hello.
The parameter `GEMM_PREFERED_SIZE` is set for recent Xeon and POWER CPUs, but no specific value is set for Arm CPUs in `param.h`. Wouldn't it be better to set the parameter, especially for CPUs with SIM…
-
@Wovchena This is related to #406 but goes deeper, hence I decided to make this a new issue.
## TL;DR
- For `greedy_causal_lm` inference on `arm`, large matmuls (e.g. `1x2x4096:4096x4096` in query…
-
## Expected behavior and actual behavior.
Given:
- multi-file dataset (e.g., Shapefile) with one of the files having an uppercased extension (e.g., `.PRJ`)
- e.g., "poly" Shapefile dataset from th…
-
## Expected behavior and actual behavior.
Given:
- CSV dataset with both `.csv` index and `.csvt` sidecar.
- For example, `testcsvt.csv` and `testcsvt.csvt` from GDAL autotest data: https://github…