Status: Open. gemiduck opened this issue 1 year ago.
I tried getting BLIS to work previously, but it wouldn't compile correctly on my system for some reason. Anecdotally, from what I've read, MKL should be faster than OpenBLAS, but a lot of it appears to be proprietary. You could certainly try swapping the BLAS library for BLIS or MKL. Assuming your libraries are compiled and installed correctly, it should only take a few changed lines, since the function signature for cblas_sgemm should be similar.
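To illustrate why the swap should be small, here is a rough Python sketch (not koboldcpp's actual code) that calls cblas_sgemm through ctypes from whichever CBLAS library it can find. The search order in `CANDIDATES` and the helper names `load_cblas`, `sgemm`, and `sgemm_reference` are made up for this example. It also assumes a library built with the usual 32-bit integer (LP64) interface that exports the CBLAS symbols.

```python
import ctypes
import ctypes.util

# Enum values from the standard cblas.h header.
CBLAS_ROW_MAJOR = 101
CBLAS_NO_TRANS = 111

# Hypothetical search order; adjust to whatever is installed on your system.
CANDIDATES = ("mkl_rt", "blis", "openblas")


def load_cblas(candidates=CANDIDATES):
    """Return (name, handle) for the first library exporting cblas_sgemm, else None."""
    float_p = ctypes.POINTER(ctypes.c_float)
    for name in candidates:
        path = ctypes.util.find_library(name)
        if path is None:
            continue
        try:
            lib = ctypes.CDLL(path)
            fn = lib.cblas_sgemm
        except (OSError, AttributeError):
            continue  # not loadable, or built without the CBLAS interface
        fn.restype = None
        fn.argtypes = [
            ctypes.c_int, ctypes.c_int, ctypes.c_int,   # layout, transA, transB
            ctypes.c_int, ctypes.c_int, ctypes.c_int,   # M, N, K
            ctypes.c_float, float_p, ctypes.c_int,      # alpha, A, lda
            float_p, ctypes.c_int,                      # B, ldb
            ctypes.c_float, float_p, ctypes.c_int,      # beta, C, ldc
        ]
        return name, lib
    return None


def sgemm(lib, a, b, m, n, k):
    """C = A (m x k) * B (k x n); inputs and output are row-major flat lists."""
    a_buf = (ctypes.c_float * (m * k))(*a)
    b_buf = (ctypes.c_float * (k * n))(*b)
    c_buf = (ctypes.c_float * (m * n))()
    # The call is identical no matter which library was loaded.
    lib.cblas_sgemm(CBLAS_ROW_MAJOR, CBLAS_NO_TRANS, CBLAS_NO_TRANS,
                    m, n, k, 1.0, a_buf, k, b_buf, n, 0.0, c_buf, n)
    return list(c_buf)


def sgemm_reference(a, b, m, n, k):
    """Plain-Python matrix product, used only to sanity-check the result."""
    out = []
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += a[i * k + p] * b[p * n + j]
            out.append(acc)
    return out


if __name__ == "__main__":
    found = load_cblas()
    if found is None:
        print("no CBLAS library found; nothing to compare")
    else:
        name, lib = found
        print(name, sgemm(lib, [1, 2, 3, 4], [5, 6, 7, 8], 2, 2, 2))
```

Only the library that gets loaded changes; the call itself stays the same, which is the point made above. In a C build the equivalent change would be in the include and link lines rather than in the call site.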
Hi, do you have the lines needed to compile with MKL? I'm testing on my i5-11400F server to see whether I can get it running faster than my R5 3600 server.
If you mean Intel MKL, it's open source now under the Apache license (now called oneMKL). It's at https://github.com/oneapi-src/oneMKL and I think it might be a big benefit for Intel Arc GPUs while also being compatible with other systems (the library supports both CUDA and ROCm). I'm not a dev, so I have no idea whether that would actually be useful.
Hi, is it possible to use either BLIS or MKL instead of OpenBLAS? I'm using an AMD EPYC 7543, and performance without OpenBLAS is much faster, so I'm wondering whether either of the two would help prompt eval time.
python3 koboldcpp.py --threads 4 --smartcontext ../model_13b.bin
Without OpenBLAS:
With BLAS:
I'm wondering whether this is a multi-threading issue, as the timing I get with one BLAS thread is comparable to the run without OpenBLAS, but still slightly higher.
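One way to test the multi-threading theory is to pin the BLAS thread count from outside the program. The sketch below builds the same command as above with `OPENBLAS_NUM_THREADS` set. OpenBLAS reads that variable at load time, but whether this particular build honors it (rather than setting the thread count itself) is an assumption. The helper names `blas_thread_env` and `launch_command` are invented for this example.

```python
import os
import sys


def blas_thread_env(num_threads, base_env=None):
    """Copy the environment and pin the OpenBLAS thread count in the copy."""
    env = dict(os.environ if base_env is None else base_env)
    env["OPENBLAS_NUM_THREADS"] = str(num_threads)
    return env


def launch_command(threads, model_path):
    """Build the koboldcpp command line used in this thread."""
    return [sys.executable, "koboldcpp.py",
            "--threads", str(threads),
            "--smartcontext", model_path]


if __name__ == "__main__":
    import subprocess

    # Compare prompt eval time with 1 BLAS thread against the default run.
    subprocess.run(launch_command(4, "../model_13b.bin"),
                   env=blas_thread_env(1))
```

If the one-thread run stays close to the no-BLAS timing and the default run is much slower, that would point towards thread oversubscription rather than the BLAS library itself.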