uncomplicate / neanderthal

Fast Clojure Matrix Library
http://neanderthal.uncomplicate.org
Eclipse Public License 1.0

Integer matrices on MKL? #99

Closed: chrysophylax closed this issue 3 years ago

chrysophylax commented 3 years ago

Hi,

I am looking at working with dense matrices containing exclusively integers using MKL. I have been unable to do it using Neanderthal. Am I missing something, or are they just not implemented? I keep hitting AbstractMethodError, which makes me think they're not linked in yet.

I am not sure how to go about adding it myself unfortunately.

blueberry commented 3 years ago

MKL and other low-level libraries do not provide operations for integer-based matrices and vectors. There is a reason: as soon as your matrix contains many entries, and as soon as these entries become large-ish, there is a large chance of integer overflow. Why are floating-point matrices not good for your use case?
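To make the overflow concern concrete, here is a minimal sketch in plain Python (not Neanderthal or MKL code). It assumes signed 8-bit entries multiplied by unsigned 8-bit entries and accumulated in a wrapping signed 32-bit register; the helper names `wrap_int32` and `dot_int32` are hypothetical and exist only for this illustration.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(x):
    """Reduce a Python int to the value a signed 32-bit register would hold."""
    x &= 0xFFFFFFFF
    return x - 2**32 if x >= 2**31 else x


def dot_int32(a, b):
    """Dot product whose accumulator wraps like a signed 32-bit integer."""
    acc = 0
    for x, y in zip(a, b):
        acc = wrap_int32(acc + x * y)
    return acc


# Worst case for one term: largest signed 8-bit value times largest unsigned 8-bit value.
largest_product = 127 * 255

# Longest vector of worst-case terms that still fits in a signed 32-bit accumulator.
safe_length = INT32_MAX // largest_product

# One element past the safe length is enough to overflow.
a = [127] * (safe_length + 1)
b = [255] * (safe_length + 1)

exact = sum(x * y for x, y in zip(a, b))  # Python ints never overflow
wrapped = dot_int32(a, b)                 # what a 32-bit accumulator would report

print("safe length:", safe_length)
print("exact sum:  ", exact)
print("wrapped sum:", wrapped)
```

Even with 1-byte inputs, the accumulator runs out of room after a few tens of thousands of worst-case terms, and wider entries get there much sooner.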

chrysophylax commented 3 years ago

Yes, we can overflow, but MKL does provide extensions to BLAS that operate on integer-based matrices, and they work well. Here's the link to the MKL developer reference: cblas_gemm_*, "Computes a matrix-matrix product with general integer matrices."

VNNI extensions make it pretty fast to do some calculations with mkl-int. [image]

All in all, you can get better throughput, lower memory usage, lower power usage in exchange for an acceptable loss of precision.
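As a rough illustration of that trade-off, the following pure-Python sketch quantizes two small float matrices to 8-bit integers, multiplies them with integer accumulation, and rescales the result. It only mirrors the signed 8-bit times unsigned 8-bit shape of the integer gemm family mentioned above; it does not call MKL, it omits the offset parameters the real routines take, and the scales, the sample matrices, and the helpers `quantize` and `matmul` are assumptions made for this example.

```python
def quantize(row, scale, lo, hi):
    """Map floats to integers in [lo, hi] using a fixed scale (assumed scheme)."""
    return [max(lo, min(hi, round(v / scale))) for v in row]


def matmul(a, b):
    """Plain row-by-column matrix product; works for ints and floats alike."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Sample inputs: A has entries in [-1, 1], B has entries in [0, 1].
A = [[0.5, -0.25], [1.0, 0.75]]
B = [[0.1, 0.9], [0.4, 0.2]]

a_scale = 1 / 127  # signed 8-bit range for A
b_scale = 1 / 255  # unsigned 8-bit range for B

A_q = [quantize(row, a_scale, -128, 127) for row in A]
B_q = [quantize(row, b_scale, 0, 255) for row in B]

# Integer product (Python ints stand in for a 32-bit accumulator here).
C_q = matmul(A_q, B_q)

# Rescale back to floats and compare with the float reference.
C_approx = [[v * a_scale * b_scale for v in row] for row in C_q]
C_exact = matmul(A, B)

max_err = max(
    abs(x - y)
    for row_x, row_y in zip(C_approx, C_exact)
    for x, y in zip(row_x, row_y)
)
print("largest absolute error:", max_err)
```

The integer matrices take one byte per entry instead of four or eight, and the price is the small rounding error reported at the end.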

It is not "super-necessary" I would say, I can work with fp but I'd like to see neanderthal support it.

Edit: Here's another slide from an Intel presentation showing the speed-up for inference when you switch to Int8. Since Neanderthal is a speed demon, it feels fitting to include this ability. [image]

blueberry commented 3 years ago

The gemm_* you linked to can compute some integer types, but then the problem is what to do with all the other operations, which would still be unimplemented. Please also note that current hardware is optimized for floats. Integers may be useful in some special applications, but then you'll need AVX-512 and short, 1- or 2-byte integers to see any benefit.

OTOH, Deep Diamond's inner product operation supports (some) integers, so that may be what you're looking for.

chrysophylax commented 3 years ago

OK, I guess that will have to do. I will take a look.

Thanks for a nice discussion and a nicer library :)