huggingface / candle

Minimalist ML framework for Rust
Apache License 2.0
15.37k stars 905 forks source link

candle-core::quantized::avx contains AVX2 instructions which of course crashes on AVX machines #1818

Open niklasha opened 7 months ago

niklasha commented 7 months ago

Maybe CPU probing should detect whether AVX2 is available as a start. If it is determined that AVX should be supported perhaps let the probe set an enum to what instruction set is available, so that the file can provide both (or more in the future?) versions? I might look into making AVX-clean functions... but I don't promise, I just happened to try mistral on an Xeon E5 2650, and it crashed with SIGILL on mmm256_cvtepu8_epi16. If I get bored, perhaps :-)

LaurentMazare commented 7 months ago

It's not only quantized, the rest of the crate also uses avx2 instruction. The idea is that by default we compile with the native target arch so it will use whatever instructions are available on the box where you compile the code. That said, you can change these flags in your crate that use candle so as to tweak it to your liking. Re avx, I'm not 100% sure if we would want to have them. It will bring a bit of performance for a small portion of users but it will also add some maintenance burden and avx code is quite finicky so that the less we have the better.

niklasha commented 6 months ago

Understood. Just for the record I did compile on that box, which do have avx, just not avx2, so you cannot trust the avx feature to mean it understands avx2. But I am OK with the current behaviour. However, I got a little bit curious how well mistral would fare using just SSE2 (which is what AVX machines without AVX2 has to use for integer SIMD, AVX seems to be for FP, mostly), so I started on writing SSE2 variants of the public operations of quantized::avx. It is always fun to learn sth new as well. never did SIMD programming before :-). vec_dot_q4_0_q8_0 is done, vec_dot_q8_0_q8_0 mostly done. We'll see how much time I find to finish it. But it has been fun and educational so far, so maybe, maybe...