Open freemin7 opened 1 year ago
Basically, all you have to do is provide vectorised implementations of 4 functions:
ggml_vec_dot_f32()
(optional) ggml_vec_dot_f16()
ggml_vec_mad_f32()
(optional) ggml_vec_mad_f16()
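For orientation, here is a rough scalar sketch of what the two f32 functions compute: a dot product and a multiply-add (y += x * v). The names `ref_vec_dot_f32` and `ref_vec_mad_f32` and their parameter order are hypothetical, chosen for illustration. Check the actual signatures in ggml.c before porting.

```c
/* Hypothetical scalar references, not the library's code.
   A vectorised port should produce the same results up to rounding. */

/* s = sum over i of x[i] * y[i] */
static void ref_vec_dot_f32(int n, float *s, const float *x, const float *y) {
    double sum = 0.0; /* accumulate in double to limit rounding error */
    for (int i = 0; i < n; ++i) {
        sum += (double)x[i] * (double)y[i];
    }
    *s = (float)sum;
}

/* y[i] += x[i] * v for every i (multiply-add) */
static void ref_vec_mad_f32(int n, float *y, const float *x, float v) {
    for (int i = 0; i < n; ++i) {
        y[i] += x[i] * v;
    }
}
```

A scalar reference like this can also serve as a correctness check when comparing against a new SIMD path.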
(optional) As a starting point, you can see how @fitzsim added POWER9 support:
You don't necessarily need to fit your implementation into the GGML_SIMD macros.
These are just to make things more concise instead of having thousands of lines of code of SIMD calls for various architectures.
Just try to implement it straight inside the function body using the respective #ifdef macro, similar to the first PR above.
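A minimal sketch of that "#ifdef inside the function body" approach could look like the following. `HYPOTHETICAL_ISA_SIMD` is a made-up macro and `vec_dot_f32_sketch` is a made-up name. A real port would test whatever macro the target compiler predefines and would call that target's intrinsics where the stand-in loop is.

```c
/* Sketch only: HYPOTHETICAL_ISA_SIMD and vec_dot_f32_sketch are
   illustrative names, not part of ggml. */
static void vec_dot_f32_sketch(int n, float *s, const float *x, const float *y) {
    float sum = 0.0f;

#if defined(HYPOTHETICAL_ISA_SIMD)
    /* ISA-specific path. Real intrinsics would replace this stand-in,
       which keeps LANES independent partial sums like a vector register. */
    enum { LANES = 4 };
    float acc[LANES] = {0.0f, 0.0f, 0.0f, 0.0f};
    const int nv = n - (n % LANES); /* largest multiple of LANES <= n */

    for (int i = 0; i < nv; i += LANES) {
        for (int l = 0; l < LANES; ++l) {
            acc[l] += x[i + l] * y[i + l];
        }
    }
    for (int l = 0; l < LANES; ++l) {
        sum += acc[l]; /* horizontal reduction of the lanes */
    }
    for (int i = nv; i < n; ++i) {
        sum += x[i] * y[i]; /* leftover elements */
    }
#else
    /* Portable scalar fallback */
    for (int i = 0; i < n; ++i) {
        sum += x[i] * y[i];
    }
#endif

    *s = sum;
}
```

Keeping the scalar fallback in the `#else` branch means the function still builds on targets where the macro is not defined.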
Regarding FP16 <-> FP32: when no native instructions are available, the code currently falls back to this:
https://github.com/ggerganov/ggml/blob/master/src/ggml.c#L163-L237
This should work, right?
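To illustrate what a software FP16 <-> FP32 fallback involves, here is a generic bit-manipulation sketch. It is not the code at the link above. The function names are hypothetical, and it assumes IEEE 754 binary16 and binary32 layouts with round-to-nearest-even when narrowing.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen an IEEE half (raw bits) to float. */
static float fp16_to_fp32_sketch(uint16_t h) {
    const uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t mant = h & 0x3FFu;
    uint32_t bits;

    if (exp == 0x1Fu) {                 /* Inf or NaN */
        bits = sign | 0x7F800000u | (mant << 13);
    } else if (exp != 0) {              /* normal: rebias 15 -> 127 */
        bits = sign | ((exp + 112u) << 23) | (mant << 13);
    } else if (mant != 0) {             /* subnormal: normalise first */
        exp = 113u;
        while ((mant & 0x400u) == 0) { mant <<= 1; --exp; }
        bits = sign | (exp << 23) | ((mant & 0x3FFu) << 13);
    } else {                            /* signed zero */
        bits = sign;
    }
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical helper: narrow float to IEEE half, round-to-nearest-even. */
static uint16_t fp32_to_fp16_sketch(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    const uint32_t sign = (bits >> 16) & 0x8000u;
    const uint32_t mag  = bits & 0x7FFFFFFFu;
    const uint32_t e    = mag >> 23;    /* biased float exponent */
    uint32_t h;

    if (mag > 0x7F800000u) {
        h = 0x7E00u;                    /* NaN -> quiet NaN */
    } else if (e >= 143u) {
        h = 0x7C00u;                    /* >= 2^16 or Inf -> Inf */
    } else if (e >= 113u) {             /* fits the normal half range */
        const uint32_t mant = mag & 0x7FFFFFu;
        const uint32_t rem  = mant & 0x1FFFu;
        h = ((e - 112u) << 10) | (mant >> 13);
        if (rem > 0x1000u || (rem == 0x1000u && (h & 1u))) ++h;
    } else if (e >= 102u) {             /* becomes a subnormal half */
        const uint32_t mant  = (mag & 0x7FFFFFu) | 0x800000u;
        const uint32_t shift = 126u - e;
        const uint32_t rem   = mant & ((1u << shift) - 1u);
        const uint32_t half  = 1u << (shift - 1u);
        h = mant >> shift;
        if (rem > half || (rem == half && (h & 1u))) ++h;
    } else {
        h = 0;                          /* too small: rounds to zero */
    }
    return (uint16_t)(sign | h);
}
```

Since this uses only integer operations and `memcpy`, it should not depend on any FP16 hardware support, which is the situation described for targets without native conversion instructions.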
I have a NEC VectorEngine co-processor, which uses the "ve" ISA and operates on 16384-bit vectors. This target is not yet supported by this library; however, depending on the porting effort, it might be an interesting library (although it only supports 32-bit floats). The vector engine can either execute programs natively or be used for offloading. How would one go about adding support for a new ISA in this library?
If the ISA docs motivate people familiar with the library to port it to this target, more than the idea of documenting how to add a new ISA does, I can give access to hardware running that ISA.
If I read the code correctly, instead of "GGML_F32Cx8_ADD" the implementation would use "GGML_F32Cx512_ADD", which is as fun as it sounds. ve provides a VL register which can be used to change the vector length at run time, which can be useful for the last loop iteration. FP16 <-> FP32 conversion is not supported by the VectorEngine architecture.
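The run-time vector length idea can be sketched in plain C as a strip-mined loop, where the last iteration simply runs with a shorter length instead of needing a separate scalar tail. This is only a model: `SKETCH_MAX_VL` and `vec_mad_f32_vl_sketch` are hypothetical names, 512 lanes is taken from the F32Cx512 remark above, and the inner loop stands in for a single vector instruction executed with VL set to `vl`.

```c
/* Sketch only: models a vector-length register in portable C.
   Names and the lane count are assumptions for illustration. */
enum { SKETCH_MAX_VL = 512 };

/* y[i] += x[i] * v, processed in strips of at most SKETCH_MAX_VL lanes */
static void vec_mad_f32_vl_sketch(int n, float *y, const float *x, float v) {
    for (int i = 0; i < n; i += SKETCH_MAX_VL) {
        /* Value that would be written to the VL register: a full strip,
           or the remaining element count on the last iteration. */
        const int remaining = n - i;
        const int vl = remaining < SKETCH_MAX_VL ? remaining : SKETCH_MAX_VL;

        /* Stand-in for one vector multiply-add over vl lanes */
        for (int l = 0; l < vl; ++l) {
            y[i + l] += x[i + l] * v;
        }
    }
}
```

With n = 1000, for example, this would run one strip of 512 lanes followed by one strip of 488 lanes.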