ggerganov / ggml

Tensor library for machine learning
MIT License

[Documentation request] How to add a new ISA target? #28

Open freemin7 opened 1 year ago

freemin7 commented 1 year ago

I got a NEC Vector Engine co-processor, which uses the "ve" ISA and operates on 16384-bit vectors. This target is not yet supported by this library, but depending on the porting effort it might be an interesting one (although it only supports 32-bit floats). The Vector Engine can either execute programs natively or be used for offloading. How would one go about adding support for a new ISA in this library?

If documenting how to add a new ISA motivates people familiar with the library to port it to more targets, then I can give access to hardware running that ISA.

If I read the code correctly, instead of "GGML_F32Cx8_ADD" the implementation would use "GGML_F32Cx512_ADD", which is as fun as it sounds. The VE provides a VL register that can be used to change the vector length at run time, which is useful for the last loop iteration. FP16<->FP32 conversion is not supported by the Vector Engine architecture.
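As a rough illustration (not code from this repository), strip mining with a runtime vector length handles the last partial iteration like the sketch below. The inner loop is written as plain scalar C; on the VE it would become a single set of vector load/add/store instructions executed with the VL register set to `vl`, so the tail needs no separate code path. The strip length of 256 is an assumption for illustration.

```c
// Sketch of strip mining with a runtime vector length (plain C stand-in).
static void vec_add_f32_strips(const int n, float * z, const float * x, const float * y) {
    const int max_vl = 256;  // strip length; the real value depends on the VE vector configuration
    for (int i = 0; i < n; i += max_vl) {
        const int vl = (n - i) < max_vl ? (n - i) : max_vl;
        // On the VE this inner loop is one vector add executed with VL = vl.
        for (int j = 0; j < vl; ++j) {
            z[i + j] = x[i + j] + y[i + j];
        }
    }
}
```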

ggerganov commented 1 year ago

Basically, all you have to do is provide vectorised implementations for 4 functions:

As a starting point, you can see how @fitzsim added POWER9 support:

You don't necessarily need to fit your implementation into the GGML_SIMD macros. These exist just to keep things concise, instead of having thousands of lines of SIMD calls for the various architectures. Just try to implement it straight inside the function body, guarded by the respective #ifdef macro, similar to the first PR above.
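To make that shape concrete, here is a simplified sketch of what such a target-specific branch could look like inside ggml_vec_dot_f32. The `__ve__` guard is an assumption about what the VE toolchain defines, the signature is abbreviated, and the "vectorised" branch is written as strip-mined scalar C; only the overall structure (architecture-specific branch plus scalar fallback) reflects how ggml.c is organised.

```c
// Simplified sketch, not the actual ggml.c code. __ve__ is assumed to be
// the predefined macro of the VE compiler.
inline static void ggml_vec_dot_f32(const int n, float * restrict s,
                                    const float * restrict x, const float * restrict y) {
    float sumf = 0.0f;
#if defined(__ve__)
    // VE path: accumulate in strips of the hardware vector length and reduce.
    // Written as scalar C here; in a real port each inner loop becomes vector
    // instructions executed with the VL register set to vl.
    const int max_vl = 256;  // strip length; depends on the VE vector configuration
    for (int i = 0; i < n; i += max_vl) {
        const int vl = (n - i) < max_vl ? (n - i) : max_vl;
        for (int j = 0; j < vl; ++j) {
            sumf += x[i + j]*y[i + j];
        }
    }
#else
    // Portable scalar fallback, as in the existing code.
    for (int i = 0; i < n; ++i) {
        sumf += x[i]*y[i];
    }
#endif
    *s = sumf;
}
```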

Regarding FP16 <-> FP32: when no native instructions are available, the code currently falls back to this:

https://github.com/ggerganov/ggml/blob/master/src/ggml.c#L163-L237

This should work, right?
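For reference, a minimal, readable version of such a software conversion is sketched below. This is not the code at the link above (ggml.c uses a faster bit-trick formulation); it only shows that FP16 -> FP32 is plain integer bit manipulation, so the fallback works on any target, including the VE.

```c
#include <stdint.h>
#include <string.h>

// Minimal half -> float conversion: handles zero, subnormals, normals, inf/NaN.
static float fp16_to_fp32(uint16_t h) {
    const uint32_t sign = (uint32_t)(h & 0x8000) << 16;
    const uint32_t exp  = (h >> 10) & 0x1F;
    uint32_t mant = h & 0x03FF;
    uint32_t bits;

    if (exp == 0) {
        if (mant == 0) {
            bits = sign;                              // +/- zero
        } else {
            // Subnormal half: normalize it into a normal float.
            int e = -1;
            do { mant <<= 1; e++; } while ((mant & 0x0400) == 0);
            mant &= 0x03FF;
            bits = sign | ((uint32_t)(127 - 15 - e) << 23) | (mant << 13);
        }
    } else if (exp == 0x1F) {
        bits = sign | 0x7F800000 | (mant << 13);      // inf / NaN
    } else {
        bits = sign | ((exp + (127 - 15)) << 23) | (mant << 13); // normal number
    }

    float f;
    memcpy(&f, &bits, sizeof(f));
    return f;
}
```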