riscv-non-isa / rvv-intrinsic-doc

https://jira.riscv.org/browse/RVG-153
BSD 3-Clause "New" or "Revised" License

Add a section with examples #333

Closed rofirrim closed 2 months ago

rofirrim commented 2 months ago

The examples are non-normative. I've taken a subset of the examples in the examples directory of the repository.

This fixes #319

rofirrim commented 2 months ago

@nick-knight can you take a closer look at the matrix multiplication? I took the existing example, but it presumes a transposed matrix in B (I think), so I reimplemented it with a more naive approach that lets us show a strided access. Also, my understanding is that partial accumulations should use the tail-undisturbed policy (regardless of the fact that this is a vfmacc, which already has 3 input operands).
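For readers following along, here is a minimal sketch of the kind of naive kernel described above. It is not the code from this PR: the function name `matmul_naive`, the row-major layout, and the `f32`/`m1` type choices are assumptions made for illustration. B is read column-wise with a strided load, and the partial sums are accumulated with the tail-undisturbed `_tu` variant of `vfmacc`, so that a final iteration with a shorter `vl` does not clobber the accumulator elements past `vl`. A scalar fallback is included so the block also compiles on targets without the V extension.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__riscv_vector)
#include <riscv_vector.h>
#endif

/* Hypothetical helper: C (n x m) = A (n x p) * B (p x m).
 * All matrices are row-major and B is NOT transposed, so walking
 * down a column of B is a strided access (stride = m floats). */
void matmul_naive(const float *a, const float *b, float *c,
                  size_t n, size_t m, size_t p)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < m; ++j) {
            const float *a_row = a + i * p; /* unit-stride row of A */
            const float *b_col = b + j;     /* top of column j of B */
#if defined(__riscv_vector)
            size_t vlmax = __riscv_vsetvlmax_e32m1();
            /* Accumulator of partial sums, zeroed over all VLMAX lanes. */
            vfloat32m1_t acc = __riscv_vfmv_v_f_f32m1(0.0f, vlmax);
            for (size_t k = 0, vl; k < p; k += vl) {
                vl = __riscv_vsetvl_e32m1(p - k);
                vfloat32m1_t va = __riscv_vle32_v_f32m1(a_row + k, vl);
                /* Strided load: the byte stride is one full row of B. */
                vfloat32m1_t vb = __riscv_vlse32_v_f32m1(
                    b_col + k * m, (ptrdiff_t)(m * sizeof(float)), vl);
                /* Tail undisturbed: lanes >= vl keep their earlier
                 * partial sums when an iteration is shorter than VLMAX. */
                acc = __riscv_vfmacc_vv_f32m1_tu(acc, va, vb, vl);
            }
            /* Reduce all VLMAX lanes of the accumulator to one scalar. */
            vfloat32m1_t zero = __riscv_vfmv_v_f_f32m1(0.0f, vlmax);
            vfloat32m1_t sum =
                __riscv_vfredusum_vs_f32m1_f32m1(acc, zero, vlmax);
            c[i * m + j] = __riscv_vfmv_f_s_f32m1_f32(sum);
#else
            /* Scalar fallback with the same access pattern. */
            float sum = 0.0f;
            for (size_t k = 0; k < p; ++k)
                sum += a_row[k] * b_col[k * m];
            c[i * m + j] = sum;
#endif
        }
    }
}
```

With the default tail-agnostic policy, the lanes beyond `vl` in the last iteration could be overwritten, and the final reduction over `vlmax` lanes would then read unspecified values. This sketch makes no performance claim; the unordered reduction may also round differently from the scalar loop on general inputs.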

nick-knight commented 2 months ago

@rofirrim I've always strongly disliked the matmul example in this repo: it's not how a sane person would implement it.

camel-cdr commented 2 months ago

I agree with @nick-knight. I think we should stick with code that we can confidently say will portably perform at close to peak performance.

rofirrim commented 2 months ago

> I agree with @nick-knight. I think we should stick with code that we can confidently say will portably perform at close to peak performance.

We are using vector instructions on the assumption that they can speed up our applications. However, it would be risky to make claims about performance in the examples. Different implementations will have different performance characteristics, and we neither want to nor can cater to each one.

I think the examples should be just that, examples, and not necessarily a reference or a library of efficient functions. The matrix multiply example is intentionally qualified as "naive" in the examples for this reason.

rofirrim commented 2 months ago

Hi @kito-cheng thanks a lot for merging this for me.

I will update v1.0.x to the current main so as not to stall further work (such as bf16).