nnstreamer / nntrainer

NNtrainer is a Software Framework for Training Neural Network Models on Devices.
Apache License 2.0

[ HGEMM ] Half-Precision GEMM Roadmap #2583

Closed skykongkong8 closed 4 months ago

skykongkong8 commented 4 months ago

1. Objective

The aim of this project is to implement an optimal half-precision GEMM for Armv8.2 using NEON.

2. Roadmap

Consider a GEMM of the form:

A( M , K ) * B( K , N ) = C( M , N ) 

Step1. Vanilla HGEMM
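
For reference, a minimal sketch of what a vanilla GEMM for the A( M , K ) * B( K , N ) = C( M , N ) case could look like. This is not the code from the related PRs: the name `gemm_vanilla`, the row-major layout, and the use of a template parameter `T` are all assumptions made for illustration. On an Armv8.2 target `T` would presumably be a half-precision type such as `float16_t`; the sketch stays type-generic so it can also be run with `float` on any host.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Vanilla GEMM sketch: C(M, N) = A(M, K) * B(K, N), all matrices row-major.
// T is a placeholder element type (assumption): float here, a half-precision
// type on a target that supports it.
template <typename T>
std::vector<T> gemm_vanilla(const std::vector<T> &A, const std::vector<T> &B,
                            std::size_t M, std::size_t N, std::size_t K) {
  assert(A.size() == M * K && B.size() == K * N);
  std::vector<T> C(M * N, T(0));
  for (std::size_t m = 0; m < M; ++m) {
    for (std::size_t k = 0; k < K; ++k) {
      // Hoist A(m, k) so the inner loop walks B and C contiguously.
      const T a = A[m * K + k];
      for (std::size_t n = 0; n < N; ++n) {
        C[m * N + n] += a * B[k * N + n];
      }
    }
  }
  return C;
}
```

The m-k-n loop order is chosen only because it keeps the innermost accesses contiguous in a row-major layout; any of the other loop orders computes the same result.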

Step2. Kernel-based HGEMM
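
To make "kernel-based" concrete, here is a rough sketch of the usual structure: the output is split into small tiles, and a micro-kernel computes one tile at a time while keeping its accumulators local. The 4x4 tile size, the names `micro_kernel` and `gemm_tiled`, and the scalar inner loops are all assumptions for illustration. An actual NEON kernel would presumably hold the accumulators in vector registers and use fp16 fused multiply-add intrinsics instead of the scalar loops shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile sizes; a real kernel would pick them from register count.
constexpr std::size_t MR = 4;
constexpr std::size_t NR = 4;

// Micro-kernel sketch: computes an (mr x nr) tile of C, with mr <= MR and
// nr <= NR, from a row panel of A and a column panel of B (row-major,
// leading dimensions lda / ldb / ldc).
template <typename T>
void micro_kernel(const T *A, const T *B, T *C, std::size_t mr, std::size_t nr,
                  std::size_t K, std::size_t lda, std::size_t ldb,
                  std::size_t ldc) {
  T acc[MR][NR] = {}; // local accumulators, stand-in for vector registers
  for (std::size_t k = 0; k < K; ++k) {
    for (std::size_t i = 0; i < mr; ++i) {
      const T a = A[i * lda + k];
      for (std::size_t j = 0; j < nr; ++j) {
        acc[i][j] += a * B[k * ldb + j];
      }
    }
  }
  // Write the finished tile back to C once.
  for (std::size_t i = 0; i < mr; ++i) {
    for (std::size_t j = 0; j < nr; ++j) {
      C[i * ldc + j] = acc[i][j];
    }
  }
}

// Driver: walks C tile by tile; edge tiles are clipped with std::min.
template <typename T>
std::vector<T> gemm_tiled(const std::vector<T> &A, const std::vector<T> &B,
                          std::size_t M, std::size_t N, std::size_t K) {
  assert(A.size() == M * K && B.size() == K * N);
  std::vector<T> C(M * N, T(0));
  for (std::size_t m0 = 0; m0 < M; m0 += MR) {
    for (std::size_t n0 = 0; n0 < N; n0 += NR) {
      micro_kernel(A.data() + m0 * K, B.data() + n0, C.data() + m0 * N + n0,
                   std::min(MR, M - m0), std::min(NR, N - n0), K, K, N, N);
    }
  }
  return C;
}
```

The point of the structure is that each element of the C tile is loaded and stored once per tile rather than once per k step, which is where most of the gain over the vanilla loop is expected to come from.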

Step3. Advanced optimization

Not strictly necessary, but we might need these (?)

  • [ ] fused HGEMM with activation
  • [ ] asm-based kernel

3. Keep in mind that...

1. Concerns about precision

2. Justification of optimal GEMM implementation
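
On the precision concern (item 1 above), a small sketch of why accumulating in half precision is risky when K is large. It does not use a real fp16 type: `round_to_half_precision` is a hypothetical helper that only emulates the 11-bit significand of IEEE half precision by rounding a `float`, and it ignores the fp16 exponent range, subnormals, and inf/NaN. It also assumes the default round-to-nearest-even floating-point mode.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Emulates fp16 significand precision (11 bits) on a float.
// Assumption: exponent range, subnormals, inf and NaN are NOT modeled.
inline float round_to_half_precision(float x) {
  int e = 0;
  const float m = std::frexp(x, &e); // x = m * 2^e, |m| in [0.5, 1)
  // Keep 11 significand bits; nearbyint rounds to nearest-even by default.
  return std::ldexp(std::nearbyint(m * 2048.0f), e - 11);
}

// Dot product where every product and every partial sum is rounded to
// half precision, as if the accumulator itself were fp16.
inline float dot_half_accum(const std::vector<float> &a,
                            const std::vector<float> &b) {
  float acc = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) {
    const float prod = round_to_half_precision(a[i] * b[i]);
    acc = round_to_half_precision(acc + prod);
  }
  return acc;
}

// Dot product accumulated in float and rounded to half precision once.
inline float dot_float_accum(const std::vector<float> &a,
                             const std::vector<float> &b) {
  float acc = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) {
    acc += a[i] * b[i];
  }
  return round_to_half_precision(acc);
}
```

With 4096 elements all equal to 1, the half-precision accumulator stalls at 2048, because 2049 is not representable with an 11-bit significand and rounds back to 2048, while the float accumulator reaches 4096. This is the kind of gap that would need to be weighed against the speed benefit of staying in fp16 throughout.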

taos-ci commented 4 months ago

:octocat: cibot: Thank you for posting issue #2583. The person in charge will reply soon.

jijoongmoon commented 4 months ago

It might be better to refer to the PR number for each finished item. I agree about Step 3. We can delay it when we have enough time.

skykongkong8 commented 4 months ago

> It might be better to refer to the PR number for each finished item. I agree about Step 3. We can delay it when we have enough time.

Right, but for detailed progress updates, I am tracking them under Projects/Half-Precision GEMM. Furthermore, I will definitely mention this issue in every related PR.

skykongkong8 commented 4 months ago

Anyone who wants to discuss this issue further can reopen it. Closing it temporarily, but it will be updated from time to time.