Closed skykongkong8 closed 4 months ago
:octocat: cibot: Thank you for posting issue #2583. The person in charge will reply soon.
It might be better to reference the PR number for each finished item. I agree about Step 3; we can defer it until we have enough time.
Right, but I am tracking the detailed progress under Projects/Half-Precision GEMM. Furthermore, I will mention this issue in every related PR.
Anyone who wants to discuss this issue further can reopen it. Closing temporarily, but it will be updated from time to time.
1. Objective
The aim of this project is to implement an optimal half-precision GEMM for armv8.2 using NEON.
2. Roadmap
Step 1. Vanilla HGEMM
Step 2. Kernel-based HGEMM
Step 3. Advanced optimization
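To make the difference between the first two steps concrete, here is a rough sketch in plain Python rather than NEON code. It assumes "vanilla" means a naive triple loop and "kernel-based" means tiling the output into small blocks, each handled by a micro-kernel that keeps its accumulators local (as registers would be on NEON). The names `gemm_vanilla`, `micro_kernel`, `gemm_kernel_based` and the tile sizes `MR`, `NR` are hypothetical and not taken from the project.

```python
def gemm_vanilla(A, B, M, N, K):
    """Naive triple loop: C[m][n] = sum over k of A[m][k] * B[k][n]."""
    C = [[0.0] * N for _ in range(M)]
    for m in range(M):
        for n in range(N):
            acc = 0.0
            for k in range(K):
                acc += A[m][k] * B[k][n]
            C[m][n] = acc
    return C


def micro_kernel(A, B, C, m0, n0, K, MR, NR):
    """Compute one MR x NR tile of C with tile-local accumulators."""
    acc = [[0.0] * NR for _ in range(MR)]
    for k in range(K):
        for i in range(MR):
            a = A[m0 + i][k]  # one element of A is reused across NR outputs
            for j in range(NR):
                acc[i][j] += a * B[k][n0 + j]
    # Write the finished tile back to C once, after the k loop.
    for i in range(MR):
        for j in range(NR):
            C[m0 + i][n0 + j] = acc[i][j]


def gemm_kernel_based(A, B, M, N, K, MR=4, NR=4):
    """Tile C into MR x NR blocks; this sketch assumes M % MR == 0 and N % NR == 0."""
    assert M % MR == 0 and N % NR == 0
    C = [[0.0] * N for _ in range(M)]
    for m0 in range(0, M, MR):
        for n0 in range(0, N, NR):
            micro_kernel(A, B, C, m0, n0, K, MR, NR)
    return C
```

Both functions sum over `k` in the same order, so they should produce identical results. The kernel-based version only changes how the work is grouped, which is what lets a real NEON kernel reuse loaded values across a whole tile.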
3. Keep in mind that...
1. Concerns about precision
nvidia fp16 paper
hyperclova
gemmlowp
2. Justification of optimal GEMM implementation
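As a small illustration of the precision concern listed above, the sketch below emulates fp16 rounding with Python's `struct` format `"e"` (IEEE 754 binary16) and compares accumulating a dot product in fp16 against accumulating in a wider type and rounding once at the end. The helper names are hypothetical, and this simplifies what a fused multiply-add on real hardware does; it is only meant to show the failure mode.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def dot_fp16_accumulate(a, b):
    """Dot product where every product and partial sum is rounded to fp16."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_fp16(acc + to_fp16(x * y))
    return acc


def dot_wide_accumulate(a, b):
    """Dot product accumulated in a wider type, rounded to fp16 once at the end."""
    acc = 0.0
    for x, y in zip(a, b):
        acc += x * y
    return to_fp16(acc)


ones = [1.0] * 4096
print(dot_fp16_accumulate(ones, ones))  # 2048.0: the sum stalls once the fp16 spacing reaches 2
print(dot_wide_accumulate(ones, ones))  # 4096.0
```

With fp16 accumulation the running sum stops growing at 2048, because 2049 is not representable in fp16 and rounds back to 2048. This is one reason a long K dimension may call for wider or partial-sum accumulation.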