nnstreamer / nntrainer

NNtrainer is a software framework for training neural network models on devices.
Apache License 2.0

[ hgemm ] Use optimized hgemm if possible #2531

Closed skykongkong8 closed 3 months ago

skykongkong8 commented 3 months ago

Through this PR, the optimized version of hgemm is used when all of the following conditions hold:

  1. noTrans hgemm (neither input matrix is transposed)
  2. M, N, and K are divisible by 4 or 8
  3. Row-major GEMM
  4. alpha = 1.0, beta = 0.0 (will be patched soon)

Otherwise, the previous version is used as a fallback.

  • Note that a few optimization strategies are still left to apply for an optimal hgemm.
  • This is just a WIP version, but it is still better than before.
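The conditions listed above amount to a dispatch check in front of the GEMM call. The following is only a minimal sketch of such a check, not the code from this PR: `select_tile` and its parameters are hypothetical names, and the reading that M, N, and K must all share the same divisor (8 preferred, then 4) is an assumption about what "divisible by 4 or 8" means here.

```cpp
#include <cassert>

// Hypothetical helper, not part of nntrainer's API.
// Returns the tile width (8 or 4) when the optimized path would apply
// under the conditions stated in the PR description, or 0 when the
// caller should use the previous implementation as a fallback.
unsigned int select_tile(bool transA, bool transB, bool row_major,
                         unsigned int M, unsigned int N, unsigned int K,
                         float alpha, float beta) {
  assert(M > 0 && N > 0 && K > 0); // dimensions must be non-empty

  // Conditions 1 and 3: noTrans and row-major only.
  if (transA || transB || !row_major)
    return 0;

  // Condition 4: only alpha = 1.0 and beta = 0.0 for now.
  if (alpha != 1.0f || beta != 0.0f)
    return 0;

  // Condition 2 (assumed reading): all of M, N, K divisible by 8, else by 4.
  if (M % 8 == 0 && N % 8 == 0 && K % 8 == 0)
    return 8;
  if (M % 4 == 0 && N % 4 == 0 && K % 4 == 0)
    return 4;

  return 0; // anything else falls back
}
```

For example, `select_tile(false, false, true, 16, 16, 16, 1.0f, 0.0f)` returns 8, while a call with `M = 6` or with `alpha = 0.5f` returns 0 and would take the fallback path.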

taos-ci commented 3 months ago

:memo: TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2531. Please follow the 1commit/1PR (one commit per PR) policy to get comments quickly from reviewers. Your PR must pass all verification processes of cibot before the review process by reviewers can start. If you are a new member of this project, please read the manuals in the documentation folder and on the wiki page. To monitor the progress of your PR in more detail, visit http://ci.nnstreamer.ai/.

taos-ci commented 3 months ago

:octocat: cibot: @skykongkong8, nntrainer/tensor/hgemm/hgemm_kernel_8x16.h does not include Doxygen tags such as @file @brief @author @bug. You must include the Doxygen tags in the source code. Please refer to a Doxygen manual at http://github.com/nnstreamer/TAOS-CI/blob/main/ci/doc/doxygen-documentation.md