nnstreamer / nntrainer

NNtrainer is a software framework for training neural network models on devices.
Apache License 2.0

[ hgemm ] Consider K=1 changes #2654

Closed skykongkong8 closed 5 days ago

skykongkong8 commented 5 days ago
dim = (576, 1) x (1, 1024)

| | fp16 | fp32 |
| --- | --- | --- |
| noTrans | 190834 ns | 380525 ns |
| transA | 173896 ns | 387860 ns |
| transB | 180369 ns | 382123 ns |
| transAB | 179263 ns | 379238 ns |

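With K=1 there is only a single product per output element, so there is nothing to partially accumulate in fp32 along the K direction. Skipping that step makes the computation roughly 2x faster (about 0.17 to 0.19 ms versus 0.38 to 0.39 ms in the measurements above).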
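The idea can be illustrated with a minimal sketch. This is not the code from this PR: the names `half_t`, `gemm_generic`, and `gemm_k1` are hypothetical, `half_t` is aliased to `float` only so the example compiles on any machine, and scaling factors such as alpha/beta are left out. It contrasts a generic path that keeps an fp32 partial sum along K with a K=1 path that writes each product directly.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an fp16 storage type (assumption: a real kernel
// would use a hardware half-precision type instead of float).
using half_t = float;

// Generic path: C (M x N) = A (M x K) * B (K x N), row-major.
// Each output element keeps an fp32 partial sum along K and is
// narrowed back to the storage type once the K loop finishes.
void gemm_generic(const half_t *A, const half_t *B, half_t *C,
                  std::size_t M, std::size_t N, std::size_t K) {
  for (std::size_t m = 0; m < M; ++m) {
    for (std::size_t n = 0; n < N; ++n) {
      float acc = 0.0f; // fp32 partial accumulation along K
      for (std::size_t k = 0; k < K; ++k) {
        acc += static_cast<float>(A[m * K + k]) *
               static_cast<float>(B[k * N + n]);
      }
      C[m * N + n] = static_cast<half_t>(acc);
    }
  }
}

// K == 1 path: the product is an outer product of two vectors, so every
// output element is a single multiplication and no accumulator is needed.
void gemm_k1(const half_t *A, const half_t *B, half_t *C,
             std::size_t M, std::size_t N) {
  for (std::size_t m = 0; m < M; ++m) {
    const half_t a = A[m];
    for (std::size_t n = 0; n < N; ++n) {
      C[m * N + n] = a * B[n];
    }
  }
}
```

Note that an (M, 1) matrix and its (1, M) transpose occupy memory identically, so in a sketch like this the transposed variants would reduce to the same loop.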

Self evaluation:

  1. Build test: [X]Passed [ ]Failed [ ]Skipped
  2. Run test: [X]Passed [ ]Failed [ ]Skipped

taos-ci commented 5 days ago

:memo: TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2654. Please follow the 1commit/1PR (one commit per PR) policy to get comments quickly from reviewers. Your PR must pass all verification processes of cibot before reviewers start the review process. If you are a new member of this project, please read the manuals in the documentation folder and on the wiki page. To monitor the progress status of your PR in more detail, visit http://ci.nnstreamer.ai/.