nnstreamer / nntrainer

NNtrainer is a software framework for training neural network models on devices.
Apache License 2.0

[ hdot ] Use precision-enhanced hdot #2555

Closed: skykongkong8 closed this 2 months ago

skykongkong8 commented 2 months ago

hdot: vec(1 x K) x vec(K x 1) -> 1 (scalar)

A. Accuracy w.r.t. cblas-f32 (MSE, lower is better)

| dim  | hdot-f16 (previous) | hdot-f16f32 (now) |
|------|---------------------|-------------------|
| 1024 | 0.250061            | 3.72529e-09       |
| 2048 | 2.07308             | 0.00362171        |
| 4096 | 11.8475             | 0.195379          |
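
The gap in the accuracy numbers is consistent with rounding error building up in a half-precision accumulator. The following is a minimal sketch of that effect, not the nntrainer implementation: it assumes that "hdot-f16" keeps the running sum in fp16 and that "hdot-f16f32" takes fp16 inputs but accumulates in fp32 before casting the result back to fp16. The helper names `to_f16`, `to_f32`, `hdot_f16`, and `hdot_f16f32` are hypothetical, and rounding is emulated with Python's `struct` module rather than real fp16 hardware arithmetic.

```python
import struct


def to_f16(v: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", v))[0]


def to_f32(v: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", v))[0]


def hdot_f16(x, y):
    """Dot product with fp16 inputs and an fp16 accumulator (assumed 'previous' behavior)."""
    acc = 0.0
    for a, b in zip(x, y):
        prod = to_f16(to_f16(a) * to_f16(b))
        acc = to_f16(acc + prod)  # the running sum is rounded to fp16 at every step
    return acc


def hdot_f16f32(x, y):
    """Dot product with fp16 inputs and an fp32 accumulator (assumed 'now' behavior)."""
    acc = 0.0
    for a, b in zip(x, y):
        prod = to_f32(to_f16(a) * to_f16(b))
        acc = to_f32(acc + prod)  # the running sum keeps fp32 precision
    return to_f16(acc)  # cast the final scalar back to fp16


# Hypothetical test vectors: every product is exactly 0.25, so the exact result is K / 4.
K = 4096
x = [0.5] * K
y = [0.5] * K
exact = sum(a * b for a, b in zip(x, y))

print("exact       :", exact)
print("hdot_f16    :", hdot_f16(x, y))
print("hdot_f16f32 :", hdot_f16f32(x, y))
```

With these inputs the fp16 accumulator stops growing once it reaches 512, because the spacing between adjacent fp16 values there is 0.5 and each added 0.25 is rounded away. The fp32 accumulator reaches the exact value 1024. Real data will show a less extreme gap, but the error grows with K in the same way as in the table above.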

B. Latency

All timings are in nanoseconds, so the differences between the two hdot variants are almost negligible.

| dim  | hdot-f16 | hdot-f16f32 | cblas-f32 |
|------|----------|-------------|-----------|
| 1024 | 261 ns   | 280 ns      | 347 ns    |
| 2048 | 328 ns   | 359 ns      | 1473 ns   |
| 4096 | 508 ns   | 562 ns      | 909 ns    |

Conclusion: Much better accuracy with almost no latency deterioration (hdot-f16f32 is 19 to 54 ns slower than hdot-f16, roughly 7 to 11%).

Self evaluation:

  1. Build test: [X]Passed [ ]Failed [ ]Skipped
  2. Run test: [X]Passed [ ]Failed [ ]Skipped

taos-ci commented 2 months ago

:memo: TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2555. Please follow the 1commit/1PR (one commit per PR) policy to get comments quickly from reviewers. Your PR must pass all verification processes of cibot before reviewers start the review process. If you are a new member of this project, please read the manuals in the documentation folder and the wiki page. To monitor the progress of your PR in more detail, visit http://ci.nnstreamer.ai/.