nnstreamer / nntrainer

NNtrainer is Software Framework for Training Neural Network Models on Devices.
Apache License 2.0
134 stars 71 forks source link

[hgemm/bugfix] Added conditions for 1x8 and 1x4 kernel calls to enhance accuracy @open sesame 05/09 09:04 #2573

Closed s-debadri closed 1 month ago

s-debadri commented 1 month ago

Changes added in this PR:

Self evaluation:

  1. Build test: [X]Passed [ ]Failed [ ]Skipped
  2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Debadri Samaddar s.debadri@samsung.com

taos-ci commented 1 month ago

:memo: TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2573. Please a submit 1commit/1PR (one commit per one PR) policy to get comments quickly from reviewers. Your PR must pass all verificiation processes of cibot before starting a review process from reviewers. If you are new member to join this project, please read manuals in documentation folder and wiki page. In order to monitor a progress status of your PR in more detail, visit http://ci.nnstreamer.ai/.

taos-ci commented 1 month ago

:octocat: cibot: @s-debadri, A builder checker could not be completed because one of the checkers is not completed. In order to find out a reason, please go to http://ci.nnstreamer.ai/nntrainer/ci/repo-workers/pr-checker/2573-202405081810420.82360291481018-ba95483e08b45051723aefb4a427b5a83ce3b4d6/.

myungjoo commented 1 month ago

Anyway, especially for non-SIMD operations (it's not going to be SIMDified because it's in if-statement condition), please note that (N & 0x7) might be cheaper than (N % 8).

skykongkong8 commented 1 month ago

Anyway, especially for non-SIMD operations (it's not going to be SIMDified because it's in if-statement condition), please note that (N & 0x7) might be cheaper than (N % 8).

@s-debadri Could you please apply this idea?

s-debadri commented 1 month ago

LGTM.

For the readability of novice developers in the future, please leave a comment that you have applied bitwise operators instead of modulos for the performance. Otherwise, novice developers won't understand why in the heck you are using & 0x... here.

@myungjoo Added the comments as well. Thanks for your suggestion regarding this change.

myungjoo commented 1 month ago

Good to go! Thanks!