Maratyszcza / NNPACK

Acceleration package for neural networks on multi-core CPUs
BSD 2-Clause "Simplified" License
1.67k stars, 317 forks

Is the fully connected layer faster than using the OpenBLAS equivalent? #168

Closed (RuABraun closed this 4 years ago)

RuABraun commented 4 years ago

No, unfortunately it's a lot slower when the input/output has multiple rows, and only a bit faster in the single-row case. It's probably better to just use Arm ComputeLibrary.
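For readers unfamiliar with the two cases being compared, here is a minimal pure-Python sketch of what a fully connected layer computes. It is not NNPACK's or OpenBLAS's code; the function name `fully_connected` and the sample values are made up for illustration. With a single input row the work is a matrix-vector product (GEMV in BLAS terms); with multiple rows it becomes a matrix-matrix product (GEMM).

```python
from typing import List

Matrix = List[List[float]]


def fully_connected(inputs: Matrix, weights: Matrix, bias: List[float]) -> Matrix:
    """Naive reference (hypothetical helper, not a library API):
    outputs[b][o] = bias[o] + sum_i inputs[b][i] * weights[o][i]
    """
    outputs = []
    for row in inputs:                        # one row per sample in the batch
        out_row = []
        for w_row, b in zip(weights, bias):   # one weight row per output channel
            out_row.append(b + sum(x * w for x, w in zip(row, w_row)))
        outputs.append(out_row)
    return outputs


# Illustrative values only: 3 input channels, 2 output channels.
weights = [[1.0, 2.0, 3.0],
           [0.0, -1.0, 1.0]]
bias = [0.5, 0.0]

# Single-row case: equivalent to one matrix-vector product.
single = fully_connected([[1.0, 1.0, 1.0]], weights, bias)

# Multi-row case: equivalent to one matrix-matrix product.
batch = fully_connected([[1.0, 1.0, 1.0], [2.0, 0.0, 1.0]], weights, bias)

print(single)  # [[6.5, 0.0]]
print(batch)   # [[6.5, 0.0], [5.5, 1.0]]
```

A fair benchmark would time the library's fully connected routine against the BLAS routine for the same shapes, once with a single row and once with a batch.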

Maratyszcza commented 4 years ago

I suggest you take a look at the XNNPACK library, which is the successor to NNPACK.

RuABraun commented 4 years ago

Oh nice! Great, I'll try it out. :)