RuABraun closed 4 years ago
I suggest you take a look at the XNNPACK library, which is a successor to NNPACK.
Oh nice! Great, I'll try it out. :)
No, unfortunately it's a lot slower when the input/output has multiple rows. It's only a bit faster for the single-row case. I should probably just use the Arm Compute Library.