Closed teor292 closed 3 years ago
This points to a wrong algorithm for getting the L3 cache size on ARMv7. Unfortunately, I don't have access to this platform right now, so I can't reproduce the bug. I can try to fix the issue, but I can't verify the fix myself.
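For illustration only, here is a minimal sketch of guarding a cache size query with a fallback. It is not the code used in Simd. The names `DetectedL3CacheSize` and `CacheSizeOr` are hypothetical, and the sketch assumes a glibc-style `sysconf(_SC_LEVEL3_CACHE_SIZE)`, which may report 0 or -1 when the value is unknown.

```cpp
#include <cstddef>
#if defined(__unix__)
#include <unistd.h>
#endif

// Hypothetical helper (not part of Simd): asks the OS for the L3 cache size.
// Returns 0 when the size is unknown or the query is not available.
inline size_t DetectedL3CacheSize()
{
#if defined(_SC_LEVEL3_CACHE_SIZE)
    long value = sysconf(_SC_LEVEL3_CACHE_SIZE);
    return value > 0 ? static_cast<size_t>(value) : 0;
#else
    return 0;
#endif
}

// Hypothetical helper: substitutes a non-zero default for an unknown size,
// so that block sizes derived from it cannot collapse to zero.
inline size_t CacheSizeOr(size_t detected, size_t fallback)
{
    return detected != 0 ? detected : fallback;
}
```

A caller could then use something like `CacheSizeOr(DetectedL3CacheSize(), 2 * 1024 * 1024)`, where the 2 MB default is an arbitrary assumption rather than a recommended value.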
Ok, but I can check it :) If you need some kind of debugging information, I can try to provide it.
Could you check the bug fix? SimdBaseSynetConvolution32f.zip
Well, now the neural network loads, but it freezes during calculation. I'll try to find where it happens.
Ok, I found it.
In file SimdGemm.h, at line 371:
    void Run(size_t M, const T * A, size_t lda, const T * pB, T * C, size_t ldc)
    {
        assert(M <= _M);
        for (size_t j = 0; j < _N; j += _macroN)
Here _macroN == 0, so the loop never advances.
Here is the call stack:
1 Simd::GemmNNcb<float, 4u, unsigned int>::Run SimdGemm.h 376 0x421db0
2 Simd::GemmNNcb<float, 4u, unsigned int>::Run SimdGemm.h 368 0x3bde88
3 Simd::Neon::Gemm32fNNcbRun SimdNeonGemm32f.cpp 2433 0x3bde88
4 Simd::GemmCbFunc::Run SimdRuntime.h 280 0x2c7b38
5 Simd::Runtime<Simd::GemmCbFunc, Simd::GemmCbArgs>::Test SimdRuntime.h 156 0x2c7b38
6 Simd::Runtime<Simd::GemmCbFunc, Simd::GemmCbArgs>::Run SimdRuntime.h 90 0x2c7b38
7 Simd::Base::SynetConvolution32fGemmNN::Forward SimdBaseSynetConvolution32f.cpp 367 0x2c7b38
8 SimdSynetConvolution32fForward SimdLib.cpp 5446 0x23a1f0
9 Synet::Convolution32f::Forward Convolution.h 109 0x16cb94
10 Synet::Convolution32fLayer<float>::ForwardCpu Convolution32fLayer.h 85 0x16cb94
11 Synet::Convolution32fLayer<float>::ForwardCpu Convolution32fLayer.h 79 0x1178c4
12 Synet::Layer<float>::Forward Layer.h 146 0x955f0
13 Synet::Network<float>::Forward Network.h 356 0x8c5f4
14 SynetTester::run_once_ SynetTester.cpp 43 0x3db80
15 BaseTester::Run BaseTester.cpp 27 0x1bf58
16 main main.cpp 14 0x1c3794
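To make the failure mode concrete: a loop of the form `for (j = 0; j < N; j += macroN)` can only terminate if the step is non-zero. The sketch below shows one possible guard that clamps the step to a minimum. `SafeStep` and `CountBlocks` are hypothetical names for this example, not functions from Simd, and this is not necessarily how the fix in SimdGemm.zip works.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of Simd): clamps a block step so that a
// blocked loop always advances, even if the computed step is zero.
inline size_t SafeStep(size_t step, size_t minStep)
{
    assert(minStep > 0);
    return std::max(step, minStep);
}

// Counts the macro blocks visited by a loop of the form
//   for (size_t j = 0; j < N; j += macroN)
// after the step has been clamped to at least microN.
inline size_t CountBlocks(size_t N, size_t macroN, size_t microN)
{
    const size_t step = SafeStep(macroN, microN);
    size_t blocks = 0;
    for (size_t j = 0; j < N; j += step)
        ++blocks;
    return blocks;
}
```

With `macroN == 0` and a minimum step of 4, a range of 10 elements is covered in blocks starting at 0, 4 and 8 instead of spinning forever at `j == 0`.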
The second iteration: SimdGemm.zip
Everything works, thank you!
That's good news! I committed the changes.
Hi. I use Simd with Synet to test the performance of some neural networks. It works fine on Windows (x64) and Linux (x64), but on ARMv7, compiled with gcc 6.3.0, it freezes.
I don't know the reason, but here is what I found.
Here is the beginning of SynetConvolution32fNhwcDirect::OldReorderWeight:
Parameters of AlgParam on x64 are as follows:
The others contain garbage, I think.
But on ARMv7 these values are as follows:
So the loop
for (size_t da = 0; da < p.dstC; da += a.macroD)
becomes infinite (p.dstC == 10). I hope you can help.
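One plausible way for a step such as `a.macroD` to end up as zero is a block size derived from a cache size that was reported as 0. The sketch below illustrates that chain under stated assumptions: the formula, the parameter names (`cacheL3`, `K`, `microD`) and the helpers `AlignLo`, `AlignHi`, `MacroDNaive` and `MacroDClamped` are defined here only for this example and are not copied from Simd.

```cpp
#include <algorithm>
#include <cstddef>

// Local helpers for this sketch: round down / up to a multiple of align.
inline size_t AlignLo(size_t size, size_t align) { return size / align * align; }
inline size_t AlignHi(size_t size, size_t align) { return (size + align - 1) / align * align; }

// Assumed formula: how many output channels fit into the L3 cache when each
// channel needs K floats, rounded down to a multiple of microD.
// Yields 0 when cacheL3 is 0 (K and microD must be non-zero).
inline size_t MacroDNaive(size_t cacheL3, size_t K, size_t microD)
{
    return AlignLo(cacheL3 / sizeof(float) / K, microD);
}

// Same formula, clamped to the range [microD, AlignHi(dstC, microD)], so the
// result is never zero and never larger than the padded channel count.
inline size_t MacroDClamped(size_t cacheL3, size_t K, size_t microD, size_t dstC)
{
    const size_t fit = MacroDNaive(cacheL3, K, microD);
    return std::max(microD, std::min(fit, AlignHi(dstC, microD)));
}
```

With a cache size of 0 the naive formula gives a step of 0, which matches the infinite loop described above, while the clamped variant still returns `microD`.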