ermig1979 / Simd

C++ image processing and machine learning library using SIMD: SSE, AVX, AVX-512 and AMX for x86/x64, VMX (Altivec) and VSX (Power7) for PowerPC, and NEON for ARM.
http://ermig1979.github.io/Simd
MIT License

_common[thread].sum and _common[thread].dst #80

Closed sxsong1207 closed 5 years ago

sxsong1207 commented 5 years ago

Hi,

I found that each layer declares these two data blocks of the same dimensions: `sum` holds the values before the activation function, and `dst` holds the values after it.

It seems unnecessary to store those values in the `sum` vector, right? Since I'm working on a network with a large input size (2000x2000 and bigger), releasing the `sum` vector could save up to half of the RAM for me.

I tried this in my code, which is forward-only and convolution-only, and nothing changed except the RAM cost.

I disabled these lines and changed `sum` to `dst`:

```cpp
void Forward(const Vector & src, size_t thread, Method method) override
{
    // Vector & sum = _common[thread].sum;
    ...
}
```

```cpp
virtual void SetThreadNumber(size_t number, bool train)
{
    _common.resize(number);
    for (size_t i = 0; i < _common.size(); ++i)
    {
        // _common[i].sum.resize(_dst.Volume());
        ...
    }
}
```

ermig1979 commented 5 years ago

Hello. The Simd::Neural framework is not optimal for working with such big images (I created it in order to recognize symbols). I would recommend you use the Synet framework instead: it is much faster and can use models trained with Caffe, Darknet, Tensorflow, and OpenCV::Dnn.

sxsong1207 commented 5 years ago

Looks great. You have made a great contribution to CPU-accelerated neural networks, thank you!