luoyetx / mini-caffe

Minimal runtime core of Caffe, Forward only, GPU support and Memory efficiency.
BSD 3-Clause "New" or "Revised" License
374 stars 151 forks

Layer::Forward does not work when running inference on faster-rcnn #90

Open spacegrass opened 6 years ago

spacegrass commented 6 years ago

I see that Layer::Forward on the master branch has been changed to this:

```cpp
inline void Layer::Forward(const vector<Blob*>& bottom,
                           const vector<Blob*>& top) {
  switch (Caffe::mode()) {
  case Caffe::CPU:
    Forward_cpu(bottom, top);
    break;
  case Caffe::GPU:
    Forward_gpu(bottom, top);
    break;
  default:
    LOG(FATAL) << "Unknown caffe mode.";
  }
}
```

There is no Reshape() call before the layer does its real forward work. This change breaks faster-rcnn, because that kind of network reshapes a layer's top blobs from its bottom blobs at runtime, during the forward pass itself. I think this case should be handled. Thanks.
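For comparison, upstream BVLC Caffe reshapes inside the forward call itself. A minimal sketch of that pattern, adapted to the signature above (the exact mini-caffe types here are an assumption, not the actual code):

```cpp
// Sketch only: per-call reshape before dispatch, as upstream Caffe does.
inline void Layer::Forward(const vector<Blob*>& bottom,
                           const vector<Blob*>& top) {
  Reshape(bottom, top);  // recompute top shapes from the current bottom shapes
  switch (Caffe::mode()) {
  case Caffe::CPU:
    Forward_cpu(bottom, top);
    break;
  case Caffe::GPU:
    Forward_gpu(bottom, top);
    break;
  default:
    LOG(FATAL) << "Unknown caffe mode.";
  }
}
```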

luoyetx commented 6 years ago

All the reshape work is done in PlaceMemory. When the shape of the input blob changes, every internal blob changes its shape and reallocates its memory buffer.
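Roughly, the idea is a single net-wide pass before any layer runs Forward. A sketch of that idea, not the actual PlaceMemory code (the member names follow stock Caffe's Net and are assumptions):

```cpp
// Sketch: reshape the whole net once, before any Forward runs.
// layers_ / bottom_vecs_ / top_vecs_ follow stock Caffe's Net conventions.
void Net::PlaceMemory() {
  for (size_t i = 0; i < layers_.size(); ++i) {
    // Each layer derives its top shapes from its already-reshaped bottom
    // blobs, reallocating buffers when they grow.
    layers_[i]->Reshape(bottom_vecs_[i], top_vecs_[i]);
  }
}
```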

spacegrass commented 6 years ago

Yes, I have read the code. It reshapes every layer of the net BEFORE the net's forward pass. But networks like faster-rcnn change downstream layers' shape information at forward time, so I don't think PlaceMemory covers that case.

luoyetx commented 6 years ago

The layer itself gets all the shape info about its input blobs, so it should be able to compute the shape of its output blobs when the reshape function is called.

luoyetx commented 6 years ago

For the proposal layer, check out the code here. We set a maximum shape for the output rois.
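The pattern is to give the rois blob its worst-case shape at reshape time and shrink it to the real count during the forward pass. A hedged sketch of that approach (kMaxRois, GenerateProposals, and the 5-value roi layout are illustrative assumptions, not the actual mini-caffe code):

```cpp
void ProposalLayer::Reshape(const vector<Blob*>& bottom,
                            const vector<Blob*>& top) {
  // Allocate for the worst case so shrinking later never reallocates.
  top[0]->Reshape({kMaxRois, 5});  // roi = (batch_idx, x1, y1, x2, y2)
}

void ProposalLayer::Forward_cpu(const vector<Blob*>& bottom,
                                const vector<Blob*>& top) {
  // The number of surviving proposals is only known at runtime.
  const int num_rois = GenerateProposals(bottom, top[0]->mutable_cpu_data());
  top[0]->Reshape({num_rois, 5});  // shrink to the actual count
}
```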

yanhn commented 5 years ago

@luoyetx I hit this strange problem when calling Net::CopyTrainedLayersFrom under different modes (caffe::GPU vs caffe::CPU). The error info is:

```
C:\workspace\opensource\mini-caffe\src\net.cpp:277: Cannot copy param 0 weights from layer '221'; shape mismatch. Source param shape is 1 64 1 1 (64); target param shape is 64 1 1 1 (64). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.
```

My last 2 layers' prototxt is as below:

```
layer {
  name: "221"
  type: "Convolution"
  bottom: "220"
  top: "221"
  convolution_param {
    num_output: 1
    bias_term: true
    group: 1
    pad: 0
    kernel_size: 1
    stride: 1
    dilation: 1
  }
}
layer {
  name: "output"
  type: "Sigmoid"
  bottom: "221"
  top: "output"
}
```

1. Under CPU mode, everything's fine. My conv layer's target weight param has shape 1 x 64 x 1 x 1, with bias shape (1), and it loads correctly from the caffemodel into caffe::Net.
2. Under GPU mode, the conv layer's target weight param has shape 64 x 1 x 1 x 1, with bias shape (64), so I can't load the params: there is a mismatch between my caffemodel (which is correct) and the caffe::Net built from the prototxt (which is wrong).

Do you have any idea about this strange problem?
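For context, this error comes from the shape check in Net::CopyTrainedLayersFrom, which in stock Caffe looks roughly like this (paraphrased; mini-caffe's net.cpp may differ in detail):

```cpp
// Paraphrased from Caffe's Net::CopyTrainedLayersFrom shape check.
if (!target_blobs[j]->ShapeEquals(source_layer.blobs(j))) {
  Blob source_blob;
  const bool kReshape = true;
  source_blob.FromProto(source_layer.blobs(j), kReshape);
  LOG(FATAL) << "Cannot copy param " << j << " weights from layer '"
             << source_layer_name << "'; shape mismatch. Source param shape is "
             << source_blob.shape_string() << "; target param shape is "
             << target_blobs[j]->shape_string() << ". "
             << "To learn this layer's parameters from scratch rather than "
             << "copying from a saved net, rename the layer.";
}
```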

I have tried adding a special case at line 259 of net.cpp:

```cpp
if (source_layer_name == "221") {
  const bool kReshape = true;
  target_blobs[j]->FromProto(source_layer.blobs(j), kReshape);
  printf("after copy proto blob no.%d: shape is %s\n",
         j, target_blobs[j]->shape_string().c_str());
  continue;
}
```

It helped with Net::CopyTrainedLayersFrom, but when I call net.Forward(), the same shape mismatch occurs again. The only difference in my code is the mode (caffe::GPU vs caffe::CPU).

yanhn commented 5 years ago

Recently I did some code reading and debugging. The results show:

1. Net::Reshape() is called during Net::Forward, so the shape of my Convolution layer changes back to the mismatching state.
2. I put some log statements in BaseConvolutionLayer::Reshape (see the sketch below) and found that all the other Convolution layers print the corresponding logs, except my last Convolution with kernel_size=1 and num_output=1. (I only use this conv layer once, for feature dimension reduction.)
3. Tests with other models on other computers show the same problem: pycaffe (GPU & CPU) OK; C++ CPU OK; C++ GPU not OK.

Is there any advice to help me out? Thanks.
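A log line like the following is enough for the experiment in point 2 (a sketch; it assumes Caffe's usual layer_param_ member and Blob::shape_string()):

```cpp
// Sketch: placed at the top of BaseConvolutionLayer::Reshape to see which
// conv layers actually get reshaped during Net::Forward.
LOG(INFO) << "Reshape " << this->layer_param_.name()
          << ": bottom shape " << bottom[0]->shape_string();
```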