BVLC / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

Fix issue #6769 in Reshape layer implementation #6770


CorvusCorax commented 5 years ago

see #6769

The Reshape layer is currently implemented by sharing the gradient in the Reshape() function and providing empty stubs for the Forward() and Backward() functions. This differs from the Flatten layer, where the data and gradient are declared as shared on every call of the Forward()/Backward() implementation.
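For context, here is a minimal sketch of the pre-PR behaviour (abridged; the shape computation is replaced by a placeholder, so the exact lines may differ from the repository):

```cpp
// Abridged sketch of the pre-PR ReshapeLayer::Reshape(): data and diff are
// shared once when the layer is (re)shaped, while Forward_cpu() and
// Backward_cpu() are empty stubs.
template <typename Dtype>
void ReshapeLayer<Dtype>::Reshape(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  // Placeholder: the real code derives top_shape from reshape_param.
  vector<int> top_shape = bottom[0]->shape();
  top[0]->Reshape(top_shape);
  CHECK_EQ(top[0]->count(), bottom[0]->count())
      << "output count must match input count";
  top[0]->ShareData(*bottom[0]);  // top aliases bottom's data memory
  top[0]->ShareDiff(*bottom[0]);  // top aliases bottom's diff memory
}
```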

Changing the Reshape layer to follow the same pattern as the older Flatten layer makes the issue observed in #6769 disappear, as sketched below.
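A minimal sketch of that change, modelled on FlattenLayer's Forward_cpu()/Backward_cpu() (the actual diff in this PR may differ in details):

```cpp
// Flatten-style sharing: re-declare the blob aliases on every pass instead
// of only once in Reshape().
template <typename Dtype>
void ReshapeLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  top[0]->ShareData(*bottom[0]);  // output reuses the input's data memory
}

template <typename Dtype>
void ReshapeLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down,
    const vector<Blob<Dtype>*>& bottom) {
  bottom[0]->ShareDiff(*top[0]);  // input reuses the output's gradient
}
```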

It should be noted that this fix was implemented without a full understanding of why the previous implementation failed in the first place; it is based purely on the example provided by the Flatten layer.

A much simpler change, replacing

top[0]->ShareDiff(*bottom[0]);

with the more intuitive

bottom[0]->ShareDiff(*top[0]);

while leaving the call in the Reshape() function, did NOT fix the issue.