PReLU keeps the original values when the input is positive and multiplies negative inputs by a scale factor (the slope learned during training). So you can replace it with ReLU(x) + Scale_2(ReLU(Scale_1(x))), where the two branches are summed by an Eltwise layer: Scale_1 multiplies x by -1, so the originally negative values become positive and survive the following ReLU, and Scale_2 then multiplies by the trained slopes. Note that the slopes are negated before being written into the weight file, which makes the originally negative part negative again.
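For reference, a minimal numpy sketch (not from the original reply; the slope value is made up for illustration) that checks the identity PReLU(x) = ReLU(x) + (-a) * ReLU(-x):

```python
import numpy as np

def prelu(x, a):
    return np.where(x > 0, x, a * x)

def relu(x):
    return np.maximum(x, 0)

rng = np.random.default_rng(0)
x = rng.standard_normal(10)
a = 0.25  # illustrative slope; in a real model this comes from the PReLU weights

# Branch 1: ReLU(x) keeps the positive part unchanged.
# Branch 2: Scale_1 = -1 flips the sign, ReLU keeps the now-positive magnitudes,
#           Scale_2 = -a restores the sign and applies the trained slope.
decomposed = relu(x) + (-a) * relu(-1.0 * x)

assert np.allclose(prelu(x, a), decomposed)
```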
Thank you for the insight. @PKUZHOU
Hi @PKUZHOU, can you provide the script that transforms PReLU into the combination of ReLU, Scale, and Eltwise Add? I think it would be very helpful for generalizing this method to other networks to attain higher performance!
Sorry, the script was so simple that I didn't keep it after converting the MTCNN weights, and I haven't worked on CNN acceleration for a long time, so I have no plan to rewrite it. In fact, if you are familiar with the data format and layout of Caffe models, it is quite easy to write a script that converts PReLU in the way I explained.
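As a rough illustration of what such a script might look like, here is a minimal pycaffe sketch (the file names and the `_neg_scale` layer-naming convention are assumptions for illustration, not from this thread; it assumes the converted prototxt already replaces each PReLU with the Scale/ReLU/Scale/Eltwise pattern described above):

```python
# Requires pycaffe; run from an environment where `import caffe` works.
import caffe

# Assumed file names, purely for illustration.
orig_net = caffe.Net('original.prototxt', 'original.caffemodel', caffe.TEST)
new_net = caffe.Net('converted.prototxt', caffe.TEST)

# Name-based copy fills every layer that kept its name in the converted
# prototxt; the PReLU layers, which no longer exist there, are skipped.
new_net.copy_from('original.caffemodel')

for name, layer in zip(orig_net._layer_names, orig_net.layers):
    if layer.type != 'PReLU':
        continue
    slopes = orig_net.params[name][0].data  # one learned slope per channel
    # The second Scale layer of the replacement (assumed to be named
    # '<prelu_name>_neg_scale' and declared without a bias term) gets the
    # negated slopes; the first Scale layer is assumed to be fixed at -1
    # via a constant filler in the converted prototxt.
    new_net.params[name + '_neg_scale'][0].data[...] = -slopes

new_net.save('converted.caffemodel')
```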
Okay! I have implemented it in Python: it replaces the PReLU layers in the prototxt and copies the weights into the caffemodel with one command. I will share it once I clean up the unrelated code.
@kketernality Can you share your conversion code? Thanks.
@xiexuliunian I made a gist for the PReLU conversion, since I needed to do it too: https://gist.github.com/Helios77760/c1317a3f791617c5dbc8cdce071c9576
@Helios77760 Thanks.
@Helios77760 Thanks for your code, but when I run the script I get: "Cannot copy param 0 weights from layer 'PReLU_1'; shape mismatch. Source param shape is (1); target param shape is 32 (32). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer." Do you have any suggestions? Thanks.
Can you explain how scaling, applying ReLU, scaling again, and then element-wise addition is equivalent to PReLU?