libilab / PCN-with-Global-Recurrent-Processing

Code for "Deep predictive coding network for object recognition"

BN/GN degrades model performance at evaluation/test stage #2

Open zjsong opened 3 years ago

zjsong commented 3 years ago

Hi Haiguang @feiyuelankuang and Kuan @FlappyDoraemon ,

Thanks for sharing the code of this brilliant work on predictive coding.

When I tried to reimplement the global version of PCN, I found that BN or GN accelerates the training process remarkably. However, in evaluation mode (i.e., net.eval()), which I use to inspect the training state, the model's performance does not improve over time.

This is really weird, because without BN and GN the same model works as expected throughout training. I suspect the standard settings of BN and GN need to be modified for PCN's evaluation/test stage.
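One plausible cause (a minimal sketch, not the repository's actual code; the module and its names are hypothetical): if a single BatchNorm layer is reused across all recurrent time steps, its running mean/variance accumulate a mixture of the statistics of every step. In train mode each step is normalized with its own batch statistics, but in eval mode every step is normalized with the same averaged running statistics, so train-mode and eval-mode behavior can diverge.

```python
import torch
import torch.nn as nn

class SharedBNRecurrent(nn.Module):
    """Hypothetical recurrent block that shares one BN across time steps."""
    def __init__(self, channels=8, steps=3):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn = nn.BatchNorm2d(channels)  # one BN shared by all steps
        self.steps = steps

    def forward(self, x):
        for _ in range(self.steps):
            # train mode: per-step batch statistics; eval mode: one shared
            # set of running statistics for every step
            x = torch.relu(self.bn(self.conv(x)))
        return x

net = SharedBNRecurrent()
x = torch.randn(4, 8, 16, 16)

net.train()
y_train = net(x)  # each step normalized with its own batch statistics

net.eval()
y_eval = net(x)   # every step normalized with the same running statistics
# The two outputs generally differ, which may contribute to the
# train/eval gap described above.
```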

So did you encounter a similar problem when using BN/GN in PCN?

Many thanks in advance.

xxyh1993 commented 6 months ago

I agree with you. In my PCM, when I use BN, the model doesn't work.

zjsong commented 6 months ago

@xxyh1993 To mitigate the negative effect of the original BN on the PCM's performance, I instead applied BN before every non-linearity, at each layer and at each iterative time step.
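My reading of that workaround, as a hedged sketch (assumed detail: one independent BatchNorm module per time step so that each step keeps its own running statistics; this is an illustration, not the exact SSPL/AVPC code):

```python
import torch
import torch.nn as nn

class PerStepBNRecurrent(nn.Module):
    """Hypothetical recurrent block with a separate BN per time step."""
    def __init__(self, channels=8, steps=3):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        # one BN per iterative time step, applied before the non-linearity,
        # so each step's running statistics are tracked independently
        self.bns = nn.ModuleList(nn.BatchNorm2d(channels) for _ in range(steps))

    def forward(self, x):
        for bn in self.bns:
            x = torch.relu(bn(self.conv(x)))
        return x
```

With per-step BN, eval mode normalizes each iteration with statistics collected at that same iteration during training, instead of one set averaged over all iterations.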

This implementation can be found in the code of our sound source localization work SSPL, as well as our visually guided sound source separation work AVPC. Please refer to the contents therein.