XuVV opened this issue 3 years ago
Hi,
With Batch Normalization, the neural network is not guaranteed to satisfy the Lipschitz condition, but it usually performs better. In our paper, we call a CNN without BN "SimpleCNN" and one with BN "DnCNN." If a SimpleCNN is trained with the "real SN" from our paper, it is guaranteed to satisfy the Lipschitz condition regardless of the loss function (L1 or L2). This theory is what motivated the design of real SN. With some practical techniques beyond the theory (e.g., BN), real SN achieves better performance.
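For anyone curious what "real SN" means in practice: the key idea is to run power iteration on the convolution operator itself (rather than on the reshaped weight matrix, as in standard spectral normalization), so the estimated spectral norm matches the true Lipschitz constant of the layer. Below is a minimal PyTorch sketch of that estimate; it assumes stride-1, padding-1 convolutions and a fixed input patch size, and the function and variable names are illustrative, not taken from the repo:

```python
import torch
import torch.nn.functional as F

def conv_operator_norm(weight, input_shape, n_iters=20, eps=1e-12):
    """Estimate the spectral norm of a stride-1, padding-1 conv layer
    by power iteration on the convolution operator itself."""
    x = torch.randn(input_shape)  # one dummy input, e.g. (1, C, H, W)
    for _ in range(n_iters):
        y = F.conv2d(x, weight, padding=1)            # apply the operator A
        x = F.conv_transpose2d(y, weight, padding=1)  # apply its adjoint A^T
        x = x / (x.norm() + eps)                      # renormalize the iterate
    y = F.conv2d(x, weight, padding=1)
    return y.norm() / (x.norm() + eps)  # estimate of the largest singular value

# Example: bound the Lipschitz constant of one layer by 1
weight = torch.randn(64, 64, 3, 3)
sigma = conv_operator_norm(weight, input_shape=(1, 64, 40, 40))
weight_sn = weight / sigma  # normalized weight has spectral norm ~1
```

If I read the paper correctly, an estimate like this is applied to every layer during training so the overall denoiser satisfies the Lipschitz bound by construction; the sketch above only illustrates the single-layer operator-norm computation.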
Thanks for your reply! The paper "How Does Batch Normalization Help Optimization?" shows that batch normalization makes the loss landscape smoother, i.e., it improves the landscape's Lipschitzness. Do you think that is why RealSN-DnCNN performs better than SimpleCNN?
The batch normalization layer was an empirical choice when we wrote the paper. The paper you pointed out is interesting and may provide a theoretical explanation for BN in PnP settings.
Hi, really amazing work.
I have some questions about the convergence:
Many thanks. Looking forward to your reply!