I went through the code and found something I don't understand.
I think of `p_reg` as a regularization term, and I expect a regularization term to constrain the learnable parameters.
However, in `p_reg = tf.reduce_mean(tf.square(act_pd.flatparam()))`, the call `act_pd.flatparam()` returns the network output rather than the learnable parameters. How should this regularization be interpreted? This confuses me, and I look forward to your advice.
For example, here is the output of `act_pd.flatparam()`:
![image](https://user-images.githubusercontent.com/43668853/83891381-a6d6c680-a77f-11ea-85c6-ed33c06dc913.png)
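
To make the question concrete, here is a minimal NumPy sketch of the distinction I mean. It is not the repository's code: the one-layer linear policy, the weight matrix `W`, the batch `obs`, and the names `p_reg_output` and `p_reg_weights` are all hypothetical stand-ins, and I am assuming that `flatparam()` simply returns the raw network output.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical one-layer policy, standing in for the real network.
W = rng.normal(size=(4, 3))    # learnable parameters (assumed shape)
obs = rng.normal(size=(8, 4))  # batch of 8 observations (assumed shape)

# Assumption: this is what act_pd.flatparam() returns, i.e. the network output.
flat = obs @ W

# Penalty on the network output, mirroring the structure of the p_reg line.
p_reg_output = np.mean(np.square(flat))

# Penalty on the learnable parameters, which is what I expected a
# regularization term to look like.
p_reg_weights = np.mean(np.square(W))

print(flat.shape, p_reg_output, p_reg_weights)
```

The two quantities differ: the first depends on the observations as well as the weights, while the second depends on the weights alone. My question is about why the first form is used.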