openai / maddpg

Code for the MADDPG algorithm from the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"
https://arxiv.org/pdf/1706.02275.pdf
MIT License

question about p_reg in p_train #46

Open yeshenpy opened 4 years ago

yeshenpy commented 4 years ago

I went through the code and found something I don't understand. I think of `p_reg` as a regularization term, and a regularization term should constrain the learned parameters (the network weights). But in the code, `p_reg = tf.reduce_mean(tf.square(act_pd.flatparam()))`, the value returned by `act_pd.flatparam()` is the network's output (the flattened action-distribution parameters), not the learned weights. How should this regularization be interpreted? This confuses me and I look forward to your advice. An example of the `act_pd.flatparam()` output is shown in the attached screenshot.
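For reference, here is a minimal TF 1.x sketch (using a stand-in MLP policy, not the repo's actual model-building code) that contrasts the quoted term, which squares the policy's output (the flattened distribution parameters), with an ordinary weight-decay term over the trainable variables. The placeholder shapes and layer sizes are assumptions for illustration only.

```python
import tensorflow as tf

# Hypothetical observation batch placeholder (shape chosen for illustration).
obs_ph = tf.placeholder(tf.float32, shape=[None, 8], name="observation")

# Stand-in policy network; the repo builds its own model instead.
hidden = tf.layers.dense(obs_ph, 64, activation=tf.nn.relu)
p_out = tf.layers.dense(hidden, 4)  # raw (flattened) distribution parameters

# The term asked about: mean of the squared *outputs* of the policy network.
# This penalizes large distribution parameters (e.g. logits), not the weights.
p_reg = tf.reduce_mean(tf.square(p_out))

# For comparison, weight decay on the *learned parameters* would look like this:
weight_decay = tf.add_n(
    [tf.nn.l2_loss(v) for v in tf.trainable_variables()]
)
```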