Rajesh-Siraskar / Reinforcement-Learning-for-Control-of-Valves

This project uses DDPG for "optimal" control of non-linear valves. It uses MATLAB and Simulink.

Question about the network design of agent #6

Open cantonherman opened 1 month ago

cantonherman commented 1 month ago

Mr Siraskar,

Hello! I have read your paper and your code, which were very helpful to me. Thank you for sharing your valuable work with us!

I have a question: why did you design the actor network with a single fully connected layer and 3 weights, for a total of 16 learnable parameters, while the critic network is relatively large, with 1.5k learnable parameters? Could you please explain the philosophy behind this unbalanced actor-critic design?

LIN Xuan

Rajesh-Siraskar commented 1 month ago

Hello Lin,

Thank you for your email. As you might know, the Critic network is built to estimate the value function (i.e. either the simpler state-value 'V' or the action-value 'Q'), while the Actor network is meant to update the policy in the direction suggested by the Critic. The Critic is therefore doing the heavy "brain" work; in chess, for example, it would be doing the actual "thinking" part. The Actor is "following" the Critic's suggestions.
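To make the "Actor follows the Critic" idea concrete, here is a minimal sketch in plain Python. It is not the repository's MATLAB code. It assumes a hypothetical, fixed critic Q(s, a) = -(a - 2s)^2 and a one-parameter linear actor a = theta * s; in real DDPG the critic is itself a learned network. The actor parameter is nudged along the critic's action gradient, which is the deterministic policy gradient idea in its simplest form.

```python
def critic_grad_wrt_action(state, action):
    """dQ/da for the hypothetical critic Q(s, a) = -(a - 2*s)**2."""
    return -2.0 * (action - 2.0 * state)


def train_actor(theta=0.0, lr=0.1, epochs=200, states=(0.5, 1.0, 1.5)):
    """Gradient ascent on Q through a linear actor a = theta * s.

    All values here (states, learning rate, epochs) are illustrative
    assumptions, not settings taken from the project.
    """
    for _ in range(epochs):
        for s in states:
            a = theta * s                      # actor proposes an action
            dq_da = critic_grad_wrt_action(s, a)  # critic says which way to move it
            dmu_dtheta = s                     # d(theta * s) / d(theta)
            theta += lr * dq_da * dmu_dtheta   # chain rule: follow the critic
    return theta


if __name__ == "__main__":
    print(train_actor())
```

Under these assumptions the actor parameter settles near 2, the value that maximises the assumed critic for every state. The actor needs only a single parameter here because all the knowledge about which actions are good lives in the critic.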

That is why the Critic network often needs to be the more complex of the two.
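As a rough illustration of how quickly the size gap opens up, the sketch below counts the learnable parameters of fully connected networks from their layer widths. The layer sizes are hypothetical and chosen only for illustration; they are not the repository's actual architecture and do not reproduce the 16 and 1.5k figures mentioned in the question.

```python
def count_dense_params(layer_sizes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Hypothetical shapes: 3 observations and 1 action.
actor_sizes = [3, 1]            # a single fully connected layer
critic_sizes = [4, 32, 32, 1]   # observation and action concatenated, two hidden layers

if __name__ == "__main__":
    print("actor parameters:", count_dense_params(actor_sizes))
    print("critic parameters:", count_dense_params(critic_sizes))
```

Even two modest hidden layers put the critic above a thousand parameters, while a single-layer actor stays in single digits.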

Hope this helps. And wish you all the best for your research.

regards, Rajesh
