I noticed that in SAC, the function select_action() simply calls sample() to randomly draw an action,
but in the function evaluate(), the action is computed as batch_mu + batch_sigma*z.
Why not just use sample() there, as in select_action()? Is there an important difference?
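
To make the comparison concrete, here is a minimal sketch of the two forms as I understand them. It is plain Python rather than the repository's code; the helper names `sample_direct` and `sample_reparameterized` and the values of `mu` and `sigma` are made up for illustration, and I am assuming z is standard normal noise.

```python
import random
import statistics

def sample_direct(mu, sigma, rng):
    # Draw the action straight from N(mu, sigma), as sample() would.
    return rng.gauss(mu, sigma)

def sample_reparameterized(mu, sigma, rng):
    # Draw noise z ~ N(0, 1) first, then shift and scale it: mu + sigma * z.
    z = rng.gauss(0.0, 1.0)
    return mu + sigma * z

# Hypothetical values, chosen only for illustration.
mu, sigma, n = 1.5, 0.5, 20000

# Independent random streams, so the two lists are not trivially identical.
rng_a = random.Random(0)
rng_b = random.Random(1)

direct = [sample_direct(mu, sigma, rng_a) for _ in range(n)]
reparam = [sample_reparameterized(mu, sigma, rng_b) for _ in range(n)]

# Both lists should have roughly the same mean and standard deviation.
print(statistics.mean(direct), statistics.stdev(direct))
print(statistics.mean(reparam), statistics.stdev(reparam))
```

As far as I can tell, both forms draw from the same distribution. The only difference I can see is that the second form writes the action as an explicit function of mu and sigma, which might matter for backpropagation (if the code uses PyTorch, this would correspond to Normal.rsample() rather than Normal.sample()).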