I see that the actor-critic model (model.py) outputs mu and logstd. In the code, logstd is fixed to 0 by defining `logstd = torch.zeros_like(mu)`, which fixes the standard deviation to 1. But as far as I know, logstd should also be learned by the network (in that case logstd would be the output of some layer, or a learnable parameter). Is there any reason for this behavior?
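For reference, this is roughly what I expected instead. It's just a minimal sketch of a Gaussian policy head with a learned, state-independent logstd; the class and dimension names here are placeholders, not taken from model.py:

```python
import torch
import torch.nn as nn

class GaussianPolicy(nn.Module):
    """Sketch: mu comes from a layer, logstd is a learnable parameter
    rather than being fixed to zero."""
    def __init__(self, obs_dim, action_dim, hidden=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(obs_dim, hidden),
            nn.Tanh(),
        )
        self.mu_head = nn.Linear(hidden, action_dim)
        # Learnable log-std, initialized to 0 (std = 1) but updated by the optimizer,
        # unlike torch.zeros_like(mu), which stays constant
        self.logstd = nn.Parameter(torch.zeros(action_dim))

    def forward(self, obs):
        h = self.body(obs)
        mu = self.mu_head(h)
        # Broadcast the shared logstd to match mu's batch shape
        logstd = self.logstd.expand_as(mu)
        return mu, logstd
```

(A state-dependent variant would instead compute logstd from its own output layer, e.g. a second `nn.Linear(hidden, action_dim)`.)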