SforAiDl / genrl

A PyTorch reinforcement learning library for generalizable and reproducible algorithm implementations with an aim to improve accessibility in RL
https://genrl.readthedocs.io
MIT License
403 stars 59 forks source link

CNN parameters not being updated for ACs #315

Closed hades-rp2010 closed 3 years ago

hades-rp2010 commented 4 years ago

The weights of the CNN layers dont get updated in the CNN AC classes. Formally, this is what I mean -

The CNN actor critic class -

class CNNActorCritic(BaseActorCritic):
    def __init__(args)
        super(CNNActorCritic, self).__init__()
        self.feature, output_size = cnn((framestack, 16, 32))
        self.actor = MlpPolicy(
            output_size, action_dim, policy_layers, discrete, **kwargs
        )
        self.critic = MlpValue(output_size, action_dim, val_type, value_layers)

    def get_action(
        self, state: torch.Tensor, deterministic: bool = False
    ) -> torch.Tensor:
        state = self.feature(state)
        state = state.view(state.size(0), -1)

        action_probs = self.actor(state)
        action_probs = nn.Softmax(dim=-1)(action_probs)
        # Som lines deleted
        return action, distribution

    def get_value(self, inp: torch.Tensor) -> torch.Tensor:
        inp = self.feature(inp)
        inp = inp.view(inp.size(0), -1)

        value = self.critic(inp).squeeze(-1)
        return value

Above, self.feature is the nn.Sequential object to be used (The CNN) But the optimizers in the agents -

self.optimizer_policy = opt.Adam(self.ac.actor.parameters(), lr=self.lr_policy)
self.optimizer_value = opt.Adam(self.ac.critic.parameters(), lr=self.lr_value)

The self.ac.actor.parameters() and self.ac.critic.parameters() are just the params of the actor head and the critic head. So what about self.ac.feature.parameters()? That does not seem to be updated anywhere

Sharad24 commented 4 years ago

I think feature params should be added to both. Not sure though

hades-rp2010 commented 4 years ago

Yeah, checked by printing out the weights. I'll add this too in my current PR. I'll close this for now?

Sharad24 commented 4 years ago

Raise a separate PR, if possible.

On 03-Sep-2020, at 8:00 PM, hades-rp2010 notifications@github.com wrote:

Yeah, checked by printing out the weights. I'll add this too in my current PR. I'll close this for now?

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/SforAiDl/genrl/issues/315#issuecomment-686531686, or unsubscribe https://github.com/notifications/unsubscribe-auth/AH72FJ66OIR6D7F2TQ7QTFLSD6R6RANCNFSM4QS3A3RQ.

hades-rp2010 commented 3 years ago

307 Takes care of this

Closing this for now