facebookresearch / Pearl

A Production-ready Reinforcement Learning AI Agent Library brought by the Applied Reinforcement Learning team at Meta.

What is the use of the attribute 'state' in the RecEnv from the single item recommender tutorial? #97

Closed · davera-017 closed this issue 3 months ago

davera-017 commented 3 months ago

The environment class in this tutorial has an attribute called state that tracks the user history but is apparently never used. Here is its initialization code:

class RecEnv(Environment):
    def __init__(
        self, actions: List[torch.Tensor], model: nn.Module, history_length: int
    ) -> None:
        ...
        # Buffer of the last `history_length` action embeddings (100-dim each), initialized to zeros.
        self.state: torch.Tensor = torch.zeros((self.history_length, 100)).to(device)
        ...

I was wondering if this is somehow used inside other classes (e.g. some History Summarization Module) or if it is not used at all.

rodrigodesalvobraz commented 3 months ago

Hi Daniel, why do you say it is never used? When I search for self.state, I see several uses (screenshot of search results omitted). The state is initialized with zeros and is used to form state_batch, which in turn determines the reward in

reward = self.model(state_batch, action_batch) * 3

It is also gradually updated at each step with the most recent action. After history_length steps, the zeros will have been overwritten and self.state will be a record of the last history_length actions taken. This is the information self.model uses to compute the reward. I hope that helps. Rodrigo
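
To make the mechanism concrete, below is a minimal sketch of what a step method along these lines could look like. This is not the tutorial's actual code: the action_index argument, the tensor shapes, and the unsqueeze calls are assumptions made here for illustration. It only shows the two points from the answer above, namely self.state feeding the reward model and the sliding-window update that gradually overwrites the initial zeros.

import torch

# Hypothetical sketch of a RecEnv.step method; shapes and argument names are assumptions.
def step(self, action_index: int) -> torch.Tensor:
    # Look up the chosen item's embedding (assumed to be a 100-dim tensor).
    action = self.actions[action_index]

    # Add a batch dimension of 1 before calling the reward model.
    state_batch = self.state.unsqueeze(0)    # shape (1, history_length, 100)
    action_batch = action.unsqueeze(0)       # shape (1, 100)

    # Reward comes from the learned user model, as in the line quoted above.
    reward = self.model(state_batch, action_batch) * 3

    # Slide the history window: drop the oldest action, append the newest one.
    # After history_length steps the initial zeros are fully overwritten.
    self.state = torch.cat([self.state[1:], action.unsqueeze(0)], dim=0)

    return reward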