Open rahul-zomato opened 1 year ago
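In recsim/agents/slate_decomp_q_agent.py, the target-network outputs for the next step are computed from the current states:
self._replay_next_target_net_outputs = self._network_adapter(self._replay.states, 'Target')
This should use the next states instead:
self._replay_next_target_net_outputs = self._network_adapter(self._replay.next_states, 'Target')
See https://github.com/google-research/recsim/blob/master/recsim/agents/slate_decomp_q_agent.py#L518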
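
To illustrate why this matters, here is a minimal sketch of a one-step Q-learning target in plain NumPy. It is not RecSim code: `TARGET_Q`, `target_net`, and `td_targets` are hypothetical stand-ins, and the plain max over actions is a simplification of the slate-decomposed target the agent actually uses. The point is only that bootstrapping from the current states instead of the next states gives different, incorrect targets.

```python
import numpy as np

# Hypothetical toy target network: a fixed Q-table indexed by [state, action].
TARGET_Q = np.array([
    [1.0, 2.0],  # state 0
    [5.0, 3.0],  # state 1
    [0.0, 4.0],  # state 2
])


def target_net(states):
    """Return target-network Q-values for a batch of integer state ids."""
    return TARGET_Q[states]


def td_targets(rewards, bootstrap_states, gamma=0.9):
    """One-step Q-learning targets: r + gamma * max_a Q_target(s, a)."""
    return rewards + gamma * target_net(bootstrap_states).max(axis=1)


# A toy replay batch of two transitions (state -> next_state, with reward).
states = np.array([0, 1])
next_states = np.array([1, 2])
rewards = np.array([1.0, 0.0])

correct = td_targets(rewards, next_states)  # bootstraps from next states
buggy = td_targets(rewards, states)         # bootstraps from current states

print("correct targets:", correct)
print("buggy targets:  ", buggy)
```

In this toy batch the two computations disagree on every transition, which is the kind of silent error the one-word change above would avoid.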
@cwhsu-google
Thanks @rahul-zomato, I was about to skip over this bug entirely.