xbpeng / awr

Implementation of advantage-weighted regression.
MIT License
176 stars 37 forks

Offline version of AWR #5

Open FineArtz opened 3 years ago

FineArtz commented 3 years ago

Hi, I am trying to modify AWR into an offline (fully off-policy) version. The paper states that one can simply treat the dataset as the replay buffer without any further modifications. However, I notice that if I remove the sampling in rl_agent.train (line 105 in rl_agent.py: `train_return, train_path_count, new_sample_count = self._rollout_train(self._samples_per_iter)`), then new_sample_count remains 0, so the number of update steps is also 0.

Could you point out the proper way to modify the code to obtain offline AWR?
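For reference, this is roughly the offline setup I have in mind (a sketch only; `replay_buffer.add_paths` and `agent.update` are placeholder names, not this repo's exact API):

```python
# Rough sketch of the offline training loop I am aiming for.
# add_paths and update are illustrative placeholders, not awr's actual methods.
def train_offline(agent, replay_buffer, dataset_paths, num_iters):
    # Load the fixed dataset into the replay buffer once; no further rollouts.
    replay_buffer.add_paths(dataset_paths)

    for it in range(num_iters):
        # With _rollout_train removed there is no environment interaction,
        # so new_sample_count stays 0 and the update-step count derived
        # from it collapses to 0 as well.
        agent.update(it, new_sample_count=0)
```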

xbpeng commented 3 years ago

You can just change the code so that the number of update steps does not depend on new_sample_count, e.g. by setting it to a constant: https://github.com/xbpeng/awr/blob/831442fb8d4c24bd200667cbc5e458c7657effc2/learning/awr_agent.py#L225
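A minimal sketch of that change (the names `_compute_update_steps` and `OFFLINE_UPDATE_STEPS` below are illustrative assumptions, not the exact identifiers in awr_agent.py):

```python
# Sketch only: replace the sample-count-dependent update schedule with a constant.
OFFLINE_UPDATE_STEPS = 500  # fixed number of gradient steps per training iteration (tune for your dataset)

def _compute_update_steps(new_sample_count):
    # Original behavior (roughly): the number of updates scales with newly
    # collected samples, so it becomes 0 when no rollouts are performed.
    # Offline variant: ignore new_sample_count and always return a constant.
    return OFFLINE_UPDATE_STEPS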