aviralkumar2907 / BEAR

Code for Stabilizing Off-Policy RL via Bootstrapping Error Reduction

KeyError: 'data_policy_mean' when running BEAR_IS #3

Closed · sweetice closed this 4 years ago

sweetice commented 4 years ago

```
Traceback (most recent call last):
  File "/home/hq/code/remotepycharmfolder/BEAR-master/main.py", line 209, in <module>
    pol_vals = policy.train(replay_buffer, iterations=int(args.eval_freq))
  File "/home/hq/code/remotepycharmfolder/BEAR-master/algos.py", line 655, in train
    state_np, next_state_np, action, reward, done, mask, data_mean, data_cov = replay_buffer.sample(batch_size, with_data_policy=True)
  File "/home/hq/code/remotepycharmfolder/BEAR-master/utils.py", line 40, in sample
    data_mean = self.storage['data_policy_mean'][ind]
KeyError: 'data_policy_mean'
```

When running the BEAR_IS algorithm, this error is raised. Note that your program doesn't save data_policy_mean and data_policy_logvar in the buffer :)

aviralkumar2907 commented 4 years ago

Yes, for the BEAR_IS runs we created a custom buffer that stores these. You can generate a buffer based on the instructions in https://github.com/rail-berkeley/d4rl and record the mean and logvar for it.
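
For anyone hitting this, here is a minimal sketch of what such a custom buffer could look like. The storage keys `data_policy_mean` and `data_policy_logvar` and the `sample(batch_size, with_data_policy=True)` call come from the traceback above; the class name, the `add` signature, and the exact return tuple are assumptions for illustration, not the repo's actual `utils.ReplayBuffer`:

```python
import numpy as np

class PolicyAwareReplayBuffer:
    """Hypothetical buffer that also records the behavior policy's
    per-transition Gaussian mean and log-variance, which BEAR_IS
    expects to find under 'data_policy_mean' / 'data_policy_logvar'."""

    def __init__(self):
        self.storage = {
            'state': [], 'next_state': [], 'action': [],
            'reward': [], 'done': [],
            'data_policy_mean': [], 'data_policy_logvar': [],
        }

    def add(self, state, next_state, action, reward, done,
            policy_mean, policy_logvar):
        # policy_mean / policy_logvar come from the Gaussian data
        # policy that generated this transition (e.g. a d4rl data
        # policy evaluated at `state`).
        self.storage['state'].append(state)
        self.storage['next_state'].append(next_state)
        self.storage['action'].append(action)
        self.storage['reward'].append(reward)
        self.storage['done'].append(done)
        self.storage['data_policy_mean'].append(policy_mean)
        self.storage['data_policy_logvar'].append(policy_logvar)

    def sample(self, batch_size, with_data_policy=False):
        # Uniformly sample transition indices, then gather each field.
        n = len(self.storage['state'])
        ind = np.random.randint(0, n, size=batch_size)
        batch = tuple(np.array(self.storage[k])[ind]
                      for k in ('state', 'next_state', 'action',
                                'reward', 'done'))
        if with_data_policy:
            data_mean = np.array(self.storage['data_policy_mean'])[ind]
            data_logvar = np.array(self.storage['data_policy_logvar'])[ind]
            return batch + (data_mean, data_logvar)
        return batch
```

The return tuple in `algos.py` also includes a `mask`, so the real buffer's `sample` signature differs; the point is only that the behavior-policy statistics must be appended at insertion time so the indexed lookup in `utils.py` line 40 can succeed.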

sweetice commented 4 years ago

Thanks for your reply!