Mehooz / BIRD_code

Code for paper "Bridging Imagination and Reality for Model-Based Deep Reinforcement Learning".

Keep getting Invalid one hot action error #1

Closed: nickuncaged1201 closed this 3 years ago

nickuncaged1201 commented 3 years ago

I tried a fresh environment with the specified requirements. Since I'm on Windows 10 and dm-control has issues installing there, I went with OpenAI Gym and the provided Atari options. But I can never get past the ValueError "Invalid one-hot action", which is raised in wrapper.py around lines 359 to 362. Could you explain why the one-hot encoded action needs to be np.allclose to the original action? I couldn't quite figure out its purpose.
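For context, the check being discussed resembles the one-hot action wrapper in Dreamer-style codebases. Here is a minimal sketch of that pattern; the class and attribute names are illustrative assumptions, not the actual BIRD wrapper.py:

```python
import numpy as np

class OneHotAction:
    """Sketch of a Dreamer-style one-hot action wrapper (names assumed)."""

    def __init__(self, env):
        self.env = env

    def step(self, action):
        # Rebuild a reference one-hot vector from the argmax of the input.
        index = int(np.argmax(action))
        reference = np.zeros_like(action)
        reference[index] = 1
        # If the input was not exactly one-hot, it differs from the rebuilt
        # reference and the wrapper refuses it.
        if not np.allclose(reference, action):
            raise ValueError(f"Invalid one-hot action:\n{action}")
        # The wrapped discrete env expects an integer action index.
        return self.env.step(index)
```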

I tried just commenting out this section and letting the code run, but after training for around 1 million steps I noticed something isn't right: the image loss and model loss are basically unchanged throughout training, and all entropy loss terms show infinity. I assume this is caused by the action handling, but I couldn't quite figure it out.

Mehooz commented 3 years ago

Hi, thanks for raising this. Since we didn't run our comparison on Atari, I have no experience launching the code there. The Atari code is there because we adapted our code from Dreamer's repo. In my opinion, np.allclose is just there to make sure the action is indeed a one-hot vector before it is used. It's worth noting that neither Dreamer nor BIRD focuses on discrete environments; DreamerV2 does, so you might try their code directly to get a more reasonable Atari baseline. Besides, I think 1M steps on Atari is far from enough to see anything in terms of performance.
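To illustrate why the comparison works (a minimal sketch, not taken from the repo): rebuilding a one-hot vector from the argmax and comparing with np.allclose rejects anything that is not exactly one-hot, such as raw action probabilities passed in by mistake.

```python
import numpy as np

def is_one_hot(action):
    # Only an exact one-hot vector equals its argmax reconstruction.
    reference = np.zeros_like(action)
    reference[np.argmax(action)] = 1
    return np.allclose(reference, action)

print(is_one_hot(np.array([0.0, 1.0, 0.0])))   # True: valid one-hot action
print(is_one_hot(np.array([0.2, 0.7, 0.1])))   # False: probabilities, not one-hot
```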