Kaixhin / Rainbow

Rainbow: Combining Improvements in Deep Reinforcement Learning
MIT License
1.56k stars 282 forks

disable env.reset() after every episode #66

Closed zyzhang1130 closed 4 years ago

zyzhang1130 commented 4 years ago

Hi, may I check: if I would like to keep the environment as it is after each training episode, should I just comment out line 147 in main.py, or should I also comment out line 130? Also, what should I do if I just want to reset the agent's position but keep the environment as it is after each training episode?

Thank you.

Kaixhin commented 4 years ago

Commenting out line 147 would prevent the environment from resetting during training. Commenting out line 130 would prevent the environment from resetting during the collection of data for validating Q-values. It seems that you might need to write a different environment and use a different set of functions if you want more control.
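As a rough illustration of what "a different environment" could look like (this is a toy sketch, not code from this repo — the class, method names, and grid layout are all assumptions), one could expose a separate partial-reset method that repositions the agent while leaving the rest of the world state untouched:

```python
class GridEnv:
    """Toy grid world: the world layout persists across episodes;
    only the agent can be reset. Illustrative only."""

    def __init__(self, size=5):
        self.size = size
        self.goal = (size - 1, size - 1)  # persistent world state
        self.agent = (0, 0)

    def reset(self):
        """Full reset: rebuild the world AND reposition the agent."""
        self.goal = (self.size - 1, self.size - 1)
        self.agent = (0, 0)
        return self.agent

    def reset_agent(self):
        """Partial reset: move the agent home, keep the world as-is."""
        self.agent = (0, 0)
        return self.agent

    def step(self, action):
        # Actions 0-3: up, down, right, left; clamped to the grid.
        x, y = self.agent
        dx, dy = [(0, 1), (0, -1), (1, 0), (-1, 0)][action]
        self.agent = (min(max(x + dx, 0), self.size - 1),
                      min(max(y + dy, 0), self.size - 1))
        done = self.agent == self.goal
        return self.agent, float(done), done
```

At the end of each training episode you would then call `env.reset_agent()` where main.py currently calls `env.reset()`.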

zyzhang1130 commented 4 years ago

In general, validating Q-values and training should be consistent, right? (If one is reset, the other should be too.)

Kaixhin commented 4 years ago

Yes it makes sense to keep them consistent.

zyzhang1130 commented 4 years ago

When I commented out the reset at line 130 as well, it gave me this error: `AttributeError: 'NoneType' object has no attribute 'metadata'`

when it ran line 132: `next_state, _, done = env.step(np.random.randint(0, action_space))`

Does this line just let the agent make a random movement, and what could be the possible reasons for the error?

Kaixhin commented 4 years ago

That line uses random actions to collect data for validating Q-values. I'm not sure why your edit is causing the error, so you will need to try and debug it yourself.
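One plausible cause (an assumption based on the error message, not confirmed from a traceback): the environment's internal backend is only created inside `reset()`, so calling `step()` before any reset dereferences a `None` attribute. A minimal, self-contained reproduction of that failure mode (toy class, not this repo's `Env`):

```python
class LazyEnv:
    """Toy env whose backend object is only created by reset().
    Illustrative only -- not the actual environment in this repo."""

    def __init__(self):
        self.backend = None  # created lazily in reset()

    def reset(self):
        # Build the backend; a dummy object with a metadata attribute.
        self.backend = type("Backend", (), {"metadata": {}})()
        return 0

    def step(self, action):
        # Fails with AttributeError if reset() was never called,
        # because self.backend is still None here.
        return self.backend.metadata, 0.0, False


env = LazyEnv()
try:
    env.step(0)
except AttributeError as e:
    print(e)  # 'NoneType' object has no attribute 'metadata'
```

If this is what is happening, the fix would be to keep at least one initial `env.reset()` before the collection loop, even if per-episode resets are removed.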