Kaixhin / Rainbow

Rainbow: Combining Improvements in Deep Reinforcement Learning
MIT License

Noob Question: Need help running the code. It seems to run forever. #61

Closed YousufAzadSami closed 4 years ago

YousufAzadSami commented 4 years ago

Hello good people!

I didn't know where else to post, so I am posting here.

Background: First of all, I am out of my element here. I am just learning about RL, and I got a job working on it. The task is more code-oriented, but I need some of the concepts as well. I decided to throw myself in the water to break my stagnation. I am struggling a bit, but that was the idea. I would like to understand the concepts by myself eventually, but for the job I need to press on right now. I hope you can help me here.

Issue: When I run it with the default arguments, it just keeps running. I think by default it is set to run 50 million episodes (T-max = 50e6). I want to complete one successful run before I start playing with it, so that I have an idea of what the result is supposed to look like. Should I just change the T-max argument? There are about 20 more arguments, and I am not sure whether changing it affects the others. For example, I think target-update and learn-start are related to it. Since my concepts are not so clear, I could use some help here.

I hope I was clear; if not, please ask me here.

Kaixhin commented 4 years ago

If you want to get some quick results, you can run this code with the arguments provided in the README for "data-efficient Rainbow". T-max is the number of steps (not episodes) in the environment.
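To make the steps-versus-episodes distinction concrete, here is a minimal toy sketch, not code from this repository. It assumes a training loop that counts environment steps up to a budget, starts learning once a `learn_start` threshold is reached, and refreshes a target network every `target_update` steps. The function name `simulate_step_budget`, the made-up episode lengths (50 to 200 steps), and the numbers in the call are all hypothetical and chosen only for illustration.

```python
import random


def simulate_step_budget(t_max, learn_start, target_update, seed=0):
    """Toy loop: count episodes, learning steps and target updates in a step budget."""
    rng = random.Random(seed)
    episodes = 0
    learn_steps = 0
    target_updates = 0
    steps_left_in_episode = 0

    for t in range(1, t_max + 1):  # the budget is counted in environment steps
        if steps_left_in_episode == 0:
            # Start a new episode with a made-up length (hypothetical range).
            steps_left_in_episode = rng.randint(50, 200)
            episodes += 1
        steps_left_in_episode -= 1

        if t >= learn_start:  # no learning before this many steps
            learn_steps += 1
            if t % target_update == 0:  # periodic target-network refresh
                target_updates += 1

    return {
        "steps": t_max,
        "episodes": episodes,
        "learn_steps": learn_steps,
        "target_updates": target_updates,
    }


summary = simulate_step_budget(t_max=10_000, learn_start=1_600, target_update=2_000)
print(summary)
```

In this sketch the number of episodes depends on how long each episode happens to last, while the step budget stays fixed. That is also why, under these assumptions, shrinking the budget without shrinking `learn_start` and `target_update` along with it can leave very few learning steps or target updates in a run.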

If you are just learning about RL, this is not the right codebase/algorithm for you to be working with - it involves the combination of several research papers and assumes familiarity with them. There's plenty of other code out there centred around teaching RL, such as OpenAI's Spinning Up in Deep RL.