Kaixhin / Rainbow

Rainbow: Combining Improvements in Deep Reinforcement Learning
MIT License
1.56k stars 282 forks source link

Use numpy arrays for speed #35

Closed Kaixhin closed 5 years ago

Kaixhin commented 5 years ago

@ahundt I noticed that you made some adjustments that apparently improve the speed of this code. Would you be willing to contribute these back? I had some questions for memory.py:

ahundt commented 5 years ago

I did start trying a few things. Actually I looked around and you'd probably be even better off simply directly importing the openai baselines priority queue, or modeling your code off of it as they don't require TF. It stores all the data in several separate and contiguous numpy arrays which will likely be much, much faster than the tuples you currently use.

Here is the openai baselines segment tree part: https://github.com/openai/baselines/blob/master/baselines/common/segment_tree.py

openai baselines prioritized replay buffer: https://github.com/openai/baselines/blob/7859f603cd4df3b647318f2f6e3e68555ea8d4d8/baselines/deepq/replay_buffer.py#L71

If you move my commits over would you mind doing a git "cherry-pick" so I can have the commit credit for that specific change? You're almost certainly right about the type problems, I just started tweaking things to learn. The array type changes make runtime better, but I ran into some weirdness that may be due to the int -> float discrepancy. No other changes have proven useful at this point.

ahundt commented 5 years ago

The only way to get final answers for your 3 questions is to run a profiler before/after each one, but I'd guess that numpy has a very good chance of being faster, but I wouldn't be surprised if the baselines version is another 10x faster because of the contiguous arrays plus they avoid recursion which is expensive. Though they might have room for improvement too. :-)

Kaixhin commented 5 years ago

Didn't know about git cherry-pick - thanks! Have now picked the relevant commits from your repo, then reverted a few things like the max above.

Good call on the performance of Baselines. This repo is more about readability than performance, but the contiguous arrays should be easy enough to add with only a tiny bit more complexity. However, I will leave that for another time (or someone else).