rlworkgroup / garage

A toolkit for reproducible reinforcement learning research.
MIT License

Support Both GPU and CPU With The Ray Sampler #2230

Open avnishn opened 3 years ago

avnishn commented 3 years ago

I was looking into using the Ray sampler with the GPU again this week because of its potential use with the evaluation samplers/meta evaluator.

To recap, this is what makes it tricky in the current Ray sampler:

- Sampler workers are represented by Ray actors.
- We give a slice of the GPU to each Ray actor, and the actor holds onto that slice until it dies.

The problem: sampler workers aren't shut down until the end of training, so they would hold their GPU slices for the entire run, meaning that portion of the GPU can't be used for training.

The solution: Ray allows individual functions to be run asynchronously on the GPU as "short-lived actors". Remote functions release the GPU once they are done executing. We could write a sample worker that uses Ray but declares only its obtain-samples function with the `ray.remote` decorator. That way we would have GPU workers that don't hold GPU resources unless they are sampling.

This would require writing a new worker, but the design can be used for both the CPU Ray sampler and the GPU Ray sampler.

ray with gpus documentation