tensortrade-org / tensortrade

An open source reinforcement learning framework for training, evaluating, and deploying robust trading agents.

Feature Request: TensorForce Multi-GPU or Distributed Training #26

Closed forhonourlx closed 3 years ago

forhonourlx commented 5 years ago

Hi Adam,

This is great work. Does TensorTrade support multi-GPU or distributed training? Are there any plans to implement it?

Thanks.

forhonourlx commented 5 years ago

Hi Adam,

I noticed there are some distributed RL frameworks, such as: https://github.com/facebookresearch/torchbeast https://github.com/deepmind/scalable_agent

Are there any plans to support scaled-out training? That would break through the current limits.

notadamking commented 5 years ago

@forhonourlx Yes, though no distributed frameworks or multi-GPU agents have been implemented yet. Is this something you would be interested in contributing?

123mitnik commented 4 years ago

As mentioned in the TT Discord channel by @MetalMind, "JP Morgan is making use of Ray."

This is confirmed in their paper: https://arxiv.org/pdf/1811.09549.pdf

Authors: Vangelis Bacoyannis, Vacslav Glukhov, Tom Jin, Jonathan Kochems, and Doo Re Song @jpmorgan

"We found Ray RLlib useful. It is built from the ground up with distributed reinforcement learning in mind. Its foundation rests on a solid infrastructure which leverages task parallel and actor model [Agha and Hewitt, 1987] programming patterns, i.e. programming paradigms which have proven to be very successful in designing efficient, large scale distributed computing systems [Armstrong, 2010]. [...] Ray's design [Moritz et al., 2017] also addresses fault-tolerance. In general, versatile and efficient tools to improve productivity, such as easy-to-use and low-overhead monitoring and profiling of RL training are must haves.

From a computational performance viewpoint, another challenge for RL algorithms is choosing appropriate implementations for a task based on the available compute resources in order to ensure the fastest global convergence of an algorithm. Making use of resources such as multi-core CPUs, GPUs, and TPUs optimally is challenging. Ray partially addresses this through its resource aware scheduler. It allows the user to state resource requirements, such as the number of CPUs, GPUs, or custom resources, as code annotations. This can be used to fine tune the computational performance of tasks at a high-level without the need for the user to understand or intervene in the task scheduling."
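To make the resource-annotation point concrete, here is a minimal sketch of how Ray tasks declare CPU/GPU requirements. The `collect_rollout` function is a made-up placeholder, not anything from the paper or from TensorTrade; only `ray.remote` and its `num_cpus`/`num_gpus` arguments are actual Ray API:

```python
import ray

ray.init()  # start (or connect to) a Ray runtime

# Hypothetical rollout task: the decorator tells Ray's resource-aware
# scheduler to place this task only where 2 CPUs and 1 GPU can be reserved
# (set num_gpus=0 if no GPU is available).
@ray.remote(num_cpus=2, num_gpus=1)
def collect_rollout(policy_weights, num_steps):
    # ... run the trading environment for num_steps and return a batch ...
    return {"steps": num_steps}

# Launch several rollouts in parallel; Ray schedules them across the cluster.
futures = [collect_rollout.remote(None, 1000) for _ in range(4)]
results = ray.get(futures)
```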

avacaondata commented 4 years ago

I have some experience with Ray for reinforcement learning, and I think it's probably the best way to go. I'm interested in helping with this: I'll start using TensorTrade and try to implement distributed training through Ray. If I manage to do it, I'll send a PR, provided you're still interested in this feature. I think it would be useful for all of us, since training sequentially diminishes the power of RL (not only in terms of time, but especially in terms of performance).
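For anyone exploring the same route, here is a rough sketch of what wiring a Gym-style trading environment into RLlib could look like. The `TradingEnv` class, its spaces, and the config values are placeholders (not TensorTrade code), and the exact RLlib API varies between Ray versions:

```python
import gym
import ray
from ray import tune
from ray.tune.registry import register_env

# Placeholder environment: in practice this would wrap a TensorTrade
# trading environment so it exposes the standard gym.Env interface.
class TradingEnv(gym.Env):
    def __init__(self, env_config):
        self.action_space = gym.spaces.Discrete(3)  # e.g. buy / hold / sell
        self.observation_space = gym.spaces.Box(low=-1.0, high=1.0, shape=(10,))

    def reset(self):
        return self.observation_space.sample()

    def step(self, action):
        # Dummy transition: random observation, zero reward, never done.
        return self.observation_space.sample(), 0.0, False, {}

ray.init()
register_env("TradingEnv-v0", lambda config: TradingEnv(config))

# RLlib handles the distributed part: rollout workers run in parallel
# processes (or on other machines), while the learner can use GPUs.
tune.run(
    "PPO",
    stop={"training_iteration": 10},
    config={
        "env": "TradingEnv-v0",
        "num_workers": 4,  # parallel rollout workers
        "num_gpus": 1,     # GPUs for the learner (set to 0 if none available)
    },
)
```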