[REQUEST] Support for Recurrent Neural Network variant of RL (DDPG/TD3) - POMDP

itsmesaisatish commented 2 years ago

Hi @takuseno, thanks for this amazing repo. It's really helpful, and I really appreciate your efforts.

I can see that the key algorithms for discrete and continuous action spaces are already covered, and I see a lot of scope for enhancements.

Is there any plan to support a recurrent neural network (LSTM) variant of RL algorithms such as DDPG/TD3? These really help when the agent needs to remember past information that was only temporarily available in order to decide on future actions, as in a partially observable Markov decision process (POMDP).
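To make the motivation concrete, here is a toy sketch in plain Python (not related to d3rlpy's API; every name in it is made up for illustration). A cue is visible only at the first step, and the reward depends on whether the final action matches it. A policy that sees only the current observation has to guess, while a policy that carries memory between steps can solve the task.

```python
import random

HIDDEN = -1  # hypothetical marker meaning "nothing is visible at this step"


def run_episode(policy, rng):
    """Toy POMDP: the cue (0 or 1) is observable only at step 0.

    Returns 1 if the action taken at the final step equals the cue, else 0.
    """
    cue = rng.randint(0, 1)
    observations = [cue, HIDDEN, HIDDEN]
    memory = None
    action = 0
    for obs in observations:
        action, memory = policy(obs, memory)
    return 1 if action == cue else 0


def memoryless_policy(obs, memory):
    # Acts on the current observation only; when nothing is visible it
    # falls back to a fixed guess (action 0).
    return (obs if obs != HIDDEN else 0), None


def recurrent_policy(obs, memory):
    # Stores the last visible observation and keeps acting on it,
    # standing in for what an LSTM hidden state could learn to do.
    if obs != HIDDEN:
        memory = obs
    return memory, memory


def success_rate(policy, episodes=200, seed=0):
    rng = random.Random(seed)
    return sum(run_episode(policy, rng) for _ in range(episodes)) / episodes
```

In this sketch the recurrent policy succeeds in every episode, while the memoryless one succeeds only when the cue happens to match its fixed guess.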

Below are a few interesting papers from Google DeepMind and Northeastern University along similar lines.

a. Google DeepMind (https://rll.berkeley.edu/deeprlworkshop/papers/rdpg.pdf), and b. Northeastern University (https://arxiv.org/pdf/2110.12628.pdf)

Regards, Sai

takuseno commented 2 years ago

@itsmesaisatish Hi, thanks for the issue. I'm working on refactoring the data sampling logic to support Transformer architectures. Once we have that, RNN architectures could also be supported. Here is a relevant past issue: https://github.com/takuseno/d3rlpy/issues/214 .
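For context, sequence models such as Transformers and RNNs usually train on contiguous windows of steps from a single episode rather than on independent transitions, which is why the sampling logic matters. The following is a rough sketch of that idea in plain Python. It is not d3rlpy's actual implementation; `sample_window`, the left-padding scheme, and the mask format are all assumptions made for illustration.

```python
import random


def sample_window(episodes, length, rng, pad=None):
    """Sample up to `length` consecutive steps from one randomly chosen episode.

    Windows that start before the beginning of the episode are left-padded
    with `pad`. The returned mask holds 1 for real steps and 0 for padding.
    Assumes every episode contains at least one step.
    """
    episode = rng.choice(episodes)
    end = rng.randrange(1, len(episode) + 1)  # exclusive end index
    start = max(0, end - length)
    window = list(episode[start:end])
    n_pad = length - len(window)
    mask = [0] * n_pad + [1] * len(window)
    return [pad] * n_pad + window, mask


# Usage: each "step" is just an integer here; in practice it would be a
# record holding the observation, action and reward.
episodes = [[0, 1, 2, 3, 4], [10, 11]]
window, mask = sample_window(episodes, length=3, rng=random.Random(0))
```

Sampling by end index means every step of every episode can appear as the last element of a window, so short histories at the start of an episode are also covered.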

itsmesaisatish commented 2 years ago

Thanks for the update
