Custom networks like these would be nice to experiment with, but they won't be considered before the first releases. This sounds like a good addition to a contrib repo we have been considering.
I totally agree, the place for such a network is the contrib repo ;)
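For anyone who wants to experiment before anything lands in contrib: below is a minimal, untested sketch of how a transformer-style network could be plugged into SB3 through the custom features extractor mechanism (`BaseFeaturesExtractor` and `policy_kwargs` are real SB3 API; the `TransformerExtractor` class, its hyperparameters, and the choice of env are my own placeholders). Note it only encodes a single observation per step as a length-1 sequence; a real implementation would attend over a history of observations (e.g. via frame stacking), which is the hard part here.

```python
import torch as th
import torch.nn as nn

from stable_baselines3 import PPO
from stable_baselines3.common.torch_layers import BaseFeaturesExtractor


class TransformerExtractor(BaseFeaturesExtractor):
    # Hypothetical extractor: embeds a flat observation and runs it
    # through a small transformer encoder.
    def __init__(self, observation_space, features_dim: int = 64):
        super().__init__(observation_space, features_dim)
        obs_dim = observation_space.shape[0]
        self.embed = nn.Linear(obs_dim, features_dim)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=features_dim, nhead=4, batch_first=True  # batch_first needs torch >= 1.9
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)

    def forward(self, observations: th.Tensor) -> th.Tensor:
        # (batch, obs_dim) -> (batch, 1, features_dim): a length-1 "sequence"
        x = self.embed(observations).unsqueeze(1)
        return self.encoder(x).squeeze(1)


model = PPO(
    "MlpPolicy",
    "Pendulum-v1",
    policy_kwargs=dict(
        features_extractor_class=TransformerExtractor,
        features_extractor_kwargs=dict(features_dim=64),
    ),
    verbose=1,
)
model.learn(total_timesteps=10_000)
```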
Would implementations of other not-so-mainstream algorithms (such as MPO or AWAC) also go to the contrib repo?
MPO would be a good fit (and it is quite complex). @Miffyli is currently writing a contribution guide for the contrib repo, so we can keep it clean and functional.
For AWAC, I plan to write a wrapper around the https://github.com/takuseno/d3rlpy repo (which has a nice interface and other offline RL implementations), see https://github.com/takuseno/d3rlpy/issues/5
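If it helps, here is a rough sketch of what training AWAC through d3rlpy could look like (this assumes the d3rlpy v1-era API with `MDPDataset` and `d3rlpy.algos.AWAC`; the random placeholder data is obviously not a real offline dataset, and newer d3rlpy releases have changed this interface):

```python
import numpy as np
from d3rlpy.algos import AWAC
from d3rlpy.dataset import MDPDataset

# Placeholder transitions standing in for a real logged dataset
observations = np.random.random((1000, 3)).astype(np.float32)
actions = np.random.random((1000, 1)).astype(np.float32)
rewards = np.random.random(1000).astype(np.float32)
terminals = np.zeros(1000, dtype=np.float32)
terminals[99::100] = 1.0  # episode boundary every 100 steps

dataset = MDPDataset(observations, actions, rewards, terminals)

# Offline (pre)training with AWAC
awac = AWAC()
awac.fit(dataset, n_epochs=10)

# The trained policy can then be queried like any d3rlpy algo
action = awac.predict(observations[:1])
```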
Closing this issue as the contrib repo is now live: https://github.com/Stable-Baselines-Team/stable-baselines3-contrib (I also created an issue for MPO there)
A little explanation of what a transformer is: https://en.wikipedia.org/wiki/Transformer_(machine_learning_model)
You can find an example in RLlib:
https://docs.ray.io/en/latest/rllib-models.html#attention-networks
https://github.com/ray-project/ray/blob/master/rllib/examples/attention_net.py
The paper about it ("Stabilizing Transformers for Reinforcement Learning", GTrXL): https://arxiv.org/pdf/1910.06764.pdf
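For reference, enabling the attention net in RLlib is mostly a model-config switch (this sketch uses the ray 1.x-era `PPOTrainer` API and the attention keys from the linked example; newer Ray versions have moved to config objects, so treat it as an illustration):

```python
import ray
from ray.rllib.agents.ppo import PPOTrainer

ray.init()

config = {
    "env": "CartPole-v1",
    "framework": "torch",
    "model": {
        # Wraps the default model with a GTrXL-style attention net
        "use_attention": True,
        "attention_num_transformer_units": 1,
        "attention_dim": 64,
        "attention_memory_inference": 50,
        "attention_memory_training": 50,
    },
}

trainer = PPOTrainer(config=config)
for _ in range(3):
    print(trainer.train()["episode_reward_mean"])
```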
I would like to help you implement this, but I don't have enough knowledge to help yet.
What I know about it: it is used mostly in NLP, but some researchers have started to apply it to other fields such as RL. Some articles say it performs better than LSTMs.