pytorch / rl

A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
https://pytorch.org/rl

[Feature Request] Prebuilt models #1425

Open vmoens opened 1 year ago

vmoens commented 1 year ago

Motivation

It can be unnecessarily tedious to write actor (and value) models. A possible solution would be to have pre-built TensorDictSequential subclasses. Here are a few examples:

from torchrl.modules import Architectures, CategoricalActor, DQNActor, TanhActor

# builds an actor with a TanhNormal distribution bounded in [-1, 1]
actor = TanhActor(
    out_features=4,
    backbone=Architectures.MLP,
)
# builds an actor with a Categorical distribution over 4 categories
actor = CategoricalActor(
    out_features=4,
    backbone=Architectures.MLP,
)
# builds an actor with a Categorical distribution over 4 categories, for pixel inputs
actor = CategoricalActor(
    out_features=4,
    backbone=Architectures.CNN,
)
# builds an actor with a Categorical distribution over 4 categories, with a default LSTM backbone
actor = CategoricalActor(
    out_features=4,
    backbone=Architectures.LSTM,
)
# builds a DQN actor for 4 actions
actor = DQNActor(
    out_features=4,
    backbone=Architectures.MLP,
)

The "in-features" will be determined dynamically using lazy modules.

Thoughts?

cc @matteobettini @albertbou92 @BY571 @smorad

albertbou92 commented 1 year ago

I agree that building the models can be a bit tedious for new users. They are probably the most difficult modules to create until you get the hang of it.

Much like having the trainer, having pre-built architectures can help lower the entry barrier. They can also be handy for quickly putting together examples. However, I'm not sure that trying to cover every possible architecture is necessary.

For more advanced users, I think defining the architectures manually is fine.

smorad commented 1 year ago

I personally would like to build them manually, so I can make sure I understand the full pipeline of key_a => key_b => key_c => ...

I think this shortcut is fine as long as it is just chaining modules together without any custom logic, i.e. DQNActor is equivalent to Sequence(TDModule(MLP, in_keys=["a"], out_keys=["b"]), QValueModule(in_keys=["b"])). I would be less happy if there were custom forward logic, because I would worry about losing the ability to chain things together in a nonstandard manner.
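Spelled out with the current TorchRL primitives (a rough sketch, assuming default keys and a one-hot action space), that chaining looks like this:

import torch
from tensordict.nn import TensorDictModule, TensorDictSequential
from torchrl.modules import MLP, QValueModule

# a plain MLP wrapped so it reads/writes tensordict keys, followed by a
# QValueModule that derives the greedy action from the action values
value_net = TensorDictModule(
    MLP(out_features=4, num_cells=[64, 64]),
    in_keys=["observation"],
    out_keys=["action_value"],
)
dqn_actor = TensorDictSequential(value_net, QValueModule(action_space="one_hot"))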

Also, is there a lazy init for the LSTMModule? I'm not sure there is, which would mean you would need to pass the input size. Furthermore, how would you specify the hidden sizes or activations for the MLP/CNN/etc.?
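For the MLP case at least, those knobs already exist on torchrl's MLP; only the input size is lazily inferred. A small illustration:

from torch import nn
from torchrl.modules import MLP

# hidden sizes and activation are constructor arguments; in_features is
# omitted, so a lazy first layer infers it on the first forward pass
mlp = MLP(out_features=4, num_cells=[256, 256], activation_class=nn.Tanh)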

vmoens commented 1 year ago

The modules would still be configurable through some config. One way I like to do things is:

config = TanhActor.get_default_config(Architectures.LSTM)  # gives you a dict or similar with all the default values
actor = TanhActor(config, backbone=Architectures.LSTM)
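A purely hypothetical sketch of what that pattern could look like; none of these names exist in torchrl today, they just mirror the snippet above:

from enum import Enum

# hypothetical stand-ins for the proposed API
class Architectures(Enum):
    MLP = "mlp"
    LSTM = "lstm"

class TanhActor:
    @classmethod
    def get_default_config(cls, backbone: Architectures) -> dict:
        # an editable dict of defaults for the chosen backbone
        if backbone is Architectures.LSTM:
            return {"hidden_size": 256, "num_layers": 1}
        return {"num_cells": [256, 256]}

    def __init__(self, config: dict, backbone: Architectures):
        self.config = config
        self.backbone = backbone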

The context is that many users who are less experienced than you want to build their models quickly, "as anyone else would". A sort of batteries-included approach.

What I'm trying to see is how we can have a lib that is both modular for RL researchers like you and offers some very high-level entry points for those who only want to use prebuilt RL solutions for their problem.

matteobettini commented 1 year ago

I overall agree that it's important to have these pre-built models, just like there are trainers. My points, which I think are similar to what has already been brought up, are:

smorad commented 1 year ago

I'm also not totally convinced this is strictly necessary for new users. If we look at the torch library, there is not even an MLP class. I think having really good examples and community support would go a long way. Torch also has a lot of boilerplate, but I think it aids readability and forces users to gain some understanding.