Open vmoens opened 1 year ago
I agree that building the models can be a bit tedious for new users. Probably the most difficult module to create until you get the hang of it.
Much like having the trainer, having pre-built architectures can help lower the entry barrier. It is also handy for quickly putting together examples. However, I'm not sure that trying to cover all possible architectures is necessary.
For more advanced users I think defining the architectures is fine.
I personally would like to do them manually, so I can make sure I understand the full pipeline of key_a => key_b => key_c => ...
I think this shortcut is fine as long as it is just chaining modules together without any custom logic, i.e. `DQNActor` is equivalent to `Sequence(TDModule(MLP, in_keys=["a"], out_keys=["b"]), QValueModule(in_keys=["b"]))`. I would be less happy if there were custom forward logic, because I would worry about losing the ability to chain things together in a nonstandard manner.
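The equivalence above can be sketched without any custom forward logic, using only core torch. This is a minimal stand-in (a plain dict instead of a TensorDict, and `KeyedModule`/`sequence` are hypothetical helpers, not library APIs) to show that a DQN-style actor is just modules chained by matching keys:

```python
import torch
import torch.nn as nn

class KeyedModule(nn.Module):
    """Wraps a module so it reads/writes named keys in a dict (TDModule stand-in)."""
    def __init__(self, module, in_key, out_key):
        super().__init__()
        self.module = module
        self.in_key, self.out_key = in_key, out_key

    def forward(self, data):
        data[self.out_key] = self.module(data[self.in_key])
        return data

def sequence(*mods):
    """Chain keyed modules in order (Sequence stand-in)."""
    def run(data):
        for m in mods:
            data = m(data)
        return data
    return run

# MLP writes Q-values under "b"; the Q-value step reads "b" and picks an action.
mlp = KeyedModule(nn.Linear(4, 2), in_key="a", out_key="b")
qvalue = KeyedModule(lambda q: q.argmax(-1), in_key="b", out_key="action")
actor = sequence(mlp, qvalue)

out = actor({"a": torch.randn(3, 4)})
print(out["action"].shape)  # torch.Size([3])
```

Because every step only maps input keys to output keys, any stage can be swapped or re-chained in a nonstandard order, which is the property being defended above.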
Also, is there a lazy init for the `LSTMModule`? I'm not sure there is, which would mean you need to pass the input size. Furthermore, how would you specify the hidden sizes or activations for the MLP/CNN/etc.?
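For context on the lazy-init concern: in core torch, linear and conv layers have lazy variants, but (as of recent releases) there is no lazy LSTM, so `nn.LSTM` requires the input size up front:

```python
import torch
import torch.nn as nn

# nn.LSTM has no lazy variant in core torch, so input_size must be declared.
lstm = nn.LSTM(input_size=4, hidden_size=8, batch_first=True)

x = torch.randn(2, 5, 4)   # (batch, time, features); features must match input_size
y, _ = lstm(x)
print(y.shape)             # torch.Size([2, 5, 8])
```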
The modules would still be configurable through some config. One way I like to do things is:

```python
config = TanhActor.get_default_config(Architectures.LSTM)  # gives you a dict or similar with all the default values
actor = TanhActor(config, backbone=Architectures.LSTM)
```
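A hypothetical sketch of that default-config pattern, using only the stdlib (the `TanhActor`/`get_default_config` names mirror the proposal but are illustrative, not a real API, and the backbone argument is dropped for brevity):

```python
from dataclasses import dataclass, asdict

@dataclass
class MLPConfig:
    # Illustrative defaults; not taken from any real library.
    hidden_sizes: tuple = (64, 64)
    activation: str = "tanh"

class TanhActor:
    @staticmethod
    def get_default_config():
        # Hand the user a plain dict with every default spelled out.
        return asdict(MLPConfig())

    def __init__(self, config):
        self.config = config

config = TanhActor.get_default_config()
config["hidden_sizes"] = (128, 128)  # tweak before building
actor = TanhActor(config)
print(actor.config["activation"])    # tanh
```

The appeal of the pattern is that all defaults are discoverable in one place, while advanced users can still edit the dict before construction.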
The context is that many users who are less experienced than you want to build their models quickly, "as anyone else would": a sort of batteries-included approach.
What I'm trying to see is how we can have a lib that is both modular for RL researchers like you and offers some very high-level entry points for those who only want to use pre-built RL solutions for their problem.
I overall agree that it's important to have these pre-built models, just like there are trainers. My points, which I think are similar to what has already been brought up, are:
I'm also not totally convinced this is strictly necessary for new users. If we look at the torch library, there is not even an `MLP` class. I think having really good examples and community support would go a long way. Torch also has a lot of boilerplate, but I think it aids readability and forces users to gain some understanding.
Motivation
It can be unnecessarily tedious to write actor (and value) models. A possible solution would be to have pre-built `TensorDictSequence` subclasses. Here are a few examples:
The "in-features" will be determined dynamically using lazy modules.
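The dynamic in-features mechanism can be illustrated with core torch's lazy modules, where the input width is inferred from the first batch rather than declared (a minimal sketch, not the proposed classes themselves):

```python
import torch
import torch.nn as nn

# LazyLinear defers in_features until the first forward pass.
net = nn.Sequential(nn.LazyLinear(64), nn.ReLU(), nn.LazyLinear(4))

x = torch.randn(8, 17)       # the input width (17) is never declared above
out = net(x)
print(out.shape)             # torch.Size([8, 4])
print(net[0].in_features)    # 17, inferred from the first batch
```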
Thoughts?
cc @matteobettini @albertbou92 @BY571 @smorad