Don't merge this branch quite yet - wanted to get some feedback on some policy refactoring I did. Big changes:
Changed the num_layers and hidden_size args to a single tuple (like in Pedro's old deep-rl code), as I figured we don't really use these as hyperparameters; I probably shouldn't have changed this in the first place (oops).
Renamed GaussianMLP_Actor to Gaussian_FF_Actor
Renamed FF_Critic to FF_Q, and GaussianMLP_Critic to FF_V.
Merged FF_Actor and Scaled_FF_Actor, as they are functionally identical given the max_action arg.
Added LSTM_Q and LSTM_V to critic.py
If any of these changes present a problem, let me know. I'm also open to changing the renaming scheme for GaussianMLP_X.
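For anyone reviewing, here is a rough, framework-free sketch of what the tuple arg and the merged actor amount to. The names `layer_shapes`, `scale_action`, and `layers` are hypothetical and not taken from the branch, and the tanh squashing is an assumption about how the output gets bounded; the real classes may differ.

```python
import math
from typing import List, Sequence, Tuple


def layer_shapes(input_dim: int, layers: Sequence[int], output_dim: int) -> List[Tuple[int, int]]:
    """Return (in_features, out_features) for each linear layer.

    `layers` stands in for the tuple that replaces num_layers/hidden_size
    (hypothetical name): one entry per hidden layer, giving its width.
    """
    dims = [input_dim, *layers, output_dim]
    # Pair each dimension with the next one to get per-layer shapes.
    return list(zip(dims[:-1], dims[1:]))


def scale_action(pre_activation: float, max_action: float = 1.0) -> float:
    """Squash to [-1, 1] with tanh (assumed), then scale by max_action.

    With the default max_action=1.0 this is just tanh, which is why a
    separate scaled actor class is not needed in this sketch.
    """
    return max_action * math.tanh(pre_activation)
```

With this layout, the old num_layers=2, hidden_size=256 would be written as `layers=(256, 256)`, and uneven widths such as `(400, 300)` become possible without extra args.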