In most isaacgymenvs, CNN = Identity()
An LSTM can help in many cases, but it can also hurt training stability. We therefore add a skip connection around the LSTM, enabled with concat_output: True, so that the LSTM can be bypassed when learning behavior that has no temporal dependencies.
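A minimal PyTorch sketch of what a concatenating skip connection around an LSTM could look like. The class name SkipLSTM and its constructor arguments are hypothetical and are not taken from RL_games; only the concat_output flag name comes from the text above, and concatenating along the feature dimension is an assumption about how the skip is implemented.

```python
import torch
import torch.nn as nn


class SkipLSTM(nn.Module):
    """Hypothetical LSTM wrapper with an optional concatenating skip path."""

    def __init__(self, input_size: int, hidden_size: int, concat_output: bool = True):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.concat_output = concat_output
        # Feature width seen by whatever follows (an MLP or the value/policy heads).
        self.output_size = hidden_size + input_size if concat_output else hidden_size

    def forward(self, x, state=None):
        # x has shape (batch, time, input_size).
        out, state = self.lstm(x, state)
        if self.concat_output:
            # Skip path: the LSTM input is passed through unchanged next to the
            # LSTM output, so later layers can rely on it and ignore the LSTM.
            out = torch.cat([out, x], dim=-1)
        return out, state


layer = SkipLSTM(input_size=8, hidden_size=16, concat_output=True)
x = torch.randn(4, 5, 8)
features, _ = layer(x)
print(features.shape)  # torch.Size([4, 5, 24])
```

With concat_output=False the wrapper behaves like a plain LSTM and the following layer sees only hidden_size features.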
Architecture is typically either:
MLP w/ skip connection => LSTM => Value/Policy (already part of RL_games)
MLP w/ skip connection => LSTM w/ skip connection => Value/Policy (new)
LSTM w/ skip connection => MLP => Value/Policy (new)
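To make the three layouts concrete, the sketch below traces the feature width that reaches the Value/Policy heads in each one. It assumes every skip connection is a concatenation of a block's input with its output; that is an assumption, and the helper head_input_width, its parameters, and the example sizes are all hypothetical.

```python
def head_input_width(obs_dim, mlp_units, lstm_units,
                     lstm_first=False, mlp_skip=False, lstm_skip=False):
    """Return the feature width that reaches the value/policy heads.

    Assumes each skip connection concatenates a block's input with its output.
    """
    def mlp(width):
        out = mlp_units[-1]  # width of the last MLP layer
        return out + width if mlp_skip else out

    def lstm(width):
        return lstm_units + width if lstm_skip else lstm_units

    # Order of the two blocks depends on the layout.
    stages = (lstm, mlp) if lstm_first else (mlp, lstm)
    width = obs_dim
    for stage in stages:
        width = stage(width)
    return width


layouts = {
    "MLP w/ skip => LSTM": dict(mlp_skip=True),
    "MLP w/ skip => LSTM w/ skip": dict(mlp_skip=True, lstm_skip=True),
    "LSTM w/ skip => MLP": dict(lstm_first=True, lstm_skip=True),
}
widths = {name: head_input_width(32, [256, 128], 64, **flags)
          for name, flags in layouts.items()}
for name, width in widths.items():
    print(f"{name}: {width}")
# MLP w/ skip => LSTM: 64
# MLP w/ skip => LSTM w/ skip: 224
# LSTM w/ skip => MLP: 128
```

Under this assumption, only the second layout gives the heads direct access to features that never passed through the LSTM.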