stheid / safety_guarded_rl

Safe Initialization #5

Closed stheid closed 3 years ago

stheid commented 3 years ago

Currently, random initialization might lead to initial states that cannot be controlled safely under the constraints on the action space. Hence we came up with the idea of validating that an LQR can safely control the system.

The validation can be done in an initial step, where we analytically look at how the state space is successively transformed by the LQR and then propagate backward to determine which initial states should not be allowed.

This makes sure that there exists a safe policy to begin with.
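
A minimal sketch of such a check, assuming linear discrete-time dynamics `x[k+1] = A x[k] + B u[k]`, a symmetric box constraint `|u| <= u_max` on the action, and a finite check horizon. None of these names or values come from the repository; the double-integrator matrices, `u_max`, and all function names are hypothetical and only illustrate the idea. Under the LQR law `u = -K x`, the action at step `k` is a linear function of the initial state, so the action constraints can be mapped back onto `x0` as a stack of linear inequalities.

```python
import numpy as np


def dlqr_gain(A, B, Q, R, iters=500):
    """Discrete-time LQR gain K (u = -K x) via Riccati fixed-point iteration."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K


def action_rows(A, B, K, horizon):
    """Stack rows H_k = K (A - B K)^k, so that u_k = -H_k x0 in closed loop."""
    A_cl = A - B @ K
    rows, M = [], np.eye(A.shape[0])
    for _ in range(horizon):
        rows.append(K @ M)
        M = A_cl @ M
    return np.vstack(rows)


def is_safe_initial_state(H, x0, u_max):
    """True if the LQR never needs |u| > u_max over the checked horizon."""
    return bool(np.all(np.abs(H @ x0) <= u_max))


# Hypothetical example system: double integrator with step size 0.1.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
K = dlqr_gain(A, B, Q=np.eye(2), R=np.eye(1))
H = action_rows(A, B, K, horizon=200)
```

Random initial states could then be rejection-sampled with `is_safe_initial_state(H, x0, u_max)`. State constraints, if any, could be propagated the same way by stacking rows of `(A - B K)^k` instead of `K (A - B K)^k`.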

stheid commented 3 years ago

Absolute safety is not actually needed. A good initialization that meets some empirical bound is sufficient, because then we can at least guarantee that we do not decrease our safety. This kind of monotonic safety is more general and also more realistic.

Hence, the already tested imitation learning is sufficient to solve the initialization problem.
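
One way to read "do not decrease our safety" is as an acceptance test on policy updates: a candidate policy is kept only if its empirical constraint-violation rate is no worse than that of the current policy. The sketch below is an assumption about how such a test might look, not the repository's implementation; the function names and the `tolerance` parameter are hypothetical.

```python
def violation_rate(episode_flags):
    """Fraction of evaluation episodes with at least one constraint violation."""
    return sum(episode_flags) / len(episode_flags)


def accept_update(old_flags, new_flags, tolerance=0.0):
    """Accept the candidate policy only if its empirical violation rate does
    not exceed the current policy's rate (plus a hypothetical tolerance)."""
    return violation_rate(new_flags) <= violation_rate(old_flags) + tolerance
```

Here each flag is `True` if the corresponding evaluation episode violated a constraint. The imitation-learned initial policy would supply the first `old_flags` baseline.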