Currently, weights is a dictionary that maps actions (or behaviors) to weights array, one for each feature, i.e. action -> [feature].
Instead, it must map an action to a feature and, then, to the weight, i.e. action -> feature -> weight. Then, should the features order change from a previous game to another one, where the policy is loaded, it'll map correctly the action to its weight.
Currently,
weights
is a dictionary that maps actions (or behaviors) to weights array, one for each feature, i.e.action -> [feature]
.Instead, it must map an action to a feature and, then, to the weight, i.e.
action -> feature -> weight
. Then, should the features order change from a previous game to another one, where the policy is loaded, it'll map correctly the action to its weight.