Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
https://unity.com/products/machine-learning-agents

Please provide sufficient documentation about deterministic feature #5859

Closed trsh closed 1 year ago

trsh commented 1 year ago

We have this: "Whether to select actions deterministically during inference from the provided neural network." (in https://docs.unity3d.com/Packages/com.unity.ml-agents@2.3/api/Unity.MLAgents.Policies.BehaviorParameters.html) and that's it. There is no information on what it actually does or how it works, not even a few sentences.

I found this in changelog:

Deterministic action selection is now supported during training and inference (#5619)
Added a new --deterministic cli flag to deterministically select the most probable actions in policy. The same thing can be achieved by adding deterministic: true under network_settings of the run options configuration. (#5597)
Extra tensors are now serialized to support deterministic action selection in onnx. (#5593)
Support inference with deterministic action selection in editor (#5599)
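For reference, the configuration route described in the second changelog entry would look roughly like the fragment below. This is only a sketch based on the changelog wording: the behavior name `MyBehavior` and the `trainer_type` value are placeholders, and only the `deterministic: true` line under `network_settings` comes from the changelog.

```yaml
behaviors:
  MyBehavior:              # placeholder behavior name, not from the changelog
    trainer_type: ppo      # placeholder; any existing trainer settings stay as they are
    network_settings:
      deterministic: true  # per the changelog: select the most probable actions
```

According to the same changelog entry, passing `--deterministic` on the command line should have the same effect, for example `mlagents-learn config.yaml --run-id=my_run --deterministic`, where the config file name and run id are made up for illustration.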

However, this hyperparameter is not documented in https://unity-technologies.github.io/ml-agents/ the way the others are, with a proper explanation. Also, it's not clear whether I need to train with deterministic enabled in order to use it during inference.

amagwka commented 1 year ago

In reinforcement learning (RL), deterministic action selection refers to the process of selecting actions based on a deterministic policy. A deterministic policy is a function that maps states to actions, such that for a given state, the policy always returns the same action. In practice, it chooses the action with the highest probability among the action probabilities.

However, deterministic action selection can also have some drawbacks. For example, it may be less effective in environments where there is significant uncertainty or stochasticity, as it may be difficult to determine a single optimal action in these cases.
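To make the difference concrete, here is a minimal sketch of greedy versus sampled selection for a discrete action space. It is a generic illustration, not taken from the ML-Agents code base: `softmax` and `select_action` are hypothetical helper names, and the logits are made-up numbers.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(logits, deterministic, rng=random):
    """Pick a discrete action index from policy logits.

    deterministic=True  -> always the most probable action (argmax).
    deterministic=False -> sample an action according to its probability.
    """
    probs = softmax(logits)
    indices = range(len(probs))
    if deterministic:
        return max(indices, key=probs.__getitem__)
    return rng.choices(indices, weights=probs, k=1)[0]


# Made-up policy output for one state with three possible actions.
logits = [0.1, 2.0, 0.5]

# Greedy selection returns the same action for the same state every time.
greedy = [select_action(logits, deterministic=True) for _ in range(5)]

# Sampling visits the less likely actions as well.
rng = random.Random(0)
sampled = [select_action(logits, deterministic=False, rng=rng) for _ in range(1000)]
```

With these logits the greedy list contains only action 1, while the sampled list mixes all three actions in rough proportion to their probabilities. That variability is what deterministic selection gives up.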

Related discussions:

2643

2112

trsh commented 1 year ago

@amagwka thank you. I know the theory, but my point here is that ml-agents documentation lacks information about this. Especially the undocumented hyperparameter.

trsh commented 1 year ago

@miguelalonsojr I don't think this is complete. If it is, can you share how?

trsh commented 10 months ago

@miguelalonsojr this is still relevant. I checked the docs just today. Can we reopen?