Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
https://unity.com/products/machine-learning-agents

Synchronization Issue in ML-Agents Decision Timing #6175

Open matinmoezzi opened 1 week ago

matinmoezzi commented 1 week ago

Is your feature request related to a problem? Please describe.
Yes, the problem arises from the desynchronization between the decision time (when Python produces an action) and Academy.EnvironmentStep() (when Unity applies actions). This desync can lead to duplicate actions being taken or duplicate states being read, which is frustrating when trying to maintain consistency and synchronization during training. The issue seems to stem from the non-blocking websocket communication between Unity and Python.

Describe the solution you'd like
I’d like a blocking mechanism in Unity that pauses rendering until the next action from Python is available. This would ensure that Academy.EnvironmentStep() only processes actions and states in sync with the decision-making process, preventing duplicates.

Describe alternatives you've considered

Additional context
This issue is likely caused by the asynchronous nature of websocket-based communication. A blocking mechanism could help bridge the gap between Unity and Python processes. Feedback or alternative solutions from the community would be greatly appreciated.
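
To make the request concrete, here is a minimal, standalone sketch of the lock-step handshake described above. It is not ML-Agents code and does not use the toolkit's communication layer. The names `run_lockstep`, `obs_q`, `act_q`, and the doubling "policy" are all hypothetical stand-ins. The simulation loop sends one observation, then blocks until the matching action comes back before advancing.

```python
import queue
import threading


def run_lockstep(num_steps):
    """Simulate a loop that blocks each step until the policy returns an action."""
    obs_q = queue.Queue(maxsize=1)  # simulation -> policy
    act_q = queue.Queue(maxsize=1)  # policy -> simulation
    log = []                        # (step, observation, action) as applied

    def policy():
        while True:
            obs = obs_q.get()       # blocks until an observation arrives
            if obs is None:         # sentinel: the simulation has finished
                return
            act_q.put(obs * 2)      # hypothetical policy: action = 2 * observation

    worker = threading.Thread(target=policy)
    worker.start()

    state = 0
    for step in range(num_steps):
        obs_q.put(state)            # send exactly one observation per step
        action = act_q.get()        # block until the matching action is back
        log.append((step, state, action))
        state += 1                  # advance the simulation only after acting

    obs_q.put(None)                 # tell the policy thread to stop
    worker.join()
    return log


if __name__ == "__main__":
    for entry in run_lockstep(3):
        print(entry)
```

Because each step waits on `act_q.get()`, every observation is paired with exactly one action, so nothing is read or applied twice. Whether a similar block belongs in Unity's render loop or only in the physics step is a separate design question that this sketch does not address.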

wlsdn2749 commented 12 hours ago

Howdy!

Have you tried calling the RequestDecision method yourself instead of using the Decision Requester component?

Basic usage looks like this:

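using Unity.MLAgents;

public class Test : Agent
{
    // ...
    private bool isShootable;

    void FixedUpdate()
    {
        // Ask for a decision only when the agent is actually able to act.
        if (isShootable)
        {
            RequestDecision();
        }
    }

    // ...
}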

Calling RequestDecision does the same thing the Decision Requester component does for you at a fixed interval. The difference is that you choose exactly when each decision is requested, so it will likely be more useful than you think.

It follows the same sequence: RequestDecision -> CollectObservations -> OnActionReceived, and so on.
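
As a rough illustration of that sequence, here is a toy model in plain Python. It is not the ML-Agents API. `ToyAgent`, `fixed_update`, `run`, and the `obs + 10` policy are hypothetical, and the method names only mirror the C# callbacks mentioned above. It also collapses the request and the response into one synchronous call, whereas the real toolkit may service a request on a later Academy step.

```python
class ToyAgent:
    """Plain-Python stand-in for an agent; illustrative only."""

    def __init__(self, policy):
        self.policy = policy
        self.state = 0
        self.trace = []  # order in which the callbacks fire

    def collect_observations(self):
        self.trace.append(("CollectObservations", self.state))
        return self.state

    def on_action_received(self, action):
        self.trace.append(("OnActionReceived", action))

    def request_decision(self):
        # One request yields exactly one observation and one action.
        obs = self.collect_observations()
        self.on_action_received(self.policy(obs))

    def fixed_update(self, can_act):
        # Mirrors the gating in the snippet above: decide only when allowed.
        if can_act:
            self.request_decision()
        self.state += 1


def run(ready_flags):
    agent = ToyAgent(policy=lambda obs: obs + 10)
    for flag in ready_flags:
        agent.fixed_update(flag)
    return agent.trace


if __name__ == "__main__":
    for event in run([True, False, True]):
        print(event)
```

On steps where the flag is false, no observation is collected and no action is delivered, which is the kind of control over decision timing the manual call gives you.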

https://docs.unity3d.com/Packages/com.unity.ml-agents@1.0/api/Unity.MLAgents.Agent.html#Unity_MLAgents_Agent_RequestDecision