Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
https://unity.com/products/machine-learning-agents
Other
17.21k stars 4.16k forks

ML-agents python low-level API #5564

Closed X-DDDDD closed 1 year ago

X-DDDDD commented 3 years ago

Hello, ML-Agents developers! I'm very happy to use your training tools, but I have run into some problems with them.

I'm trying to follow Python-API.md to test my RL algorithm, but when I build the example containing 12 scenes into an executable according to Learning-Environment-Executable.md, I have difficulty getting the state input for my algorithm. TerminalSteps does not return the observations, rewards, and other information for all 12 scenes. It only returns information about the one scene that reached TerminalSteps (and the other scenes, which were not triggered, are reset as well).

In other words, with the interface described in Python-API.md, when the ball in one scene falls, all 12 scenes are reset, so I can't train in parallel (in Unity, when the ball in one scene falls, the other scenes keep running).

So if I want to use my own algorithm to train multiple scenes (each with an agent) in the Example Learning Environments, how should I use your tools?

Thanks! :)

andrewcoh commented 3 years ago

Hi @X-DDDDD

Thank you for using our Python API. The TerminalSteps contains only information about agents that have reached a termination condition (i.e. EndEpisode() called in C#). The information for the parallel agents that have not reached a termination condition will be contained in the DecisionSteps object.

Are you seeing behavior that is different from this? Please share as much detail as you can.
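
To make the split between the two objects concrete, here is a minimal sketch of per-agent bookkeeping keyed by agent id. It assumes only that each batch exposes `agent_id` and `reward` sequences of equal length, as the objects returned by `env.get_steps(behavior_name)` do. The `EpisodeTracker` class itself is hypothetical and not part of the toolkit. It processes terminal entries first, on the assumption that an agent may appear in both batches in the same step.

```python
from collections import defaultdict


class EpisodeTracker:
    """Hypothetical helper: accumulates per-agent returns keyed by agent id."""

    def __init__(self):
        self.running = defaultdict(float)  # agent_id -> return of the ongoing episode
        self.finished = []                 # (agent_id, episode_return) for ended episodes

    def update(self, decision_steps, terminal_steps):
        # Terminal entries first: they close the episode that was in progress.
        for agent_id, reward in zip(terminal_steps.agent_id, terminal_steps.reward):
            agent_id = int(agent_id)
            total = self.running.pop(agent_id, 0.0) + float(reward)
            self.finished.append((agent_id, total))
        # Decision entries belong to episodes that are still running
        # (or that have just restarted after a termination).
        for agent_id, reward in zip(decision_steps.agent_id, decision_steps.reward):
            self.running[int(agent_id)] += float(reward)
```

In a training loop this would be called once per `env.step()` with the pair returned by `env.get_steps(behavior_name)`. Agents in the other training areas keep their running totals when one agent terminates, because nothing here assumes a fixed batch size or order.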

X-DDDDD commented 3 years ago

I'm sorry I didn't reply until now. Thank you for your advice!

I will get back to you when I make some progress or run into further confusion.

akhalinem commented 3 years ago

Hey guys, can you provide some advice on this? I cannot use this API with multiple scenes either.

X-DDDDD commented 3 years ago

A lot of people have this problem now. 🤣

andrewcoh commented 3 years ago

Hi @Abror1997

Can you open a new issue with more details about your problem so that I can give you better advice?

Or @X-DDDDD if you are also having this problem, feel free to open an issue as well.

MarcoMeter commented 2 years ago

Hi @X-DDDDD, the most reliable way to test your custom algorithm is to use only one agent inside one build instance. If you need multiple agents/environments, you have to run multiple build instances concurrently.

The Python API is not very intuitive when it comes to handling multiple agents. The combined number of agents in DecisionSteps and TerminalSteps is not guaranteed to equal your total agent count, even if all agents share the same behavior.

@andrewcoh Correct me if I'm wrong.
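
As a rough sketch of the "multiple build instances" approach, the helper below opens several instances with distinct worker ids. `launch_instances` is a hypothetical name and `build_path` is a placeholder for your own executable. The environment constructor is passed in as `make_env`; with the toolkit installed this would presumably be `mlagents_envs.environment.UnityEnvironment`, which accepts `file_name` and `worker_id` keyword arguments.

```python
def launch_instances(make_env, build_path, n_instances):
    """Hypothetical helper: open n_instances environments, one worker_id each.

    make_env is any callable taking file_name and worker_id keyword arguments
    and returning an object with a close() method.
    """
    envs = []
    try:
        for worker_id in range(n_instances):
            # A distinct worker_id per instance keeps the instances from
            # competing for the same communication port.
            envs.append(make_env(file_name=build_path, worker_id=worker_id))
    except Exception:
        # If one instance fails to start, close those already opened.
        for env in envs:
            env.close()
        raise
    return envs
```

Each returned environment can then be reset and stepped independently, so an episode ending in one instance does not affect the others.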