Closed: X-DDDDD closed this issue 1 year ago
Hi @X-DDDDD
Thank you for using our Python API. The TerminalSteps object contains information only about agents that have reached a termination condition (i.e., EndEpisode() was called in C#). Information for the parallel agents that have not reached a termination condition is contained in the DecisionSteps object.
Are you seeing behavior that is different from this? Please share as much detail as you can.
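To make the split concrete: in mlagents_envs, `env.get_steps(behavior_name)` returns a `(DecisionSteps, TerminalSteps)` pair, both indexed by agent ID. Here is a minimal sketch of how a training loop treats the two groups, using plain dicts as stand-ins for the real step objects (the agent IDs and rewards below are made up for illustration):

```python
# Stand-ins for the (DecisionSteps, TerminalSteps) pair returned by
# env.get_steps(behavior_name) in mlagents_envs; the real step objects
# are indexed by agent_id in the same way.
decision_steps = {0: {"reward": 0.0}, 1: {"reward": 0.5}}  # still running
terminal_steps = {2: {"reward": 1.0}}  # reached EndEpisode() this step

# Agents that terminated: record final data; they need no action.
for agent_id, step in terminal_steps.items():
    print(f"agent {agent_id} finished with reward {step['reward']}")

# Agents still running: each of these expects an action this step.
for agent_id, step in decision_steps.items():
    print(f"agent {agent_id} is waiting for an action")

# Together, the two objects cover every agent that reported this step.
all_agents = set(decision_steps) | set(terminal_steps)
```

An agent that terminates does not disappear: on a later call it shows up in DecisionSteps again with a new episode.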
I'm sorry I didn't reply to you until now. Thank you for your advice!
I will follow up when I make progress or run into further confusion.
Hey guys, can you provide some advice on this? I cannot use this API with multiple scenes either.
A lot of people have this problem now.🤣
Hi @Abror1997
Can you open a new issue with more information about your issue so that I can give you better information?
Or @X-DDDDD if you are also having this problem, feel free to open an issue as well.
Hi @X-DDDDD, the most reliable way to test your custom algorithm is to use only one agent inside one build instance. If you need multiple agents/environments, you have to run multiple build instances concurrently.
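For concreteness, each concurrent instance gets its own `worker_id`, which offsets the communication port: in mlagents_envs, `UnityEnvironment` communicates over `base_port + worker_id` (default base port 5005). A sketch of the port assignment only, without actually launching Unity (the instance count of 4 is an arbitrary example):

```python
BASE_PORT = 5005  # mlagents_envs default base_port


def port_for(worker_id: int, base_port: int = BASE_PORT) -> int:
    # UnityEnvironment(file_name=..., worker_id=i, base_port=base_port)
    # listens on base_port + worker_id, so every concurrently running
    # build instance must be given a distinct worker_id.
    return base_port + worker_id


ports = [port_for(i) for i in range(4)]  # e.g. four parallel instances
```

With distinct worker IDs, the four environments can be stepped independently from one Python process, each feeding your algorithm its own batch of experience.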
The Python API is not very intuitive when it comes to handling multiple agents. The sum of DecisionSteps and TerminalSteps is not guaranteed to equal your total agent count, even if all agents share the same behavior.
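Because membership of the two step objects can change on every call, the safest pattern is to key all bookkeeping on the agent ID rather than assume a fixed agent count per step. A minimal sketch with plain dicts standing in for the per-agent `reward` fields (the agent IDs and reward values are invented for illustration):

```python
from collections import defaultdict

# Cumulative reward per agent, keyed by agent_id: the only stable key
# across calls, since an agent may appear in DecisionSteps, in
# TerminalSteps, or in neither on any given step.
cumulative = defaultdict(float)


def accumulate(decision_rewards, terminal_rewards):
    """decision_rewards / terminal_rewards: dicts of agent_id -> reward,
    standing in for DecisionSteps.reward / TerminalSteps.reward."""
    finished = {}
    for aid, r in decision_rewards.items():
        cumulative[aid] += r
    for aid, r in terminal_rewards.items():
        cumulative[aid] += r
        finished[aid] = cumulative.pop(aid)  # episode over: close it out
    return finished


# Step 1: three agents running, none terminated yet.
accumulate({0: 0.25, 1: 0.5, 2: 0.25}, {})
# Step 2: agent 2 terminates; its next episode may appear on a later call.
done = accumulate({0: 0.25, 1: 0.5}, {2: 1.0})
```

The key point: never index by position or by an expected count; iterate over whatever agent IDs each object actually contains.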
@andrewcoh Correct me if I'm wrong.
Hello, ML-Agents developers! I'm very happy to use your training tools, but I currently have some problems with them.
I'm trying to use Python-API.md to test my RL algorithm, but when I build the example containing 12 scenes into an executable file following Learning-Environment-Executable.md, I have trouble getting the state input for my algorithm. TerminalSteps does not return the observations, rewards, and other information for all 12 scenes; it only returns information about the one scene that reaches TerminalSteps (and the other scenes, whose termination conditions were not triggered, are reset as well).
In other words, with the interface provided by Python-API.md, when the ball in one scene falls, all 12 scenes are reset, so I can't train in parallel (whereas in Unity, even if the ball in one scene falls, the other scenes keep running).
So if I want to use my own algorithm to train multiple scenes (one agent each) from the Example Learning Environments, how should I use your tools?
Thanks! :)