Closed. Erethan closed this issue 5 years ago.

I trained a network until I got the behaviour I needed. However, whenever I import the resulting model into my Agent's Brain, the agent takes completely different actions. What could have caused that?

If I run mlagents-learn with --load, the agent correctly resumes training from where it left off and shows the intended behaviour. The issue only happens during inference.
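For reference, the resume command I'm running is something like this (the config path and run id are placeholders, not my actual values):

    mlagents-learn config/trainer_config.yaml --run-id=MyRun --train --load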
CC: @ervteng - do you have any thoughts on this scenario?
@mantasp
Hi @Erethan, what are your observation space/action space like? Are you using LSTMs? There may be an issue with Barracuda with certain model architectures - this'll help us narrow down the issue.
My hyperparameters were the same as in the Pyramids example, so I don't think I'm using an LSTM.
The goal of the agent is to pick up a resource by standing in front of it, then return to the center of the environment.
Observation space:
Action space: discrete, with two branches:
1st int -> horizontal input, one of [-1, -0.5, 0, 0.5, 1]
2nd int -> vertical input, one of [-1, -0.5, 0, 0.5, 1]
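In other words, each branch is an index into a lookup table of input values, roughly like this (a Python sketch of the mapping only; the actual lookup lives in my C# Agent code):

    # Sketch: each discrete branch outputs an index 0-4 into this table.
    ACTION_VALUES = [-1.0, -0.5, 0.0, 0.5, 1.0]

    def to_movement(horizontal_index, vertical_index):
        # Returns the (horizontal, vertical) input applied to the agent.
        return ACTION_VALUES[horizontal_index], ACTION_VALUES[vertical_index]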
Behaviour during training: the agent fumbles a little on its way to the resource, but picks it up quickly and then returns to the center.
Behaviour during inference: the agent moves to some specific position and jitters around it.
If it would help, I could upload and share the project as well as the model's folder.
Cool, thanks for the info! We'll definitely look into this. Another question: have you tried doing inference using Python with --slow, and does it change the behavior? There are sometimes differences in environment behavior depending on the timescale.
I haven't tried it. I didn't know you could also do inference with the Python API. Could you point me to a resource on this subject?
Also, I've found this on the Limitations documentation page:
Rendering Speed and Synchronization
Currently the speed of the game physics can only be increased to 100x real-time. The Academy also moves in time with FixedUpdate() rather than Update(), so game behavior implemented in Update() may be out of sync with the agent decision-making. See Execution Order of Event Functions for more information.
Could this be related to my problem?
EDIT: I understand now that you can do either inference or training with the Python API (mlagents.env vs mlagents-learn). Is that correct? So if I understood correctly, I would need to create an execution environment and then use the instructions detailed here and here. Is there any way I can do that from the command line instead of writing a Python script?
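For context, the kind of script I was imagining looks roughly like this (based on the old mlagents.envs API; module and method names may differ between ML-Agents versions, and the executable path is a placeholder):

    import numpy as np
    from mlagents.envs import UnityEnvironment

    # Connect to a built executable of the environment.
    env = UnityEnvironment(file_name="builds/MyEnvironment")
    brain_name = env.brain_names[0]

    # train_mode=False runs the simulation at normal speed, as --slow does.
    info = env.reset(train_mode=False)[brain_name]
    for _ in range(1000):
        # Placeholder policy: a random index per agent for each of the two branches.
        action = np.column_stack([
            np.random.randint(0, 5, size=len(info.agents)),
            np.random.randint(0, 5, size=len(info.agents)),
        ])
        info = env.step({brain_name: action})[brain_name]
    env.close()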
Hi @Erethan, you don't need to write any more Python code. Just add the --slow param while running mlagents-learn and leave off --train, and the Python code will be run in inference mode.
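In other words, the invocation would look something like this (the config path and run id are placeholders; you may also need --load so it picks up the trained model for that run id):

    mlagents-learn config/trainer_config.yaml --run-id=MyRun --load --slow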
If you're still having issues with Barracuda, try using the develop-barracuda-0.2.0 branch of this repo. We've fixed some issues with Discrete actions there, and will merge these changes soon.
--slow seems to have fixed the problem. Thank you!
I much preferred training with --slow. You can run with many more environments (I did with 30x more), and I can understand much better what the agents are doing. Loved it!
Is there any drawback I should be worried about when training with --slow?
Hi @Erethan, I am having the same problem. Are you using --slow when training instead of --train?
When the issue occurred, I wasn't using --slow. Turning --slow on is what solved my issue. (Just to be sure: you have to write both --slow and --train.)
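So the full training command looks something like this (config path and run id are placeholders):

    mlagents-learn config/trainer_config.yaml --run-id=MyRun --train --slow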
Thanks! I'll give that a go
Thank you for the discussion. We are closing this issue due to inactivity.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.