Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
https://unity.com/products/machine-learning-agents

Training results different than when running inference and loading model. #4424

Closed HindBoucherit closed 4 years ago

HindBoucherit commented 4 years ago

Hello everybody, hope you're doing alright,

I am having a very confusing issue while running Inference with my project.

I've been training a self-driving car to follow waypoints by controlling the steering, the brake and the throttle. By the end of training on a complex road it works quite well, but when I save the model and give the agent the same road during inference, the result is a complete mess.

Is there any possible issue that can cause the model not to be loaded correctly? Or the agent to behave completely randomly rather than act as it did during training? The training has been going on for a few hours now with 14 agents training in the scene. They all behave really well by the end, so why isn't that the case when I run inference and hand them the trained model?

I'm not sure if there's something wrong with the Barracuda dependencies. I came across an issue online that was linked to a missing plugin (TFSharpPlugin), but it seems to be related to older versions of ML-Agents, so I'm kind of lost.

If anybody has an idea or a suggestion for this issue I'd be very grateful.

Thanks for reading, have a good day !

Hind

harperj commented 4 years ago

Hi @HindBoucherit --

We aren't aware of any Barracuda inference bugs with ML-Agents at this time. It's difficult for us to help without more information about the environment where you saw the issue. Please add what information you can from our bug report template: https://github.com/Unity-Technologies/ml-agents/issues/new?assignees=&labels=bug&template=bug_report.md

Generally speaking we aren't able to help with issues related to custom environments because we can't easily reproduce them. If possible could you also try to reproduce this issue with one of our example environments? If it only happens in your environment, it would be good to know more details about how your environment is set up.

HindBoucherit commented 4 years ago

Hello @harperj ! Thank you for your answer.

I tried training a model from the available example environments and it worked just fine, so the bug is certainly coming from my environment.

Here's a screen of how my environment is set up :

[image: env]

Each car has a curve of waypoints to follow at the beginning of each episode. The waypoints are supposed to represent the road, and each car should try to minimise its cross-track error while navigating along the points. The curve is randomized at the start of every episode. All of the agents and their waypoints are placed on a single terrain. During training on a circular road, for example, the behaviour looks very good, but when I save the model and give it to the agent on the same circular road, the car takes very random actions, not at all what it was trained to do.

Here's a video of the performance at the end of the training : https://drive.google.com/file/d/1a-XgoJmyGvtBVq7dRaM9fZY_bvMTeHNo/view?usp=sharing

And here's a video of the performance when I give the agents the trained model : https://drive.google.com/file/d/1CRdhu9iMZInzzx-hg3ul9G5c5G7p-9CR/view?usp=sharing

Excuse the quality of the video. I'm guessing there's something wrong with my environment, but the weird part is that the training looks fine by the end, and the TensorBoard indicators show that the cumulative reward increases well and that the training is stable.

Thanks for reading.

harperj commented 4 years ago

Thanks for the detailed response @HindBoucherit. Could you also share what the observation / action space are for your environment, and which Unity and ML-Agents package version you're using? That might give a clue as to what's going wrong.

HindBoucherit commented 4 years ago

The action space is discrete with two branches. The first branch controls steering: it can steer left, hold position, or steer right. The second branch handles speed: it can add 0.1f to the throttle and set the brake to 0f, maintain the current values, or add 0.1f to the brake and set the throttle to 0f.
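For clarity, the two-branch discrete scheme described above can be sketched in Python as follows. This is my own illustration based on the description, not the actual project code; the function name, the steering step size, and the clamping to [-1, 1] / [0, 1] are assumptions.

```python
def apply_discrete_action(steer, throttle, brake, branch0, branch1):
    """Map a two-branch discrete action to car controls.

    branch0: 0 = steer left, 1 = hold position, 2 = steer right
    branch1: 0 = throttle +0.1 and brake -> 0,
             1 = maintain current values,
             2 = brake +0.1 and throttle -> 0
    """
    STEER_STEP = 0.1  # assumed step size; the issue does not state it

    if branch0 == 0:
        steer = max(steer - STEER_STEP, -1.0)
    elif branch0 == 2:
        steer = min(steer + STEER_STEP, 1.0)

    if branch1 == 0:
        throttle = min(throttle + 0.1, 1.0)
        brake = 0.0
    elif branch1 == 2:
        brake = min(brake + 0.1, 1.0)
        throttle = 0.0

    return steer, throttle, brake
```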

[image: BehaviourComponent]

My observations are these :

[image: Observations]

- the cross-track error computed at the car's position, and again a little farther forward,
- the speed of the car, normalized by the maximum allowed speed,
- the average steepness of the road within 50 m (a value between -1f and 1f),
- the dot product of the car's direction with the direction it should have (given by the points it should follow),
- the steer value (a value between -1f and 1f),
- and the distance between the car and the next point it has to reach.
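As a rough sketch, the observation vector above could be assembled like this. All names and the exact ordering are my guesses from the description, not the project's actual code; direction vectors are assumed to be 2D unit vectors.

```python
def build_observations(cross_err_here, cross_err_ahead, speed, max_speed,
                       avg_steepness, car_dir, target_dir, steer, dist_to_next):
    """Assemble the observation vector described in the issue.

    car_dir / target_dir: 2D unit vectors (x, z) for the car's heading
    and the desired heading given by the waypoints.
    """
    # Alignment with the desired direction: 1.0 when perfectly aligned
    dot = car_dir[0] * target_dir[0] + car_dir[1] * target_dir[1]
    return [
        cross_err_here,     # cross-track error at the car
        cross_err_ahead,    # cross-track error a little farther forward
        speed / max_speed,  # speed normalized by the maximum allowed speed
        avg_steepness,      # average road steepness within 50 m, in [-1, 1]
        dot,                # dot product with the desired direction
        steer,              # current steer value, in [-1, 1]
        dist_to_next,       # distance to the next waypoint
    ]
```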

Unity version is 2019.3, and mlagents version is 1.1.0.

Thanks again for reading !

harperj commented 4 years ago

Hi @HindBoucherit -- nothing about this setup seems so different than our example environments that it should be problematic. By chance did you adjust the engine's time scale during training? We have seen some issues with inference at a lower time scale causing unexpected behavior.

HindBoucherit commented 4 years ago

Hi @harperj ! Thanks for your answer. After a quick test, that seems to be the issue! My time scale was set to 20; I guess that was what was making the behaviour seem hectic and out of hand. When I set it to 1, the car was able to follow the road, but it seemed pretty slow compared to training. I'll keep trying other values until I find one that works well.
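For what it's worth, the "pretty slow" impression at time scale 1 is expected: the wall-clock time between agent decisions scales inversely with the engine time scale, so a model trained at time scale 20 runs 20x slower in real time at time scale 1. A small sketch of the arithmetic (the 0.02 s fixed timestep and decision period of 5 are common Unity/ML-Agents defaults, not values confirmed for this project):

```python
def seconds_per_decision(fixed_delta=0.02, decision_period=5, time_scale=1.0):
    """Wall-clock seconds between successive agent decisions.

    Each decision covers fixed_delta * decision_period seconds of
    simulated time; dividing by time_scale gives real elapsed time.
    """
    return fixed_delta * decision_period / time_scale
```

At time scale 20 this gives 0.005 s of real time per decision versus 0.1 s at time scale 1, which is why the behaviour looks identical in simulated time but much slower to the eye.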

Anyhow, I finally saw a result that's very close to what was observed during training, so thanks a lot!

harperj commented 4 years ago

Glad that helped @HindBoucherit! I'm going to close this issue but feel free to reopen or open a new issue if you continue to have trouble.

github-actions[bot] commented 3 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.