Closed: ChristianCoenen closed this issue 4 years ago.
I think both of your use cases are supported. Here are tensorboard graphs for 3 runs:
mlagents-learn config/ppo/3DBall.yaml --run-id=3dball_new
mlagents-learn config/ppo/3DBall.yaml --run-id=3dball_new --resume --inference
mlagents-learn config/ppo/3DBall.yaml --run-id=3dball_newer --initialize-from=3dball_new --inference
for about 60K steps each.
So at least the rewards are tracked when resuming (use case 1), and you can use --initialize-from to provide a new run id (use case 2). Are there other tensorboard metrics you expected to see during inference?
For your scenario 1, are you expecting the 20x playback speed, or is that what you're observing? I don't think this will happen currently.
Also note that you can use the StatsRecorder interface in C# to write custom metrics that will appear in tensorboard.
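For reference, here is a minimal C# sketch of what that could look like (the component name, metric key, and tracked value below are placeholders, and it assumes an ML-Agents package version where Academy.Instance.StatsRecorder is available):

```csharp
using Unity.MLAgents;
using UnityEngine;

// Hypothetical example component: reports a custom value through the StatsRecorder
// so it shows up as an extra graph in tensorboard for the current run.
public class CustomMetricLogger : MonoBehaviour
{
    // Placeholder value; in practice this would be computed by your agent/game code.
    public float episodeReward;

    void FixedUpdate()
    {
        // MostRecent reports the last value written in each summary period;
        // Average would average all values written during the period.
        Academy.Instance.StatsRecorder.Add(
            "MyBehavior/Episode Reward", episodeReward, StatAggregationMethod.MostRecent);
    }
}
```

As far as I know, these stats are forwarded to the Python trainer over a side channel, so they only end up in tensorboard while mlagents-learn is connected.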
Thanks a lot! I didn't consider the --initialize-from argument. It indeed covers use case 2.
For now, I'm not missing any metrics - that might change when I shift to more complex problems over time. Thanks for mentioning the StatsRecorder; I will look into it then.
It's what I am observing. When changing from 'Default' to 'Inference Only', the speed increases (I am not exactly sure by how much, but I'd guess 20x). I don't have a problem with that behavior, though.
Off-topic: another user and I are still waiting for a final reply to my issue #4125 - I think it's a quick answer for you and would close the issue. It would be great if you could check it out. Thanks!
Hmm, I tried to reproduce the 'Inference Only' speedup on our 3DBall scene; I tried setting a single agent to Inference Only, as well as the prefab that all the agents use, and in both cases running (without python) happened at the normal speed. If you can reproduce the behavior in one of our scenes, I'll look into it more.
The person responding on #4125 has been on vacation but will be back tomorrow. I'll remind them about it in the morning.
It was my bad. I think I had the trainer running when I tried it a few days ago. I must have put it in the 'trainer is not running' category because it doesn't receive data from the environment and crashes after 30s - but because it connects, it starts the environment at approx. 20x speed.
All right, thanks!
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Is your feature request related to a problem? Please describe.
It seems that it's not possible to comfortably gather data for tensorboard with an agent running in inference mode.
I came across this while trying to explain the tensorboard graph 'Is Training', which is always 1 when using tensorboard for training analysis. I then thought of use cases where 'Is Training' is 0 (inference). At the moment, the second use case I came up with (see last section) is not working with mlagents.
Scenarios
1. When running an agent in Unity with the Behavior Type 'Default' and mlagents (python) not running, it runs inference at 1x speed. When changing it to 'Inference Only', it runs inference at 20x speed. No data is created.
2. Using a new run-id and the --inference flag results in a new random policy that is not able to train and is therefore useless. Useless data is created.
3. Using a known run-id with the --inference and --resume arguments works, but it overwrites its own data when run again. Data is created.
Describe the solution you'd like
As said earlier, I thought of two use cases: continuing an existing run's tensorboard data while running in inference mode (use case 1), and logging inference data under a new run-id (use case 2).
I guess the third scenario covers the first use case, but, as said before, with some compromises.
It would be great to have the second scenario (which I assume is useless at the moment) cover the second use case.
If there is a way to properly log inference with tensorboard and I just missed it, please let me know!