Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
https://unity.com/products/machine-learning-agents

CPU vs GPU inference results #3052

Closed: fog9001 closed this issue 4 years ago

fog9001 commented 4 years ago

Hi,

I updated my environment from 0.11 to 0.12 and TF from 1.14 to 2.0.

Training works with no issues, but GPU inference is not working as before: it shows the same performance as CPU inference and gives poor results.

CPU inference -> https://www.youtube.com/watch?v=rumkVXGpP-8
GPU inference -> https://www.youtube.com/watch?v=R-VYv9gbqck

Cheers

chriselion commented 4 years ago

Hi @fog9001, can you share a copy of the frozen graph that is output from training? Even if it just has random weights (no training run), that should help us diagnose the differences.

cc @mantasp
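
Editor's note: one way to make "poor results" on GPU measurable is to feed the same observations through both backends and compare the resulting action vectors. The following is a minimal sketch of that comparison only, not of how the actions are collected. The function names, the tolerance, and the sample values are all hypothetical and are not taken from this thread.

```python
def max_abs_difference(cpu_actions, gpu_actions):
    """Return the largest element-wise gap between two lists of action vectors."""
    if len(cpu_actions) != len(gpu_actions):
        raise ValueError("both runs must contain the same number of steps")
    worst = 0.0
    for cpu_step, gpu_step in zip(cpu_actions, gpu_actions):
        if len(cpu_step) != len(gpu_step):
            raise ValueError("action vectors must have the same length")
        for cpu_value, gpu_value in zip(cpu_step, gpu_step):
            worst = max(worst, abs(cpu_value - gpu_value))
    return worst


def outputs_match(cpu_actions, gpu_actions, tolerance=1e-3):
    """True when every action component agrees within the (assumed) tolerance."""
    return max_abs_difference(cpu_actions, gpu_actions) <= tolerance


# Hypothetical values for illustration only; real data would be logged
# from the two inference runs for identical observations.
cpu = [[0.10, -0.25], [0.40, 0.05]]
gpu_close = [[0.1001, -0.2499], [0.4000, 0.0502]]
gpu_off = [[0.90, -0.25], [0.40, 0.05]]

print(outputs_match(cpu, gpu_close))
print(outputs_match(cpu, gpu_off))
```

Small gaps are expected from floating-point differences between backends. A large gap on identical inputs points at the inference backend rather than at training.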

fog9001 commented 4 years ago

Hi,

Sure, here is a zip file containing `frozen_graph_def.pb` and the model (.nn).

https://drive.google.com/file/d/1lQ6JmyLSGdHcpRyK88wQIbQhr93wtAp0/view?usp=sharing

Thank you @chriselion

chriselion commented 4 years ago

Thanks @fog9001 - I'll take a look today and see if anything jumps out, but it will probably need the Barracuda team to look into this, and they're all at the NeurIPS conference this week.

fog9001 commented 4 years ago

Thanks, CPU inference is perfectly fine for me for now, no rush 👍

chriselion commented 4 years ago

Nothing surprising from the Barracuda conversion:

Converting ./gpu_inference.pb to ./gpu_inference.nn
WARNING:tensorflow:From /Users/chris.elion/code/ml-agents/ml-agents/mlagents/trainers/tensorflow_to_barracuda.py:1541: The name tf.GraphDef is deprecated. Please use tf.compat.v1.GraphDef instead.

IGNORED: StopGradient unknown layer
GLOBALS: 'is_continuous_control', 'version_number', 'memory_size', 'action_output_shape'
IN: 'vector_observation': [-1, 1, 1, 894] => 'main_graph_0/hidden_0/BiasAdd'
IN: 'epsilon': [-1, 1, 1, 2] => 'mul'
OUT: 'action', 'action_probs'
DONE: wrote ./gpu_inference.nn file.

I'm going to have to let the Barracuda folks handle this after the conference.
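
Editor's note: the converter output above can also be checked mechanically, for example to confirm the input shapes and output names before importing the model. The following is a minimal sketch that assumes the log keeps exactly the line format shown in this comment. `summarize_conversion_log` is a hypothetical helper, not part of ml-agents or Barracuda.

```python
import re

# Log lines copied from the converter output in this thread.
LOG = """\
IGNORED: StopGradient unknown layer
GLOBALS: 'is_continuous_control', 'version_number', 'memory_size', 'action_output_shape'
IN: 'vector_observation': [-1, 1, 1, 894] => 'main_graph_0/hidden_0/BiasAdd'
IN: 'epsilon': [-1, 1, 1, 2] => 'mul'
OUT: 'action', 'action_probs'
"""

# Matches: IN: '<name>': [<shape>] => '<consumer op>'
IN_RE = re.compile(r"^IN: '([^']+)': \[([^\]]*)\] => '([^']+)'$")
QUOTED_RE = re.compile(r"'([^']+)'")


def summarize_conversion_log(text):
    """Collect inputs, outputs, globals, and ignored layers from the log text."""
    summary = {"inputs": {}, "outputs": [], "globals": [], "ignored": []}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        match = IN_RE.match(line)
        if match:
            name, shape, _consumer = match.groups()
            summary["inputs"][name] = [int(dim) for dim in shape.split(",")]
        elif line.startswith("OUT:"):
            summary["outputs"] = QUOTED_RE.findall(line)
        elif line.startswith("GLOBALS:"):
            summary["globals"] = QUOTED_RE.findall(line)
        elif line.startswith("IGNORED:"):
            summary["ignored"].append(line[len("IGNORED:"):].strip())
    return summary


summary = summarize_conversion_log(LOG)
print(summary["inputs"])
print(summary["outputs"])
```

Lines that match none of the prefixes, such as the TensorFlow deprecation warning, are skipped.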

mantasp commented 4 years ago

Issue added to our internal tracker.

mantasp commented 4 years ago

The issue was fixed on the Barracuda side, and the fix should appear in the next Barracuda release.

fog9001 commented 4 years ago

Thank you @mantasp

mantasp commented 4 years ago

@fog9001 Barracuda 0.4.0-preview shipped yesterday. Could you please update your Barracuda version to this one (via the Unity Package Manager) and see if it fixes your issue? Note that you might need to "re-import" the .nn files in your project after this upgrade. You can do this by right-clicking a .nn file and picking "Reimport".

fog9001 commented 4 years ago

Fixed. I just updated to 0.4.0-preview and it worked with no issues. Thank you so much, @mantasp.
