Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
https://unity.com/products/machine-learning-agents
Other
16.93k stars 4.14k forks source link

Environment Parameters currently not showing correctly in Tensorboard? #4325

Closed Ihsees closed 4 years ago

Ihsees commented 4 years ago

Describe the bug For me all environment-parameters (as configured in the config.yaml file) show up as 0s in tensorboard at all timesteps.

To Reproduce Run Training on the WallJump-scene with default settings and the following command: mlagents-learn config\ppo\WallJump_curriculum.yaml --run-id=walljump_curr --quality-level=1 --time-scale=5

Console logs / stack traces Please wrap in triple backticks (```) to make it easier to read.

(tensorflow-gpu) C:\Users\User\Documents\kicker_mlAgents_connectedThroughGithub\ml-agents>mlagents-learn config\ppo\WallJump_curriculum.yaml --run-id=walljump_curr --quality-level=1 --time-scale=5 

                        ▄▄▄▓▓▓▓
                   ╓▓▓▓▓▓▓█▓▓▓▓▓
              ,▄▄▄m▀▀▀'  ,▓▓▓▀▓▓▄                           ▓▓▓  ▓▓▌
            ▄▓▓▓▀'      ▄▓▓▀  ▓▓▓      ▄▄     ▄▄ ,▄▄ ▄▄▄▄   ,▄▄ ▄▓▓▌▄ ▄▄▄    ,▄▄
          ▄▓▓▓▀        ▄▓▓▀   ▐▓▓▌     ▓▓▌   ▐▓▓ ▐▓▓▓▀▀▀▓▓▌ ▓▓▓ ▀▓▓▌▀ ^▓▓▌  ╒▓▓▌
        ▄▓▓▓▓▓▄▄▄▄▄▄▄▄▓▓▓      ▓▀      ▓▓▌   ▐▓▓ ▐▓▓    ▓▓▓ ▓▓▓  ▓▓▌   ▐▓▓▄ ▓▓▌
        ▀▓▓▓▓▀▀▀▀▀▀▀▀▀▀▓▓▄     ▓▓      ▓▓▌   ▐▓▓ ▐▓▓    ▓▓▓ ▓▓▓  ▓▓▌    ▐▓▓▐▓▓
          ^█▓▓▓        ▀▓▓▄   ▐▓▓▌     ▓▓▓▓▄▓▓▓▓ ▐▓▓    ▓▓▓ ▓▓▓  ▓▓▓▄    ▓▓▓▓`
            '▀▓▓▓▄      ^▓▓▓  ▓▓▓       └▀▀▀▀ ▀▀ ^▀▀    `▀▀ `▀▀   '▀▀    ▐▓▓▌
               ▀▀▀▀▓▄▄▄   ▓▓▓▓▓▓,                                      ▓▓▓▓▀
                   `▀█▓▓▓▓▓▓▓▓▓▌
                        ¬`▀▀▀█▓

 Version information:
  ml-agents: 0.18.1,
  ml-agents-envs: 0.18.1,
  Communicator API: 1.0.0,
  TensorFlow: 1.7.1
2020-08-09 17:36:26 INFO [environment.py:199] Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
2020-08-09 17:36:32 INFO [environment.py:108] Connected to Unity environment with package version 1.2.0-preview and communication version 1.0.0
2020-08-09 17:36:32 INFO [environment.py:265] Connected new brain:
SmallWallJump?team=0
2020-08-09 17:36:32.833849: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-08-09 17:36:32.908663: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1344] Found device 0 with properties:
name: GeForce GTX 770 major: 3 minor: 0 memoryClockRate(GHz): 1.163
pciBusID: 0000:01:00.0
totalMemory: 2.00GiB freeMemory: 1.63GiB
2020-08-09 17:36:32.916024: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1423] Adding visible gpu devices: 0
2020-08-09 17:36:33.206970: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-09 17:36:33.213265: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:917]      0
2020-08-09 17:36:33.218066: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:930] 0:   N
2020-08-09 17:36:33.222372: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1415 MB memory) -> physical GPU (device: 0, name: GeForce GTX 770, pci bus id: 0000:01:00.0, compute capability: 3.0)
2020-08-09 17:36:33 INFO [stats.py:131] Hyperparameters for behavior name SmallWallJump:
        trainer_type:   ppo
        hyperparameters:
          batch_size:   128
          buffer_size:  2048
          learning_rate:        0.0003
          beta: 0.005
          epsilon:      0.2
          lambd:        0.95
          num_epoch:    3
          learning_rate_schedule:       linear
        network_settings:
          normalize:    False
          hidden_units: 256
          num_layers:   2
          vis_encode_type:      simple
          memory:       None
        reward_signals:
          extrinsic:
            gamma:      0.99
            strength:   1.0
        init_path:      None
        keep_checkpoints:       5
        checkpoint_interval:    500000
        max_steps:      5000000
        time_horizon:   128
        summary_freq:   20000
        threaded:       True
        self_play:      None
        behavioral_cloning:     None
2020-08-09 17:36:33.262881: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1423] Adding visible gpu devices: 0
2020-08-09 17:36:33.267551: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-09 17:36:33.273643: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:917]      0
2020-08-09 17:36:33.277012: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:930] 0:   N
2020-08-09 17:36:33.281510: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1415 MB memory) -> physical GPU (device: 0, name: GeForce GTX 770, pci bus id: 0000:01:00.0, compute capability: 3.0)
2020-08-09 17:36:35 INFO [environment.py:265] Connected new brain:
BigWallJump?team=0
2020-08-09 17:36:35 WARNING [env_manager.py:117] Agent manager was not created for behavior id BigWallJump?team=0.
2020-08-09 17:36:35.651323: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1423] Adding visible gpu devices: 0
2020-08-09 17:36:35.657418: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-09 17:36:35.662435: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:917]      0
2020-08-09 17:36:35.665997: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:930] 0:   N
2020-08-09 17:36:35.669045: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1415 MB memory) -> physical GPU (device: 0, name: GeForce GTX 770, pci bus id: 0000:01:00.0, compute capability: 3.0)
2020-08-09 17:36:35 INFO [stats.py:131] Hyperparameters for behavior name BigWallJump:
        trainer_type:   ppo
        hyperparameters:
          batch_size:   128
          buffer_size:  2048
          learning_rate:        0.0003
          beta: 0.005
          epsilon:      0.2
          lambd:        0.95
          num_epoch:    3
          learning_rate_schedule:       linear
        network_settings:
          normalize:    False
          hidden_units: 256
          num_layers:   2
          vis_encode_type:      simple
          memory:       None
        reward_signals:
          extrinsic:
            gamma:      0.99
            strength:   1.0
        init_path:      None
        keep_checkpoints:       5
        checkpoint_interval:    500000
        max_steps:      20000000
        time_horizon:   128
        summary_freq:   20000
        threaded:       True
        self_play:      None
        behavioral_cloning:     None
2020-08-09 17:36:35.694312: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1423] Adding visible gpu devices: 0
2020-08-09 17:36:35.700461: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-09 17:36:35.707927: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:917]      0
2020-08-09 17:36:35.712857: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:930] 0:   N
2020-08-09 17:36:35.715894: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1415 MB memory) -> physical GPU (device: 0, name: GeForce GTX 770, pci bus id: 0000:01:00.0, compute capability: 3.0)
2020-08-09 17:37:50 INFO [stats.py:112] BigWallJump: Step: 20000. Time Elapsed: 85.775 s Mean Reward: -0.840. Std of Reward: 0.658. Training.
2020-08-09 17:38:29 INFO [stats.py:112] SmallWallJump: Step: 20000. Time Elapsed: 125.354 s Mean Reward: -0.796. Std of Reward: 0.679. Training.
2020-08-09 17:39:14 INFO [stats.py:112] BigWallJump: Step: 40000. Time Elapsed: 169.879 s Mean Reward: 0.090. Std of Reward: 0.922. Training.
2020-08-09 17:40:09 INFO [stats.py:112] SmallWallJump: Step: 40000. Time Elapsed: 224.578 s Mean Reward: -0.127. Std of Reward: 0.946. Training.

Screenshots If applicable, add screenshots to help explain your problem. grafik

Environment (please complete the following information):

  ml-agents: 0.18.1,
  ml-agents-envs: 0.18.1,
  Communicator API: 1.0.0,
  TensorFlow: 1.7.1

NOTE: We are unable to help reproduce bugs with custom environments. Please attempt to reproduce your issue with one of the example environments, or provide a minimal patch to one of the environments needed to reproduce the issue.

Ihsees commented 4 years ago

It doesn't show the actual value, does it? In only shows the lesson?

It even SAYS it's the lesson... welp grafik

github-actions[bot] commented 3 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.