Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
https://unity.com/products/machine-learning-agents

vis_encoder_type is overridden #5614

Closed Simple2Sample closed 2 years ago

Simple2Sample commented 2 years ago

Describe the bug
I'm having an issue where vis_encode_type defaults to simple regardless of what option or network size I use. I'm trying to make it create a fully_connected network, but ML-Agents simply ignores the parameter for some reason. Since fully_connected does not have a minimum size, it should not, in my opinion, override my vis_encode_type setting.

The ML.yaml file I used:

  ML:
    trainer_type: ppo
    hyperparameters:
      batch_size: 1024
      buffer_size: 32768
      learning_rate: 2.5e-3
      learning_rate_schedule: constant
      beta: 1e-2
      epsilon: 0.1
      lambd: 0.98
      num_epoch: 2
    network_settings:
      normalize: false
      hidden_units: 32
      num_layers: 2
      vis_encoder_type: fully_connected
    reward_signals:
      extrinsic:
        strength: 1.0
        gamma: 0.999
      curiosity:
        strength: 0.01
        gamma: 0.8
    max_steps: 3e7
    time_horizon: 32
    summary_freq: 50000
    keep_checkpoints: 5
    checkpoint_interval: 500000
    threaded: true
    init_path: null

This is what the terminal outputs. Notice the change in vis_encode_type:

[INFO] Hyperparameters for behavior name ML:
        trainer_type:   ppo
        hyperparameters:
          batch_size:   1024
          buffer_size:  32768
          learning_rate:        0.0025
          beta: 0.01
          epsilon:      0.1
          lambd:        0.98
          num_epoch:    2
          learning_rate_schedule:       constant
        network_settings:
          normalize:    False
          hidden_units: 32
          num_layers:   2
          vis_encode_type:      simple
          memory:       None
          goal_conditioning_type:       hyper
        reward_signals:
          extrinsic:
            gamma:      0.999
            strength:   1.0
            network_settings:
              normalize:        False
              hidden_units:     128
              num_layers:       2
              vis_encode_type:  simple
              memory:   None
              goal_conditioning_type:   hyper
          curiosity:
            gamma:      0.8
            strength:   0.01
            network_settings:
              normalize:        False
              hidden_units:     128
              num_layers:       2
              vis_encode_type:  simple
              memory:   None
              goal_conditioning_type:   hyper
            learning_rate:      0.0003
            encoding_size:      None
        init_path:      None
        keep_checkpoints:       5
        checkpoint_interval:    500000
        max_steps:      30000000
        time_horizon:   32
        summary_freq:   50000
        threaded:       True
        self_play:      None
        behavioral_cloning:     None


Simple2Sample commented 2 years ago

I think I found the issue. I thought vis_encode_type applied convolutions to all of the observations, not just the visual ones. I also don't have any visual observations in my project at this time. Does this mean it just defaults to "simple" because I have no visual observations?

Follow-up question: Are there any ways to add convolution layers or change the width of individual layers in the NN? I assume the NN is fully connected.

andrewcoh commented 2 years ago

Hi @Simple2Sample

The available types for vis_encode_type are documented here: https://github.com/Unity-Technologies/ml-agents/blob/main/docs/Training-Configuration-File.md#common-trainer-configurations

The default is 'simple'. It seems that we did not clean up our documentation though, as fully_connected is listed as a potential option in the above documentation but is not supported in the code. I will clean up the documentation and add a catch so that specifying an unsupported visual encoder type will raise a warning.
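
If it helps, you can check which encoder names your installed release actually accepts by inspecting the EncoderType enum. This is a minimal sketch that assumes the enum is exposed from mlagents.trainers.settings, as it is in recent releases:

    # List the vis_encode_type values this ml-agents installation accepts.
    # Assumes EncoderType is defined in mlagents.trainers.settings (true for
    # recent releases); an option missing from this list is not supported.
    from mlagents.trainers.settings import EncoderType

    for encoder in EncoderType:
        print(encoder.value)

Printing these makes it easy to confirm whether the option you want exists in the version you have installed.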

That being said, a visual encoder will only be instantiated if your agent has a camera sensor. Unfortunately, it's not possible to modify the convolution hyperparameters from the YAML. However, if you are comfortable doing so, you can modify the encoders directly here: https://github.com/Unity-Technologies/ml-agents/blob/main/ml-agents/mlagents/trainers/torch/encoders.py#L177
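
For orientation, the stock visual encoders in that file are small PyTorch modules: a couple of conv layers followed by a dense projection. The sketch below only illustrates that shape, with made-up names and layer sizes rather than the actual ML-Agents implementation; editing the real classes in encoders.py follows the same pattern.

    import torch
    from torch import nn

    class VisualEncoderSketch(nn.Module):
        """Illustrative stand-in for a 'simple'-style visual encoder:
        two conv layers followed by a fully connected projection."""

        def __init__(self, height: int, width: int, channels: int, hidden_units: int):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(channels, 16, kernel_size=8, stride=4),
                nn.LeakyReLU(),
                nn.Conv2d(16, 32, kernel_size=4, stride=2),
                nn.LeakyReLU(),
            )
            # Work out the flattened conv output size with a dummy forward pass.
            with torch.no_grad():
                flat_size = self.conv(torch.zeros(1, channels, height, width)).numel()
            self.dense = nn.Sequential(nn.Linear(flat_size, hidden_units), nn.LeakyReLU())

        def forward(self, visual_obs: torch.Tensor) -> torch.Tensor:
            # visual_obs is expected as [batch, channels, height, width].
            return self.dense(self.conv(visual_obs).flatten(start_dim=1))

Any edits to the real encoders would need to keep the output size consistent with what the rest of the network expects.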

Simple2Sample commented 2 years ago

Ah thanks a lot for the clarification!

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had activity in the last 28 days. It will be closed in the next 14 days if no further activity occurs. Thank you for your contributions.

github-actions[bot] commented 2 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.