Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
https://unity.com/products/machine-learning-agents

Errors and warnings when running inference with SAC + vis_encode_type: resnet model #5341

Closed MrOCW closed 2 years ago

MrOCW commented 3 years ago

Describe the bug

I get many errors and warnings when running inference with a trained SAC ONNX model. Running inference on CPU removes some of the errors and warnings, but AssertionException: Assertion failure. Values are not equal. remains. After training another SAC model with vis_encode_type: simple instead of resnet, inference on GPU works fine, so the issue seems to lie with the IMPALA/resnet encoder, and maybe with nature_cnn as well.

Also, with vis_encode_type: simple, my training just stops after about 90000+ steps with [subprocess_env_manager.py:220] UnityEnvironment worker 0: environment stopping. I am able to resume, but then I get this instead:

2021-05-05 22:54:20 INFO [stats.py:188] Hyperparameters for behavior name Car: 
    trainer_type:   sac
    hyperparameters:    
      learning_rate:    0.0003
      learning_rate_schedule:   constant
      batch_size:   128
      buffer_size:  100000
      buffer_init_steps:    10
      tau:  0.005
      steps_per_update: 10.0
      save_replay_buffer:   True
      init_entcoef: 0.01
      reward_signal_steps_per_update:   10.0
    network_settings:   
      normalize:    False
      hidden_units: 128
      num_layers:   1
      vis_encode_type:  simple
      memory:   None
    reward_signals: 
      extrinsic:    
        gamma:  0.99
        strength:   1.0
        network_settings:   
          normalize:    False
          hidden_units: 128
          num_layers:   2
          vis_encode_type:  simple
          memory:   None
    init_path:  None
    keep_checkpoints:   5
    checkpoint_interval:    100000
    max_steps:  750000
    time_horizon:   4
    summary_freq:   1000
    threaded:   True
    self_play:  None
    behavioral_cloning: None
2021-05-05 22:54:23 INFO [trainer.py:119] Loading Experience Replay Buffer from results/CarSquare3/Car/last_replay_buffer.hdf5
2021-05-05 22:54:23 WARNING [rl_trainer.py:180] Trainer has no policies, not saving anything.
2021-05-05 22:54:23 INFO [trainer.py:107] Saving Experience Replay Buffer to results/CarSquare3/Car/last_replay_buffer.hdf5...
2021-05-05 22:54:23 INFO [trainer.py:111] Saved Experience Replay Buffer (800 bytes).
2021-05-05 22:54:23 INFO [trainer_controller.py:81] Saved Model
Traceback (most recent call last):
  File "/home/student/anaconda3/envs/mlagents1.9.1/bin/mlagents-learn", line 8, in <module>
    sys.exit(main())
  File "/home/student/anaconda3/envs/mlagents1.9.1/lib/python3.6/site-packages/mlagents/trainers/learn.py", line 250, in main
    run_cli(parse_command_line())
  File "/home/student/anaconda3/envs/mlagents1.9.1/lib/python3.6/site-packages/mlagents/trainers/learn.py", line 246, in run_cli
    run_training(run_seed, options)
  File "/home/student/anaconda3/envs/mlagents1.9.1/lib/python3.6/site-packages/mlagents/trainers/learn.py", line 125, in run_training
    tc.start_learning(env_manager)
  File "/home/student/anaconda3/envs/mlagents1.9.1/lib/python3.6/site-packages/mlagents_envs/timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "/home/student/anaconda3/envs/mlagents1.9.1/lib/python3.6/site-packages/mlagents/trainers/trainer_controller.py", line 173, in start_learning
    self._reset_env(env_manager)
  File "/home/student/anaconda3/envs/mlagents1.9.1/lib/python3.6/site-packages/mlagents_envs/timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "/home/student/anaconda3/envs/mlagents1.9.1/lib/python3.6/site-packages/mlagents/trainers/trainer_controller.py", line 107, in _reset_env
    self._register_new_behaviors(env_manager, env_manager.first_step_infos)
  File "/home/student/anaconda3/envs/mlagents1.9.1/lib/python3.6/site-packages/mlagents/trainers/trainer_controller.py", line 267, in _register_new_behaviors
    self._create_trainers_and_managers(env_manager, new_behavior_ids)
  File "/home/student/anaconda3/envs/mlagents1.9.1/lib/python3.6/site-packages/mlagents/trainers/trainer_controller.py", line 166, in _create_trainers_and_managers
    self._create_trainer_and_manager(env_manager, behavior_id)
  File "/home/student/anaconda3/envs/mlagents1.9.1/lib/python3.6/site-packages/mlagents/trainers/trainer_controller.py", line 140, in _create_trainer_and_manager
    create_graph=True,
  File "/home/student/anaconda3/envs/mlagents1.9.1/lib/python3.6/site-packages/mlagents/trainers/trainer/rl_trainer.py", line 119, in create_policy
    return self.create_torch_policy(parsed_behavior_id, behavior_spec)
  File "/home/student/anaconda3/envs/mlagents1.9.1/lib/python3.6/site-packages/mlagents/trainers/sac/trainer.py", line 240, in create_torch_policy
    self.maybe_load_replay_buffer()
  File "/home/student/anaconda3/envs/mlagents1.9.1/lib/python3.6/site-packages/mlagents/trainers/sac/trainer.py", line 212, in maybe_load_replay_buffer
    self.load_replay_buffer()
  File "/home/student/anaconda3/envs/mlagents1.9.1/lib/python3.6/site-packages/mlagents/trainers/sac/trainer.py", line 121, in load_replay_buffer
    self.update_buffer.load_from_file(file_object)
  File "/home/student/anaconda3/envs/mlagents1.9.1/lib/python3.6/site-packages/mlagents/trainers/buffer.py", line 453, in load_from_file
    with h5py.File(file_object, "r") as read_file:
  File "/home/student/.local/lib/python3.6/site-packages/h5py/_hl/files.py", line 408, in __init__
    swmr=swmr)
  File "/home/student/.local/lib/python3.6/site-packages/h5py/_hl/files.py", line 173, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (file signature not found)

When I try again:

2021-05-05 23:03:57 INFO [trainer.py:119] Loading Experience Replay Buffer from results/CarSquare3/Car/last_replay_buffer.hdf5
2021-05-05 23:03:57 INFO [trainer.py:124] Experience replay buffer has 0 experiences.
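
The OSError in the first traceback (file signature not found) is raised when the file being opened does not contain the HDF5 format signature, which may mean the saved replay buffer is empty, truncated, or not an HDF5 file at all. As a rough sketch only, and not part of ML-Agents, the check below looks for the 8-byte HDF5 signature using just the Python standard library. The function name has_hdf5_signature is hypothetical, chosen for illustration; the path is the one shown in the log above.

```python
from pathlib import Path

# The 8-byte signature that begins every HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def has_hdf5_signature(path):
    """Return True if the file carries the HDF5 signature.

    The signature may sit at byte 0 or, when the file has a user block,
    at offset 512, 1024, 2048, and so on (doubling each time).
    """
    size = Path(path).stat().st_size
    with open(path, "rb") as f:
        offset = 0
        while offset + len(HDF5_SIGNATURE) <= size:
            f.seek(offset)
            if f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return True
            offset = 512 if offset == 0 else offset * 2
    return False


if __name__ == "__main__":
    # Path taken from the log above; adjust it for your own run.
    buffer_path = Path("results/CarSquare3/Car/last_replay_buffer.hdf5")
    if not buffer_path.exists():
        print(f"{buffer_path} does not exist")
    elif has_hdf5_signature(buffer_path):
        print(f"{buffer_path} looks like an HDF5 file")
    else:
        print(f"{buffer_path} has no HDF5 signature ({buffer_path.stat().st_size} bytes)")
```

If the check reports no signature, the file on disk is not something h5py can open, so a resume would most likely have to start from an empty replay buffer.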

To Reproduce

Steps to reproduce the behavior:

  1. Train SAC model with vis_encode_type: resnet, CameraSensor 144x108 Grayscale, time horizon 4
  2. Run inference with GPU
  3. Repeat step 1 with vis_encode_type: simple, using the parameters from the configuration shown above.
  4. Resume training if the environment stops.


Environment:

surfnerd commented 3 years ago

Hi @MrOCW, Thanks for bringing this conversation to our GitHub issues. Could you provide the version of the mlagents pip package you are using?

MrOCW commented 3 years ago

@surfnerd 0.25.1

surfnerd commented 3 years ago

Thanks. Are you sure that the inputs to your trained model are the same as what you are currently using in your project? I'd like to make sure my understanding is correct.

The issue you seem to be having is that inference works on CPU but not on GPU, and the errors you are seeing occur when you run inference on GPU. Is that correct?

MrOCW commented 3 years ago

@surfnerd yup! just 1 CameraSensor with width and height unchanged

These errors are seen when inference is done on GPU. On CPU, the only issue is: AssertionException: Assertion failure. Values are not equal.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had activity in the last 28 days. It will be closed in the next 14 days if no further activity occurs. Thank you for your contributions.

stale[bot] commented 2 years ago

This issue has been automatically closed because it has not had activity in the last 42 days. If this issue is still valid, please ping a maintainer. Thank you for your contributions.

github-actions[bot] commented 2 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.