Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
https://unity.com/products/machine-learning-agents

Behavioral Cloning (BC) / pretraining with recurrent=true #2962

Closed njustesen closed 4 years ago

njustesen commented 4 years ago

Describe the bug

I get the error message below during behavioral cloning or pretraining with recurrent=true.

To Reproduce

Steps to reproduce the behavior:

  1. Set up a new conda environment with python=3.7
  2. Pull and install ml-agents: pip install .
  3. Set recurrent=true in config/offline_bc_config.yaml
  4. Run mlagents-learn config/offline_bc_config.yaml --run-id=pyramids-update-rnn-test --train
  5. Press play in the Basic Unity scene from examples.
  6. Observe the error within a few seconds.

Console logs / stack traces

$ mlagents-learn config/offline_bc_config.yaml --run-id=rnn-bc-test --train
Traceback (most recent call last):
  File "/Users/njustesen/anaconda3/envs/ml-test/bin/mlagents-learn", line 8, in <module>
    sys.exit(main())
  File "/Users/njustesen/anaconda3/envs/ml-test/lib/python3.7/site-packages/mlagents/trainers/learn.py", line 408, in main
    run_training(0, run_seed, options, Queue())
  File "/Users/njustesen/anaconda3/envs/ml-test/lib/python3.7/site-packages/mlagents/trainers/learn.py", line 253, in run_training
    tc.start_learning(env)
  File "/Users/njustesen/anaconda3/envs/ml-test/lib/python3.7/site-packages/mlagents/trainers/trainer_controller.py", line 209, in start_learning
    n_steps = self.advance(env_manager)
  File "/Users/njustesen/anaconda3/envs/ml-test/lib/python3.7/site-packages/mlagents/envs/timers.py", line 263, in wrapped
    return func(*args, **kwargs)
  File "/Users/njustesen/anaconda3/envs/ml-test/lib/python3.7/site-packages/mlagents/trainers/trainer_controller.py", line 297, in advance
    trainer.update_policy()
  File "/Users/njustesen/anaconda3/envs/ml-test/lib/python3.7/site-packages/mlagents/trainers/bc/trainer.py", line 138, in update_policy
    run_out = self.policy.update(mini_batch, self.n_sequences)
  File "/Users/njustesen/anaconda3/envs/ml-test/lib/python3.7/site-packages/mlagents/trainers/bc/policy.py", line 95, in update
    run_out = self._execute_model(feed_dict, self.update_dict)
  File "/Users/njustesen/anaconda3/envs/ml-test/lib/python3.7/site-packages/mlagents/trainers/tf_policy.py", line 151, in _execute_model
    network_out = self.sess.run(list(out_dict.values()), feed_dict=feed_dict)
  File "/Users/njustesen/anaconda3/envs/ml-test/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/Users/njustesen/anaconda3/envs/ml-test/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1156, in _run
    (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (0,) for Tensor 'teacher_action:0', which has shape '(?, 1)'
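
To make the final ValueError easier to read: TensorFlow rejects a fed value when its shape is incompatible with the placeholder's declared shape. The sketch below is a simplified, pure-Python approximation of that check, not TensorFlow's own code; the helper name `is_feedable` and the constant `TEACHER_ACTION_SHAPE` are hypothetical and exist only for illustration. It shows why a value of shape (0,) cannot be fed to a placeholder of shape (?, 1), without claiming anything about why the batch was empty in the first place.

```python
from typing import Optional, Sequence


def is_feedable(
    value_shape: Sequence[int],
    placeholder_shape: Sequence[Optional[int]],
) -> bool:
    """Approximate the shape check applied to a fed value.

    None in placeholder_shape stands for an unknown dimension ("?").
    The ranks must match, and every known dimension must be equal.
    """
    if len(value_shape) != len(placeholder_shape):
        return False
    return all(
        expected is None or expected == actual
        for actual, expected in zip(value_shape, placeholder_shape)
    )


# Shape reported for 'teacher_action:0' in the traceback: (?, 1).
TEACHER_ACTION_SHAPE = (None, 1)

# An empty 1-D array has shape (0,): rank 1 versus rank 2, so it is rejected.
print(is_feedable((0,), TEACHER_ACTION_SHAPE))     # False
# A batch of 32 single-valued actions matches (?, 1).
print(is_feedable((32, 1), TEACHER_ACTION_SHAPE))  # True
# An empty batch with the right rank would still pass the shape check.
print(is_feedable((0, 1), TEACHER_ACTION_SHAPE))   # True
```

The last case suggests that the error is about the rank of the fed value rather than the batch being empty as such: the value that reached `teacher_action` was a flat, empty array instead of a two-dimensional one.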


chriselion commented 4 years ago

@njustesen Thanks for the report. We think we found and fixed the problem in https://github.com/Unity-Technologies/ml-agents/pull/2965
