Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
https://unity.com/products/machine-learning-agents
Other
17.19k stars 4.16k forks source link

Pretrained with GAIL and Curiosity #2611

Closed TiNovTec closed 4 years ago

TiNovTec commented 5 years ago

Describe the bug Pretrained with GAIL and Curiosity doesn't seem to work. I seem to have the same problem like @arixlin. Without curiosity it works fine: https://github.com/Unity-Technologies/ml-agents/issues/2401

To Reproduce from gail_config.yaml: GridWorldLearning: summary_freq: 2000 time_horizon: 5 batch_size: 32 normalize: false buffer_size: 2048 hidden_units: 256 num_layers: 1 beta: 5.0e-3 max_steps: 5.0e4 num_epoch: 3 pretraining: demo_path: ./demos/GridWorldTrained.demo strength: 0.5 steps: 10000 reward_signals: extrinsic: strength: 1.0 gamma: 0.99 curiosity: strength: 0.02 gamma: 0.99 encoding_size: 256 gail: strength: 0.01 gamma: 0.99 encoding_size: 128 demo_path: ./demos/GridWorldTrained.demo

Console logs / stack traces Errors in Anaconda: Traceback (most recent call last): File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\runpy.py", line 193, in _run_module_as_main "main", mod_spec) File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\runpy.py", line 85, in run_code exec(code, run_globals) File "C:\Users\twi\AppData\Local\Continuum\anaconda3\envs\ml-agents\Scripts\mlagents-learn.exe_main.py", line 9, in File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\mlagents\trainers\learn.py", line 319, in main run_training(0, run_seed, options, Queue()) File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\mlagents\trainers\learn.py", line 118, in run_training tc.start_learning(env, trainer_config) File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\mlagents\trainers\trainer_controller.py", line 283, in start_learning self.initialize_trainers(trainer_config, env_manager.external_brains) File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\mlagents\trainers\trainer_controller.py", line 206, in initialize_trainers run_id=self.run_id, File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\mlagents\trainers\ppo\trainer.py", line 68, in init self.policy = PPOPolicy(seed, brain, trainer_parameters, self.is_training, load) File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\mlagents\trainers\ppo\policy.py", line 57, in init self, reward_signal, config File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\mlagents\trainers\components\reward_signals\reward_signal_factory.py", line 40, in create_reward_signal class_inst = rcls(policy, *config_entry) File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\mlagents\trainers\components\reward_signals\gail\signal.py", line 51, in init policy.model, 128, learning_rate, encoding_size, use_actions, use_vail File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\mlagents\trainers\components\reward_signals\gail\model.py", line 41, in init self.make_inputs() File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\mlagents\trainers\components\reward_signals\gail\model.py", line 125, in make_inputs False, File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\mlagents\trainers\models.py", line 249, in create_visual_observation_encoder name="conv_1", File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\tensorflow\python\layers\convolutional.py", line 619, in conv2d return layer.apply(inputs) File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\tensorflow\python\layers\base.py", line 825, in apply return self.call(inputs, args, kwargs) File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\tensorflow\python\layers\base.py", line 696, in call self.build(input_shapes) File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\tensorflow\python\layers\convolutional.py", line 144, in build dtype=self.dtype) File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\tensorflow\python\layers\base.py", line 546, in add_variable partitioner=partitioner) File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\tensorflow\python\training\checkpointable.py", line 415, in _add_variable_with_custom_getter kwargs_for_getter) File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1297, in get_variable constraint=constraint) File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1093, in get_variable constraint=constraint) File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 439, in get_variable constraint=constraint) File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 408, in _true_getter use_resource=use_resource, constraint=constraint) File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 747, in _get_single_variable name, "".join(traceback.format_list(tb)))) ValueError: Variable stream_0_visual_obs_encoder/conv_1/kernel already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:

File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\tensorflow\python\framework\ops.py", line 1654, in init self._traceback = self._graph._extract_stack() # pylint: disable=protected-access File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\tensorflow\python\framework\ops.py", line 3290, in create_op op_def=op_def) File "c:\users\twi\appdata\local\continuum\anaconda3\envs\ml-agents\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper op_def=op_def)

Environment (please complete the following information):

ervteng commented 5 years ago

Hi @TiNovTec, as far as I know this bug was fixed back in 0.9.1. You might have to do a pip install --upgrade ml-agents to get the latest build installed in your virtual environment.

TiNovTec commented 5 years ago

@ervteng When I tried to use your command in Anaconda with environment set to ml-agents, I got this error: Could not find a version that satisfies the requirement ml-agents (from versions: ) No matching distribution found for ml-agents You are using pip version 9.0.1, however version 19.2.3 is available. You should consider upgrading via the 'python -m pip install --upgrade pip' command.

Then I updated the pip to 19.2.3 but still get this error: ERROR: Could not find a version that satisfies the requirement ml-agents (from versions: none) ERROR: No matching distribution found for ml-agents

ervteng commented 5 years ago

Hi @TiNovTec, did you install using pip from the directory? Try installing again from the directory in that case. Maybe uninstall ml-agents using pip, and reinstall?

ervteng commented 4 years ago

This bug has been fixed - closing the issue for now. Feel free to post a new issue if you're still having problems.

github-actions[bot] commented 3 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.