Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
https://unity.com/products/machine-learning-agents

KEY ERROR #2427

Closed: sunrui19941128 closed this issue 4 years ago

sunrui19941128 commented 5 years ago

I got this error when I started Unity during imitation learning. Help me, thanks! ML-Agents 0.9.0:

Traceback (most recent call last):
  File "D:\Anaconda3\envs\ml-agents-tutorial\Scripts\mlagents-learn-script.py", line 11, in <module>
    load_entry_point('mlagents', 'console_scripts', 'mlagents-learn')()
  File "e:\githubproject\ml-agents\ml-agents\mlagents\trainers\learn.py", line 319, in main
    run_training(0, run_seed, options, Queue())
  File "e:\githubproject\ml-agents\ml-agents\mlagents\trainers\learn.py", line 118, in run_training
    tc.start_learning(env, trainer_config)
  File "e:\githubproject\ml-agents\ml-agents\mlagents\trainers\trainer_controller.py", line 297, in start_learning
    n_steps = self.advance(env_manager)
  File "e:\githubproject\ml-agents\ml-agents-envs\mlagents\envs\timers.py", line 210, in wrapped
    return func(*args, **kwargs)
  File "e:\githubproject\ml-agents\ml-agents\mlagents\trainers\trainer_controller.py", line 364, in advance
    new_step_infos = env.step()
  File "e:\githubproject\ml-agents\ml-agents-envs\mlagents\envs\subprocess_env_manager.py", line 166, in step
    self._queue_steps()
  File "e:\githubproject\ml-agents\ml-agents-envs\mlagents\envs\subprocess_env_manager.py", line 159, in _queue_steps
    env_action_info = self._take_step(env_worker.previous_step)
  File "e:\githubproject\ml-agents\ml-agents-envs\mlagents\envs\timers.py", line 210, in wrapped
    return func(*args, **kwargs)
  File "e:\githubproject\ml-agents\ml-agents-envs\mlagents\envs\subprocess_env_manager.py", line 253, in _take_step
    all_action_info[brain_name] = self.policies[brain_name].get_action(
KeyError: 'RollerBallPlayer'

mnsmuts commented 5 years ago

Did you find a solution? I am hitting the same problem (I suspect it is something simple): the same setup trains offline but will not train online.

ervteng commented 5 years ago

Hi @sunrui19941128, @mnsmuts, can you post your yaml config file?

sunrui19941128 commented 5 years ago

OK, submitting it right away. I copied trainer_config.yaml and modified the default config. See the screenshot for details:

[screenshot: trainer_config.yaml]

sunrui19941128 commented 5 years ago

> Hi @sunrui19941128, @mnsmuts, can you post your yaml config file?

I didn't have this problem in version 0.8.1, but I ran into it in 0.9.0. I tried a lot of things in 0.9.0 and nothing worked, so I went back to a 0.8.1 environment and can now train the demo normally.

sunrui19941128 commented 5 years ago

> Hi @sunrui19941128, @mnsmuts, can you post your yaml config file?

[screenshot: trainer_config.yaml]

yamashin0922 commented 5 years ago

Can you check whether RollerBallPlayer exists in the Broadcast Hub in the Inspector window of your Academy? If it does, try removing RollerBallPlayer from the Academy's Inspector window.

smshehryar commented 5 years ago

Are there any issues with doing that?

yamashin0922 commented 5 years ago

I'm not sure whether the root cause is there. However, I faced the same issue, and I found this as a workaround.

sunrui19941128 commented 5 years ago

> I'm not sure whether the root cause is there. However, I faced the same issue, and I found this as a workaround.

Which version of ml-agents are you using now?

yamashin0922 commented 5 years ago

I'm using 0.9.0.

sunrui19941128 commented 5 years ago

I tried it and it didn't work with 0.9.0! Then it worked normally in 0.8.1! It's really frustrating.

yamashin0922 commented 5 years ago

I understand your frustration. Do you mean that "KeyError: 'RollerBallPlayer'" is still shown in the error message even after you remove it from the Inspector window in 0.9.0?

mnsmuts commented 5 years ago

Hi, I can train in v0.8.2 but get the KeyError in v0.9.x. I am currently trying to roll back my environments to get back to a working system. Removing the player brain made no difference. YAML file:

[screenshot: trainer_config.yaml]

There is pretty much nothing in it except the default section. Unless that is not allowed?

mnsmuts commented 5 years ago

In my case I get:

[screenshot]

Then if I remove the player brain:

[screenshot]

mnsmuts commented 5 years ago

This is version v0.9.1, and I am removing the player brain from the Academy.

mnsmuts commented 5 years ago

Just in case it helps, here is the Anaconda prompt when trying exactly the same training (3DBall imitation) in v0.8.2:

[screenshot]

It seems to work well, which is why it feels like v0.9.1 has a bug.

nmndwivedi commented 5 years ago

> Can you check whether RollerBallPlayer exists in the Broadcast Hub in the Inspector window of your Academy? If it does, try removing RollerBallPlayer from the Academy's Inspector window.

This removes the KeyError, but the model does not train.

yamashin0922 commented 5 years ago

@sunrui19941128 Can you share a screen capture of the Academy's Inspector window?

sunrui19941128 commented 5 years ago

> @sunrui19941128 Can you share a screen capture of the Academy's Inspector window?

OK, here are 0.9.0 screenshots of the Academy. Picture one is before deleting the PlayerBrain; picture two is after deleting it. I tried your idea, but there are still KeyErrors. I suspect a version problem; at the same time, 0.8.1 has no problem.

[screenshot: Academy with PlayerBrain] [screenshot: Academy without PlayerBrain]

yamashin0922 commented 5 years ago

Thank you for sharing it. As you said, there seems to be a bug around there, since the behavior differs between the two versions.

To train RollerBallBrain (not RollerBallPlayer), you have to drag the RollerBallBrain asset onto the Brain field of the RollerAgent GameObject so that the agent uses the learning brain. Can you check the Brain field of your RollerAgent?

mnsmuts commented 5 years ago

Hi, yes, all correct: the agent has a learning brain. The Academy has a learning brain without Control checked when I create a build for offline training (.demo), or, when I am training online, both a learning brain with Control checked and a player brain.

ervteng commented 5 years ago

Hey @mnsmuts, @yamashin0922, I can confirm that there is a bug in v0.9.1 that is preventing PlayerBrains from being used. I've pushed a fix to the branch hotfix-onlinebc - try checking out that branch and see if it fixes your issue.

If you want to make the edit yourself, add "if brain_name in self.policies:" at line 253 in subprocess_env_manager.py. Thank you for bringing this up; it will be fixed in the next release.
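For reference, here is a minimal sketch of what _take_step looks like with that guard in place; the method body and variable names are inferred from the traceback at the top of this thread, not copied from the repository:

# Sketch of _take_step in mlagents/envs/subprocess_env_manager.py with the
# suggested guard added. Names are inferred from the traceback above, so
# treat this as illustrative rather than the exact upstream code.
def _take_step(self, last_step):
    all_action_info = {}
    for brain_name, brain_info in last_step.current_all_brain_info.items():
        # A PlayerBrain is driven by human input and has no trained policy,
        # so skip any brain that was never registered in self.policies
        # instead of raising KeyError (e.g. 'RollerBallPlayer').
        if brain_name in self.policies:
            all_action_info[brain_name] = self.policies[brain_name].get_action(
                brain_info
            )
    return all_action_info

With the guard in place, brains that have no registered policy are simply skipped when actions are queued.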

AidinD commented 5 years ago

I had the same problem moving up to v0.9. However, removing the player brain from the Academy hub seems to solve the issue, and the agent trains when Control is checked. At the same time, you can still use the player brain by unchecking Control on the learning brain.

mnsmuts commented 5 years ago

> Hey @mnsmuts, @yamashin0922, I can confirm that there is a bug in v0.9.1 that is preventing PlayerBrains from being used. I've pushed a fix to the branch hotfix-onlinebc - try checking out that branch and see if it fixes your issue.
>
> If you want to make the edit yourself, add "if brain_name in self.policies:" at line 253 in subprocess_env_manager.py. Thank you for bringing this up; it will be fixed in the next release.

Many thanks

joobei commented 5 years ago

I am also having a problem, with KeyError: 'reward_signals'. This is Ubuntu 18.04 with Unity 2019.1.14f1 on Anaconda Python 3.6.

pip list and conda info show:

mlagents      0.9.3  /home/jubei/.local/lib/python3.6/site-packages
mlagents-envs 0.9.3  /home/jubei/local/lib/python3.6/site-packages

Here is my log:

INFO:mlagents.trainers:{'--base-port': '5005', '--curriculum': 'None', '--debug': False, '--docker-target-name': 'None', '--env': 'None', '--help': False, '--keep-checkpoints': '5', '--lesson': '0', '--load': False, '--multi-gpu': False, '--no-graphics': False, '--num-envs': '1', '--num-runs': '1', '--run-id': 'test99', '--sampler': 'None', '--save-freq': '50000', '--seed': '-1', '--slow': False, '--train': True, '<trainer-config-path>': 'configs/maze_config.yaml'}
INFO:mlagents.envs:Start training by pressing the Play button in the Unity Editor.
INFO:mlagents.envs:
'Academy' started successfully!
Unity Academy name: Academy
        Number of Brains: 1
        Number of Training Brains : 1
        Reset Parameters :

Unity brain name: MazeLearningBrain
        Number of Visual Observations (per agent): 0
        Vector Observation space size (per agent): 80
        Number of stacked Vector Observation: 1
        Vector Action space type: continuous
        Vector Action space size (per agent): [5]
        Vector Action descriptions: , , , ,
Traceback (most recent call last):
  File "/home/jubei/.local/bin/mlagents-learn", line 11, in <module>
    load_entry_point('mlagents', 'console_scripts', 'mlagents-learn')()
  File "/home/jubei/coding/ml-agents/ml-agents/mlagents/trainers/learn.py", line 337, in main
    run_training(0, run_seed, options, Queue())
  File "/home/jubei/coding/ml-agents/ml-agents/mlagents/trainers/learn.py", line 110, in run_training
    multi_gpu,
  File "/home/jubei/coding/ml-agents/ml-agents/mlagents/trainers/trainer_util.py", line 89, in initialize_trainers
    multi_gpu,
  File "/home/jubei/coding/ml-agents/ml-agents/mlagents/trainers/ppo/trainer.py", line 47, in __init__
    brain, trainer_parameters, training, run_id, reward_buff_cap
  File "/home/jubei/coding/ml-agents/ml-agents/mlagents/trainers/rl_trainer.py", line 41, in __init__
    if not self.trainer_parameters["reward_signals"]:
KeyError: 'reward_signals'

This project of ours was made to run with an older branch, barracuda-test-0.2.0. The error above showed up when we tried to migrate the project to the latest version of mlagents, so essentially we moved it from barracuda-test-0.2.0 to the latest stable mlagents (0.9.3).

Could it be that the brains we created as "assets" inside the Unity project are old, from that older version?

ervteng commented 5 years ago

Hi @joobei, you need to define at least one reward signal in your trainer_config.yaml file. Something like:

reward_signals:
    extrinsic:
        strength: 1.0
        gamma: 0.99

In the example environments this works because a default config section is provided at the top of the YAML file. You could also copy/paste that default config to the top of your own yaml.
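If you want to catch this before launching a run, here is a small hedged pre-flight check; the config file name and the default/override merging are assumptions based on this thread, not a fixed ml-agents API:

# Pre-flight check for the KeyError: 'reward_signals' above. The config
# path and the default/override layout are assumptions from this thread.
import yaml

with open("trainer_config.yaml") as f:
    config = yaml.safe_load(f)

default = config.get("default", {})
for name, overrides in config.items():
    if name == "default" or not isinstance(overrides, dict):
        continue
    # Brain-specific sections override the default section, so a reward
    # signal can come from either place.
    merged = {**default, **overrides}
    if "reward_signals" not in merged:
        print(f"Behavior '{name}' defines no reward_signals; training would fail.")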

joobei commented 5 years ago

@ervteng that solved the problem. We were using an older config file. Many thanks for the quick support!

Numan4221 commented 4 years ago

Hi, I have more or less the same problem, but with KeyError: 'default'. It's my first time, so I'm not sure what the issue is. I'm using mlagents 0.9.2.

Traceback (most recent call last):
  File "d:\users\danin\anaconda3\envs\ml-agents\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "d:\users\danin\anaconda3\envs\ml-agents\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\Users\danin\Anaconda3\envs\ml-agents\Scripts\mlagents-learn.exe\__main__.py", line 9, in <module>
  File "d:\users\danin\anaconda3\envs\ml-agents\lib\site-packages\mlagents\trainers\learn.py", line 337, in main
    run_training(0, run_seed, options, Queue())
  File "d:\users\danin\anaconda3\envs\ml-agents\lib\site-packages\mlagents\trainers\learn.py", line 110, in run_training
    multi_gpu,
  File "d:\users\danin\anaconda3\envs\ml-agents\lib\site-packages\mlagents\trainers\trainer_util.py", line 45, in initialize_trainers
    trainer_parameters = trainer_config["default"].copy()
KeyError: 'default'

NilsMoller commented 4 years ago

Hey. I am having a similar issue, but a lot has changed in the newer versions and I could not figure it out. Here is the console output:

[screenshot]

Any help would be appreciated! PS: I am a noob, so please forgive any stupidity :) Ayden

chriselion commented 4 years ago

@Numan4221 You need to have a "default" section in your trainer config's yaml file. See https://github.com/Unity-Technologies/ml-agents/blob/master/config/trainer_config.yaml#L1 for an example
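For reference, the resolution that fails is roughly the following; this is a paraphrase reconstructed from the traceback above, not the exact code in trainer_util.py:

# Paraphrase of the config resolution implied by the traceback above:
# the "default" section is copied first, then per-brain sections override it.
# Without a top-level "default" key, the first lookup raises KeyError.
def initialize_trainer_parameters(trainer_config, brain_name):
    trainer_parameters = trainer_config["default"].copy()  # KeyError: 'default'
    if brain_name in trainer_config:
        trainer_parameters.update(trainer_config[brain_name])
    return trainer_parameters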

@AydenWasTaken Can you post the contents of the curriculum folder and file? I think it's expecting a "Chameleon Behavior.json" file there, but you're probably going to have an easier time if you remove the space from the behavior name.

NilsMoller commented 4 years ago

[screenshot] [screenshot]

@chriselion This is the second ML-Agents project I'm trying. The first had "My Behavior" (the default), and it was fine when the folder was called "penguin" and the file "PenguinLearning.json", so I don't see how the two are related. Could you clarify what exactly this behavior name is used for?

chriselion commented 4 years ago

Here's where we parse the curriculum directory: https://github.com/Unity-Technologies/ml-agents/blob/master/ml-agents/mlagents/trainers/meta_curriculum.py#L56 Basically the key for the dictionary is the filename.

The behavior name is used as the key to that dictionary here: https://github.com/Unity-Technologies/ml-agents/blob/master/ml-agents/mlagents/trainers/trainer_util.py#L107-L109

It sounds like the dictionary has key "ChameleonLearning" (no space) but it's trying to look up "Chameleon Learning" (with a space).

We definitely need to make the error handling better here, and maybe handle unknown behavior names.
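To make the mismatch concrete, here is a small self-contained sketch of the lookup described above; the filename and behavior name are the ones from this thread, and the key-building is a simplification of meta_curriculum.py:

# Simplified sketch of the curriculum lookup described above. The filename
# and behavior name come from this thread; the real logic lives in
# meta_curriculum.py and trainer_util.py.
import os

curriculum_files = ["ChameleonLearning.json"]  # contents of the curriculum folder

# The curriculum dictionary is keyed by the filename without its extension.
curricula = {os.path.splitext(f)[0] for f in curriculum_files}

behavior_name = "Chameleon Learning"  # as set in the Unity Inspector
if behavior_name not in curricula:
    # This is the failure mode above: "Chameleon Learning" (with a space)
    # never matches the filename-derived key "ChameleonLearning".
    print(f"No curriculum entry for behavior '{behavior_name}'")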

NilsMoller commented 4 years ago

@chriselion Changing the behavior name to match the filename worked wonders, thank you! Have a very good day

github-actions[bot] commented 3 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.