Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
https://unity.com/products/machine-learning-agents

my training stop #2089

Closed rt2017bk closed 5 years ago

rt2017bk commented 5 years ago

My training stops and I get the following error. Can someone help me?

```
INFO:unityagents: Ball3DBrain: Step: 4000. Mean Reward: 5.405. Std of Reward: 7.848.
INFO:unityagents: Ball3DBrain: Step: 5000. Mean Reward: 17.955. Std of Reward: 17.415.
Traceback (most recent call last):
  File "python/learn.py", line 62, in <module>
    tc.start_learning()
  File "C:\Users\pc\Desktop\ml-agents-0.3.1a\python\unitytrainers\trainer_controller.py", line 259, in start_learning
    trainer.update_model()
  File "C:\Users\pc\Desktop\ml-agents-0.3.1a\python\unitytrainers\ppo\trainer.py", line 360, in update_model
    self.model.returns_holder: np.array(_buffer['discounted_returns'][start:end]).reshape(
ValueError: could not broadcast input array from shape (1001) into shape (1)
```
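
A ValueError of this kind generally means that the buffer slice handed to np.array contains elements with inconsistent shapes, so NumPy cannot stack them into the rectangular array that reshape (and the TensorFlow feed dict) expects; here, one entry apparently holds 1001 values where a single value is expected. A quick way to see which field is out of shape is to print the element shapes before update_model runs. The snippet below is a hypothetical debugging sketch, not part of ML-Agents: `update_buffer` is assumed to behave like a dict of lists, which is roughly how the unitytrainers training buffer is organised, and the field names other than `discounted_returns` are guesses.

```python
import numpy as np

def report_field_shapes(update_buffer, fields):
    """Print how many entries each buffer field holds and which element shapes occur.

    Hypothetical helper: `update_buffer` is assumed to behave like a dict mapping
    field names to lists of per-step values. A healthy field shows a single
    element shape; a mix of shapes is what breaks np.array(...).reshape(...).
    """
    for name in fields:
        values = update_buffer.get(name, [])
        shapes = {np.asarray(value).shape for value in values}
        print(f"{name}: {len(values)} entries, element shapes: {sorted(shapes)}")

# Toy buffer with a deliberately inconsistent field: one entry holds a whole
# episode of 1001 values instead of a single return, mimicking the mismatch
# reported in the traceback above.
toy_buffer = {
    "discounted_returns": [0.5, 0.3, np.zeros(1001)],
    "advantages": [0.1, 0.2, 0.3],
}
report_field_shapes(toy_buffer, fields=("discounted_returns", "advantages"))
```

With the toy buffer above, the output shows two different element shapes for discounted_returns, which is exactly the condition that makes the np.array call in trainer.py fail.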

shihzy commented 5 years ago

Hi @rt2017bk - can you provide a bit more detail on how you are training? Can you also provide the Python mlagents-learn command you are using?

shihzy commented 5 years ago

Thanks for reaching out to us. We are closing this due to inactivity, but if you need additional assistance, feel free to reopen the issue.

Lucci93 commented 5 years ago

Hi, I am reopening the issue because I have the same problem. Could you help me? I used the ML-Agents library for a project in October 2018 with Unity 2018.2.2, Python 3.6.6, and version 0.4.0b of the library with my custom game. Today I tried to train the AI of the project with the same configuration as in October 2018, but I got the following error:

```
escapist 16 ~/Desktop/TOGEXP/run $ python3 ../python/learn.py --run-id=testMC --train
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:521: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])

[Unity logo]

INFO:unityagents:{'--curriculum': 'None', '--docker-target-name': 'Empty', '--help': False, '--keep-checkpoints': '5', '--lesson': '0', '--load': False, '--no-graphics': False, '--run-id': 'testMC', '--save-freq': '50000', '--seed': '-1', '--slow': False, '--train': True, '--worker-id': '0', '': None}
INFO:unityagents:Start training by pressing the Play button in the Unity Editor.
INFO:unityagents:
'Academy' started successfully!
Unity Academy name: Academy
        Number of Brains: 1
        Number of External Brains : 1
        Lesson number : 0
        Reset Parameters :

Unity brain name: SurvivalBrainComp
        Number of Visual Observations (per agent): 0
        Vector Observation space type: continuous
        Vector Observation space size (per agent): 108
        Number of stacked Vector Observation: 25
        Vector Action space type: continuous
        Vector Action space size (per agent): 5
        Vector Action descriptions: Forward, Back, Fire, Left, Right
/Users/dario/Desktop/TOGEXP/python/unitytrainers/trainer_controller.py:194: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  trainer_config = yaml.load(data_file)
2019-10-30 11:13:15.188220: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
INFO:unityagents:Hyperparameters for the PPO Trainer of brain SurvivalBrainComp:
        batch_size: 2048
        beta: 0.001
        buffer_size: 20480
        epsilon: 0.2
        gamma: 0.9
        hidden_units: 128
        lambd: 0.93
        learning_rate: 0.0001
        max_steps: 5.0e5
        normalize: True
        num_epoch: 5
        num_layers: 2
        time_horizon: 256
        sequence_length: 64
        summary_freq: 2000
        use_recurrent: True
        graph_scope:
        summary_path: ./summaries/testMC
        memory_size: 256
        use_curiosity: False
        curiosity_strength: 0.01
        curiosity_enc_size: 128
Traceback (most recent call last):
  File "../python/learn.py", line 71, in <module>
    tc.start_learning()
  File "/Users/dario/Desktop/TOGEXP/python/unitytrainers/trainer_controller.py", line 264, in start_learning
    trainer.update_model()
  File "/Users/dario/Desktop/TOGEXP/python/unitytrainers/ppo/trainer.py", line 453, in update_model
    self.model.returns_holder: np.array(buffer['discounted_returns'][start:end]).flatten(),
ValueError: could not broadcast input array from shape (64,1,257) into shape (64,1)
```

shihzy commented 5 years ago

Hi @Lucci93 - we are no longer supporting that version of ML-Agents. Can you try using the latest version to see if you are still getting these errors?

Lucci93 commented 5 years ago

I will try, but I think that to reproduce my experiments I need to re-run the training with version 0.4.0b of the code. Is it possible that some library is no longer backwards compatible with my project?
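
One way to check whether the Python environment still matches the October 2018 one is to compare the installed package versions against the pins in the requirements file. The sketch below is a generic check written for this thread, not an ML-Agents utility; it assumes a requirements.txt containing plain `package==version` lines, and the path passed at the bottom is a placeholder to adjust.

```python
import pkg_resources

def check_requirements(path):
    """Compare installed package versions against '==' pins in a requirements file.

    Generic sketch, not an ML-Agents utility: it only understands simple
    'package==version' lines and skips everything else (comments, ranges, etc.).
    """
    with open(path) as requirements:
        for line in requirements:
            line = line.strip()
            if not line or line.startswith("#") or "==" not in line:
                continue
            name, expected = (part.strip() for part in line.split("==", 1))
            try:
                installed = pkg_resources.get_distribution(name).version
            except pkg_resources.DistributionNotFound:
                installed = "not installed"
            status = "OK" if installed == expected else "MISMATCH"
            print(f"{name}: expected {expected}, installed {installed} -> {status}")

# Hypothetical path: point this at the requirements.txt shipped in the ml-agents python/ folder.
check_requirements("requirements.txt")
```

If the file uses `>=` constraints rather than exact pins, a fresh install today can legitimately resolve to newer library versions than the ones used in 2018, which would be consistent with the behaviour described in this thread.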

shihzy commented 5 years ago

Hi @Lucci93 - I think that without knowing exactly what was changed in your setup, it would be difficult to debug. If nothing was touched or updated, it would run as expected.

Lucci93 commented 5 years ago

Hi @unityjeffrey. In October 2018 I installed the dependencies from requirements.txt with the library versions that were current at the time. Installing the libraries listed in that file today is probably the reason why my code doesn't work; it is the only thing I have re-installed in my project. The Unity, Python, and ML-Agents versions are the same. In support of my thesis, even the examples inside the ML-Agents folder don't work with a clean installation that follows the guide written for version 0.4.0b. Do you have any idea?

```
INFO:unityagents:Hyperparameters for the PPO Trainer of brain Ball3DBrain:
        batch_size: 64
        beta: 0.001
        buffer_size: 12000
        epsilon: 0.2
        gamma: 0.995
        hidden_units: 128
        lambd: 0.99
        learning_rate: 0.0003
        max_steps: 5.0e4
        normalize: True
        num_epoch: 3
        num_layers: 2
        time_horizon: 1000
        sequence_length: 64
        summary_freq: 1000
        use_recurrent: False
        graph_scope:
        summary_path: ./summaries//Users/danielepiergigli/Desktop/test
        memory_size: 256
        use_curiosity: False
        curiosity_strength: 0.01
        curiosity_enc_size: 128
INFO:unityagents: Ball3DBrain: Step: 1000. Mean Reward: 1.255. Std of Reward: 0.736.
INFO:unityagents: Ball3DBrain: Step: 2000. Mean Reward: 1.389. Std of Reward: 0.751.
INFO:unityagents: Ball3DBrain: Step: 3000. Mean Reward: 1.677. Std of Reward: 1.003.
INFO:unityagents: Ball3DBrain: Step: 4000. Mean Reward: 2.363. Std of Reward: 1.699.
INFO:unityagents: Ball3DBrain: Step: 5000. Mean Reward: 3.512. Std of Reward: 2.837.
INFO:unityagents: Ball3DBrain: Step: 6000. Mean Reward: 5.752. Std of Reward: 5.195.
INFO:unityagents: Ball3DBrain: Step: 7000. Mean Reward: 11.211. Std of Reward: 10.930.
INFO:unityagents: Ball3DBrain: Step: 8000. Mean Reward: 18.083. Std of Reward: 19.186.
INFO:unityagents: Ball3DBrain: Step: 9000. Mean Reward: 23.474. Std of Reward: 22.393.
Traceback (most recent call last):
  File "learn.py", line 84, in <module>
    tc.start_learning()
  File "/Users/danielepiergigli/Downloads/ml-agents-0.4.0b/python/unitytrainers/trainer_controller.py", line 264, in start_learning
    trainer.update_model()
  File "/Users/danielepiergigli/Downloads/ml-agents-0.4.0b/python/unitytrainers/ppo/trainer.py", line 445, in update_model
    self.training_buffer.update_buffer.shuffle()
  File "/Users/danielepiergigli/Downloads/ml-agents-0.4.0b/python/unitytrainers/buffer.py", line 166, in shuffle
    raise BufferException("Unable to shuffle if the fields are not of same length")
unitytrainers.buffer.BufferException: Unable to shuffle if the fields are not of same length
```
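
The BufferException at the end is the buffer's own consistency check: shuffling the update buffer applies one permutation across every field, which only works if all fields hold the same number of entries. If one field has drifted out of sync (possibly because an episode's worth of values was appended as a single element, as in the shape errors earlier in the thread), the check fails. Below is a rough, plain-Python illustration of that invariant, not the actual buffer.py implementation.

```python
import random

def shuffle_fields(fields, rng=random):
    """Shuffle several parallel lists with one shared permutation.

    Rough illustration of the invariant enforced by unitytrainers' buffer:
    every field must have the same length, otherwise rows stop lining up
    after the shuffle. This is not the library's actual implementation.
    """
    lengths = {name: len(values) for name, values in fields.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Unable to shuffle if the fields are not of same length: {lengths}")
    order = list(range(next(iter(lengths.values()), 0)))
    rng.shuffle(order)
    return {name: [values[i] for i in order] for name, values in fields.items()}

# A buffer whose fields have drifted out of sync triggers the same failure mode:
try:
    shuffle_fields({"discounted_returns": [1.0, 2.0, 3.0], "advantages": [1.0, 2.0]})
except ValueError as err:
    print(err)
```

Running the snippet prints the same kind of message as the exception in the log, with the mismatched lengths included so the offending field is visible.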

github-actions[bot] commented 3 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.