Training issues (can't train anymore)

ThalianMF commented 2 years ago

Hello everyone !

I am fairly new to the ML-Agents toolkit, and to Python as well. Yesterday I trained my agent successfully without any issues, but today, when I tried to train it again, it did not work.

I have tried reinstalling PyTorch and ml-agents, without success. I also uninstalled Python and reinstalled different versions, going from 3.7.9 to 3.8.12 to 3.10.4 (I'm aware that Python 3.6 or 3.7 is recommended). I installed everything in a virtual environment and followed the same steps as in the installation guide. :)

I also tried this on another computer that I haven't used in months, and I'm facing the same issue there. On both computers I used these two versions: Python 3.7.9 and Torch 1.7.1+cu110.

(venv) C:\Unity\Projet Applicatif\MLAgents Staircase\venv\Scripts>mlagents-learn C:\Users\thali\Unity\ML_PA_V2\Collector.yml --run-id Collect_00
Traceback (most recent call last):
  File "C:\Users\thali\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\thali\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Unity\Projet Applicatif\MLAgents Staircase\venv\Scripts\mlagents-learn.exe\__main__.py", line 4, in <module>
  File "C:\Unity\Projet Applicatif\MLAgents Staircase\venv\lib\site-packages\mlagents\trainers\learn.py", line 2, in <module>
    from mlagents import torch_utils
  File "C:\Unity\Projet Applicatif\MLAgents Staircase\venv\lib\site-packages\mlagents\torch_utils\__init__.py", line 1, in <module>
    from mlagents.torch_utils.torch import torch as torch  # noqa
  File "C:\Unity\Projet Applicatif\MLAgents Staircase\venv\lib\site-packages\mlagents\torch_utils\torch.py", line 6, in <module>
    from mlagents.trainers.settings import TorchSettings
  File "C:\Unity\Projet Applicatif\MLAgents Staircase\venv\lib\site-packages\mlagents\trainers\settings.py", line 25, in <module>
    from mlagents.trainers.cli_utils import StoreConfigFile, DetectDefault, parser
  File "C:\Unity\Projet Applicatif\MLAgents Staircase\venv\lib\site-packages\mlagents\trainers\cli_utils.py", line 5, in <module>
    from mlagents_envs.environment import UnityEnvironment
  File "C:\Unity\Projet Applicatif\MLAgents Staircase\venv\lib\site-packages\mlagents_envs\environment.py", line 12, in <module>
    from mlagents_envs.side_channel.side_channel import SideChannel
  File "C:\Unity\Projet Applicatif\MLAgents Staircase\venv\lib\site-packages\mlagents_envs\side_channel\__init__.py", line 5, in <module>
    from mlagents_envs.side_channel.default_training_analytics_side_channel import (  # noqa
  File "C:\Unity\Projet Applicatif\MLAgents Staircase\venv\lib\site-packages\mlagents_envs\side_channel\default_training_analytics_side_channel.py", line 7, in <module>
    from mlagents_envs.communicator_objects.training_analytics_pb2 import (
  File "C:\Unity\Projet Applicatif\MLAgents Staircase\venv\lib\site-packages\mlagents_envs\communicator_objects\training_analytics_pb2.py", line 41, in <module>
    options=None, file=DESCRIPTOR),
  File "C:\Unity\Projet Applicatif\MLAgents Staircase\venv\lib\site-packages\google\protobuf\descriptor.py", line 560, in __new__
    _message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates

Does anyone have an idea of what issue I am facing? Thank you in advance for your help!
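The second workaround listed in the error message above (setting PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python) could be applied from a small launcher script, roughly as sketched here. The environment variable name comes from the error message itself; the helper names `with_pure_python_protobuf` and `run_training` are hypothetical and are not part of ML-Agents or protobuf. As the message warns, this makes protobuf use pure-Python parsing, which is much slower.

```python
import os
import subprocess
from typing import Dict, List, Mapping

# Variable name taken verbatim from the protobuf error message.
ENV_VAR = "PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"


def with_pure_python_protobuf(base_env: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of base_env with the pure-Python protobuf backend selected."""
    env = dict(base_env)  # copy so the caller's mapping is left untouched
    env[ENV_VAR] = "python"
    return env


def run_training(command: List[str]) -> int:
    """Run a command (hypothetical helper) with the workaround applied.

    Returns the exit code of the child process.
    """
    completed = subprocess.run(command, env=with_pure_python_protobuf(os.environ))
    return completed.returncode


# Example call, assuming mlagents-learn is on PATH in the active venv:
# run_training(["mlagents-learn", "Collector.yml", "--run-id", "Collect_00"])
```

Passing the variable only to the child process keeps the slower parser from affecting other tools in the same shell session.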

Zibelas commented 2 years ago

I had the same problem today; pip3 install --upgrade protobuf==3.20.0 solved that particular error for me.
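To see whether a given environment is affected before reinstalling anything, a quick version check along these lines may help. It is only a sketch: the "3.20.x or lower" threshold comes from the error message above, `needs_downgrade` is a hypothetical helper name, and the parsing assumes a plain dotted version string.

```python
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a dotted version string such as '3.20.0'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def needs_downgrade(version: str) -> bool:
    """True if the version is newer than 3.20.x, the limit named in the error."""
    return parse_major_minor(version) > (3, 20)


if __name__ == "__main__":
    try:
        import google.protobuf
    except ImportError:
        print("protobuf is not installed in this environment")
    else:
        installed = google.protobuf.__version__
        if needs_downgrade(installed):
            print(f"protobuf {installed} is newer than 3.20.x; "
                  "try: pip3 install --upgrade protobuf==3.20.0")
        else:
            print(f"protobuf {installed} is 3.20.x or lower")
```

Run it with the same interpreter that the virtual environment uses, since a different Python installation may have a different protobuf version.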

ThalianMF commented 2 years ago

It seems to work now. I'm facing some other issues after the end of the training but it's not related to that. Thanks again ;)

AndresOrdonez369 commented 2 years ago

I now get this error: TypeError: Invalid first argument to register(). typing.Dict[mlagents.trainers.settings.RewardSignalType, mlagents.trainers.settings.RewardSignalSettings] is not a class. Can anyone help me?

github-actions[bot] commented 2 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

Unity-Technologies / ml-agents

Training issues (can't train anymore) #5753