VanIseghemThomas / AI-Parking-Unity

An RL project focused on autonomous parking, using Unity's ML-Agents toolkit.
47 stars · 8 forks

Demo paths are causing this repo to be unusable #1

Closed nubonics closed 2 years ago

nubonics commented 2 years ago

I've attempted to clone this repo because I'm interested in ML-Agents and haven't found an up-to-date tutorial, but it seems I can't start the training because the demo file is missing. I've tried removing it from the YAML file, but then it complains that a demo is required. I've also tried just removing the path, but then it complains that the path doesn't exist.

(venv) PS C:\Users\nubonix\UnityProjects\MLAgentsLearning\AI-Parking-Unity\Parking Environment> mlagents-learn .\trainer_settings.yaml --run-id AIParking --force

            ┐  ╖
        ╓╖╬│╡  ││╬╖╖
    ╓╖╬│││││┘  ╬│││││╬╖
 ╖╬│││││╬╜        ╙╬│││││╖╖                               ╗╗╗
 ╬╬╬╬╖││╦╖        ╖╬││╗╣╣╣╬      ╟╣╣╬    ╟╣╣╣             ╜╜╜  ╟╣╣
 ╬╬╬╬╬╬╬╬╖│╬╖╖╓╬╪│╓╣╣╣╣╣╣╣╬      ╟╣╣╬    ╟╣╣╣ ╒╣╣╖╗╣╣╣╗   ╣╣╣ ╣╣╣╣╣╣ ╟╣╣╖   ╣╣╣
 ╬╬╬╬┐  ╙╬╬╬╬│╓╣╣╣╝╜  ╫╣╣╣╬      ╟╣╣╬    ╟╣╣╣ ╟╣╣╣╙ ╙╣╣╣  ╣╣╣ ╙╟╣╣╜╙  ╫╣╣  ╟╣╣
 ╬╬╬╬┐     ╙╬╬╣╣      ╫╣╣╣╬      ╟╣╣╬    ╟╣╣╣ ╟╣╣╬   ╣╣╣  ╣╣╣  ╟╣╣     ╣╣╣┌╣╣╜
 ╬╬╬╜       ╬╬╣╣      ╙╝╣╣╬      ╙╣╣╣╗╖╓╗╣╣╣╜ ╟╣╣╬   ╣╣╣  ╣╣╣  ╟╣╣╦╓    ╣╣╣╣╣
 ╙   ╓╦╖    ╬╬╣╣   ╓╗╗╖            ╙╝╣╣╣╣╝╜   ╘╝╝╜   ╝╝╝  ╝╝╝   ╙╣╣╣    ╟╣╣╣
   ╩╬╬╬╬╬╬╦╦╬╬╣╣╗╣╣╣╣╣╣╣╝                                             ╫╣╣╣╣
      ╙╬╬╬╬╬╬╬╣╣╣╣╣╣╝╜
          ╙╬╬╬╣╣╣╜
             ╙

 Version information:
  ml-agents: 0.29.0,
  ml-agents-envs: 0.29.0,
  Communicator API: 1.5.0,
  PyTorch: 1.8.0+cpu
[INFO] Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
[INFO] Connected to Unity environment with package version 2.0.1 and communication version 1.5.0
[INFO] Connected new brain: CarBehaviour?team=0
[WARNING] Deleting TensorBoard data events.out.tfevents.1659180778.desktopzero.33092.0 that was left over from a previous run.
[INFO] Hyperparameters for behavior name CarBehaviour:
        trainer_type:   ppo
        hyperparameters:
          batch_size:   1024
          buffer_size:  5120
          learning_rate:        0.00035
          beta: 0.0025
          epsilon:      0.3
          lambd:        0.95
          num_epoch:    5
          learning_rate_schedule:       linear
          beta_schedule:        linear
          epsilon_schedule:     linear
        network_settings:
          normalize:    True
          hidden_units: 264
          num_layers:   3
          vis_encode_type:      simple
          memory:       None
          goal_conditioning_type:       hyper
          deterministic:        False
        reward_signals:
          extrinsic:
            gamma:      0.95
            strength:   0.99
            network_settings:
              normalize:        False
              hidden_units:     128
              num_layers:       2
              vis_encode_type:  simple
              memory:   None
              goal_conditioning_type:   hyper
              deterministic:    False
          gail:
            gamma:      0.99
            strength:   0.3
            network_settings:
              normalize:        False
              hidden_units:     128
              num_layers:       2
              vis_encode_type:  simple
              memory:   None
              goal_conditioning_type:   hyper
              deterministic:    False
            learning_rate:      0.0003
            encoding_size:      None
            use_actions:        False
            use_vail:   False
            demo_path:  Demos/90dgRS3s.demo
        init_path:      None
        keep_checkpoints:       15
        checkpoint_interval:    1000000
        max_steps:      50000000
        time_horizon:   264
        summary_freq:   100000
        threaded:       True
        self_play:      None
        behavioral_cloning:
          demo_path:    Demos/90dgRS3s.demo
          steps:        750000
          strength:     0.4
          samples_per_update:   0
          num_epoch:    None
          batch_size:   None
Traceback (most recent call last):
  File "C:\Users\nubonix\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\nubonix\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\nubonix\UnityProjects\MLAgentsLearning\JustKittenAround\venv\Scripts\mlagents-learn.exe\__main__.py", line 7, in <module>
  File "c:\users\nubonix\unityprojects\a.i.-jumping-cars-ml-agents-example\venv\lib\site-packages\mlagents\trainers\learn.py", line 260, in main
    run_cli(parse_command_line())
  File "c:\users\nubonix\unityprojects\a.i.-jumping-cars-ml-agents-example\venv\lib\site-packages\mlagents\trainers\learn.py", line 256, in run_cli
    run_training(run_seed, options, num_areas)
  File "c:\users\nubonix\unityprojects\a.i.-jumping-cars-ml-agents-example\venv\lib\site-packages\mlagents\trainers\learn.py", line 132, in run_training
    tc.start_learning(env_manager)
  File "c:\users\nubonix\unityprojects\a.i.-jumping-cars-ml-agents-example\venv\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "c:\users\nubonix\unityprojects\a.i.-jumping-cars-ml-agents-example\venv\lib\site-packages\mlagents\trainers\trainer_controller.py", line 173, in start_learning
    self._reset_env(env_manager)
  File "c:\users\nubonix\unityprojects\a.i.-jumping-cars-ml-agents-example\venv\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "c:\users\nubonix\unityprojects\a.i.-jumping-cars-ml-agents-example\venv\lib\site-packages\mlagents\trainers\trainer_controller.py", line 107, in _reset_env
    self._register_new_behaviors(env_manager, env_manager.first_step_infos)
  File "c:\users\nubonix\unityprojects\a.i.-jumping-cars-ml-agents-example\venv\lib\site-packages\mlagents\trainers\trainer_controller.py", line 268, in _register_new_behaviors
    self._create_trainers_and_managers(env_manager, new_behavior_ids)
  File "c:\users\nubonix\unityprojects\a.i.-jumping-cars-ml-agents-example\venv\lib\site-packages\mlagents\trainers\trainer_controller.py", line 166, in _create_trainers_and_managers
    self._create_trainer_and_manager(env_manager, behavior_id)
  File "c:\users\nubonix\unityprojects\a.i.-jumping-cars-ml-agents-example\venv\lib\site-packages\mlagents\trainers\trainer_controller.py", line 142, in _create_trainer_and_manager
    trainer.add_policy(parsed_behavior_id, policy)
  File "c:\users\nubonix\unityprojects\a.i.-jumping-cars-ml-agents-example\venv\lib\site-packages\mlagents\trainers\ppo\trainer.py", line 259, in add_policy
    self.optimizer = self.create_ppo_optimizer()
  File "c:\users\nubonix\unityprojects\a.i.-jumping-cars-ml-agents-example\venv\lib\site-packages\mlagents\trainers\ppo\trainer.py", line 237, in create_ppo_optimizer
    return TorchPPOOptimizer(  # type: ignore
  File "c:\users\nubonix\unityprojects\a.i.-jumping-cars-ml-agents-example\venv\lib\site-packages\mlagents\trainers\ppo\optimizer_torch.py", line 28, in __init__
    super().__init__(policy, trainer_settings)
  File "c:\users\nubonix\unityprojects\a.i.-jumping-cars-ml-agents-example\venv\lib\site-packages\mlagents\trainers\optimizer\torch_optimizer.py", line 33, in __init__
    self.create_reward_signals(trainer_settings.reward_signals)
  File "c:\users\nubonix\unityprojects\a.i.-jumping-cars-ml-agents-example\venv\lib\site-packages\mlagents\trainers\optimizer\torch_optimizer.py", line 60, in create_reward_signals
    self.reward_signals[reward_signal.value] = create_reward_provider(
  File "c:\users\nubonix\unityprojects\a.i.-jumping-cars-ml-agents-example\venv\lib\site-packages\mlagents\trainers\torch\components\reward_providers\reward_provider_factory.py", line 46, in create_reward_provider
    class_inst = rcls(specs, settings)
  File "c:\users\nubonix\unityprojects\a.i.-jumping-cars-ml-agents-example\venv\lib\site-packages\mlagents\trainers\torch\components\reward_providers\gail_reward_provider.py", line 29, in __init__
    _, self._demo_buffer = demo_to_buffer(
  File "c:\users\nubonix\unityprojects\a.i.-jumping-cars-ml-agents-example\venv\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "c:\users\nubonix\unityprojects\a.i.-jumping-cars-ml-agents-example\venv\lib\site-packages\mlagents\trainers\demo_loader.py", line 114, in demo_to_buffer
    behavior_spec, info_action_pair, _ = load_demonstration(file_path)
  File "c:\users\nubonix\unityprojects\a.i.-jumping-cars-ml-agents-example\venv\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "c:\users\nubonix\unityprojects\a.i.-jumping-cars-ml-agents-example\venv\lib\site-packages\mlagents\trainers\demo_loader.py", line 184, in load_demonstration
    file_paths = get_demo_files(file_path)
  File "c:\users\nubonix\unityprojects\a.i.-jumping-cars-ml-agents-example\venv\lib\site-packages\mlagents\trainers\demo_loader.py", line 168, in get_demo_files
    raise FileNotFoundError(
FileNotFoundError: The demonstration file or directory Demos/90dgRS3s.demo does not exist.
(venv) PS C:\Users\nubonix\UnityProjects\MLAgentsLearning\AI-Parking-Unity\Parking Environment>
VanIseghemThomas commented 2 years ago

Hi nubonics,

The reason it isn't starting is indeed that the demo file is missing. You have a couple of options here. The first is to completely remove the imitation learning methods (GAIL and behavioral cloning); just deleting the demo path won't work because they rely on it. The second is to record a demo file from scratch, which is not too hard and can be done inside the Unity player. You can reference my video walkthrough for this: https://youtu.be/_Bzw2B-9QkM?t=1453
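For the first option, the blocks to delete from trainer_settings.yaml are the `gail` reward signal and the `behavioral_cloning` section (the key names come from the hyperparameter dump in the log above; the surrounding structure here is a trimmed sketch, not the repo's full config):

```yaml
behaviors:
  CarBehaviour:
    trainer_type: ppo
    reward_signals:
      extrinsic:              # keep the extrinsic reward signal
        gamma: 0.95
        strength: 0.99
      # gail:                 # remove: requires demo_path
      #   demo_path: Demos/90dgRS3s.demo
    # behavioral_cloning:     # remove: also requires demo_path
    #   demo_path: Demos/90dgRS3s.demo
```

With both blocks gone, the trainer falls back to plain PPO with the extrinsic reward and never tries to load a .demo file.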

I will also update the repo to have a demo file available to use.

From your logs I can also see that you are using the CPU-only build of PyTorch. This is not recommended and will result in much slower training. If you have a GPU in your system, try to get a CUDA-enabled build working.
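As a quick sanity check (generic PyTorch commands, not specific to this repo), you can inspect the installed build from the same venv; a version string ending in `+cpu`, like the `1.8.0+cpu` in the log above, means no CUDA support was compiled in:

```
# Print the PyTorch version and whether a CUDA device is visible
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```

If this prints `False`, reinstall PyTorch using the CUDA-enabled wheel selector from pytorch.org for your CUDA version.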

Hope this helps!

VanIseghemThomas commented 2 years ago

Demo file is now provided! https://github.com/VanIseghemThomas/AI-Parking-Unity/commit/e7ddd03c4cbbad6cea9b43066d8e5ba2b90b1024