Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
https://unity.com/products/machine-learning-agents

Python crashes when training. #5229

Closed George056 closed 2 years ago

George056 commented 3 years ago

When training, I receive an error, but training continues; Python then crashes after a few more minutes of training. Also, the summary folder is not generated and the obtained reward is not printed. I am seeing this bug in my custom environment, which uses discrete actions and is manually being called. My training configuration follows.

behaviors:
    Node_AI:
        trainer_type: sac
        summary_freq: 50000
        time_horizon: 128
        max_steps: 5.0e6
        keep_checkpoints: 5
        checkpoint_interval: 500000
        init_path: null
        threaded: true
        hyperparameters:
            learning_rate: 3e-4
            batch_size: 100 #this is a guess avg is 32 - 512
            buffer_size: 50000
            learning_rate_schedule: constant
            buffer_init_steps: 0
            init_entcoef: 0.5
            save_replay_buffer: true
            tau: 0.005
            steps_per_update: 1
        network_settings:
            hidden_units: 256
            num_layers: 2 #typical is 1 - 3
            normalize: false
            vis_encoder_type: match3
        reward_signals:
            extrinsic:
                gamma: 0.99
                strength: 1.0
            curiosity:
                strength: 0.05
                gamma: 0.99
        self_play:
            save_steps: 20000
            team_change: 80000
            swap_steps: 5000
            play_against_latest_model_ratio: 0.5
            window: 10

c:\users\capstone\.conda\envs\ml-agents-node\lib\site-packages\mlagents\trainers\torch\utils.py:242: UserWarning: This overload of nonzero is deprecated:
        nonzero()
Consider using one of the following signatures instead:
        nonzero(*, bool as_tuple) (Triggered internally at ..\torch\csrc\utils\python_arg_parser.cpp:882.)
  res += [data[(partitions == i).nonzero().squeeze(1)]]

andrewcoh commented 3 years ago

Hi @George056

This is not a bug, just a warning that we are using a torch function signature of nonzero that is deprecated.

However, crashing after a few minutes sounds like an issue. Can you share the error logs?
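
If the deprecation warning clutters the console while you look for the real problem, one option is to filter it with Python's standard warnings module. This is only a sketch: it assumes you start training from a Python entry point you control (not shown here), the helper names are hypothetical, and only the message prefix is taken from the log above. It does nothing about the freeze itself.

```python
import warnings

# Message prefix copied from the UserWarning in the log above.
# filterwarnings treats it as a regex matched at the start of the message.
NONZERO_DEPRECATION = r"This overload of nonzero is deprecated"


def silence_nonzero_deprecation() -> None:
    """Ignore only the nonzero deprecation UserWarning; other warnings stay visible."""
    warnings.filterwarnings(
        "ignore",
        message=NONZERO_DEPRECATION,
        category=UserWarning,
    )


def count_visible_warnings(messages) -> int:
    """Hypothetical helper: emit each message as a UserWarning and count
    how many still get through once the filter is installed."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")   # show everything by default
        silence_nonzero_deprecation()     # then hide the one known warning
        for text in messages:
            warnings.warn(text, UserWarning)
    return len(caught)
```

Calling silence_nonzero_deprecation() once, before training starts, should be enough; any other UserWarning is still reported.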

George056 commented 3 years ago

I have not seen any error logs; when it crashes, Python simply freezes.

andrewcoh commented 3 years ago

Does it crash on the Unity side and then the python side times out (UnityTimeoutException) or does everything freeze? I'm assuming the editor console logs are empty too?

I'm not sure where the bug could be originating. It might help to remove some features (curiosity and self-play) to determine if it's something in the trainers or just C#.
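
One way to do that kind of elimination without hand-editing the file each time is to strip the optional sections from the configuration once it is loaded as a dictionary. The sketch below is an assumption about workflow, not part of ML-Agents: strip_features is a hypothetical helper, and loading or saving the YAML (for example with PyYAML) is left out. The key names are the ones from the configuration posted above.

```python
import copy


def strip_features(config, behavior,
                   reward_signals=("curiosity",),
                   sections=("self_play",)):
    """Return a copy of config with the named reward signals and
    top-level behavior sections removed; the input is left untouched."""
    trimmed = copy.deepcopy(config)
    settings = trimmed["behaviors"][behavior]
    for name in reward_signals:
        # Remove e.g. reward_signals.curiosity if it is present.
        settings.get("reward_signals", {}).pop(name, None)
    for name in sections:
        # Remove e.g. the whole self_play block if it is present.
        settings.pop(name, None)
    return trimmed


# Trimmed-down version of the configuration from the first post.
example = {
    "behaviors": {
        "Node_AI": {
            "trainer_type": "sac",
            "reward_signals": {
                "extrinsic": {"gamma": 0.99, "strength": 1.0},
                "curiosity": {"strength": 0.05, "gamma": 0.99},
            },
            "self_play": {"save_steps": 20000, "window": 10},
        }
    }
}

baseline = strip_features(example, "Node_AI")
```

Training once with the stripped configuration and once with a single feature restored narrows down whether curiosity or self-play is involved in the freeze.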

George056 commented 3 years ago

Unity temporarily stops but then goes back to playing using the heuristic.

George056 commented 3 years ago

I do know that I get a lot of warnings in Unity that come from collect observations when it is not that AI's turn to make a move.

George056 commented 3 years ago

I just removed the lines that caused all of the warnings and it still froze. What happens is that at the end of a game (and I have only seen it happen at the end of a game) Unity freezes, and then Python says that the environment has stopped. I then return to Unity and it starts to play using the heuristic, while Python stays frozen.

andrewcoh commented 3 years ago

Is there anything in the player logs located in results/<run-id>/run_logs/Player-x.log?

This is only happening on Reset? Do you know if the call to OnEpisodeBegin for the next game is occurring?

Also, can you clarify what you mean by "manually being called" in the initial post. Are you using RequestDecision?
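
To check quickly what a run actually wrote under run_logs, a small script along these lines may help. It is a sketch only: the function names are hypothetical, and the directory layout (results/<run-id>/run_logs with files named Player-x.log) is taken from the path mentioned above.

```python
from pathlib import Path


def list_run_logs(results_dir, run_id):
    """Return every file under <results_dir>/<run_id>/run_logs, sorted by name.
    An empty list means the directory is missing or has no files."""
    log_dir = Path(results_dir) / run_id / "run_logs"
    if not log_dir.is_dir():
        return []
    return sorted(p for p in log_dir.iterdir() if p.is_file())


def player_logs(results_dir, run_id):
    """Return only the files that look like Player-x.log."""
    return [
        p for p in list_run_logs(results_dir, run_id)
        if p.name.startswith("Player-") and p.suffix == ".log"
    ]
```

If player_logs returns an empty list while list_run_logs still finds other files, the run directory exists but no player log was written.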

George056 commented 3 years ago

No log files are made, just 2 JSON files (timers and training_status). I am calling RequestDecision; the game is turn-based, and I call it when it is this agent's turn.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had activity in the last 28 days. It will be closed in the next 14 days if no further activity occurs. Thank you for your contributions.

stale[bot] commented 2 years ago

This issue has been automatically closed because it has not had activity in the last 42 days. If this issue is still valid, please ping a maintainer. Thank you for your contributions.

github-actions[bot] commented 2 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.