PathmindAI / nativerl

Train reinforcement learning agents using AnyLogic or Python-based simulations
Apache License 2.0

Training fails with multi mouse and cheese - `Received a label value of 4 which is outside the valid range of ...` #494

Closed. slinlee closed this issue 3 years ago

slinlee commented 3 years ago

  File "/app/conda/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  [Previous line repeated 1 more time]
  File "/app/conda/lib/python3.7/site-packages/ray/util/iter.py", line 471, in base_iterator
    yield ray.get(futures, timeout=timeout)
  File "/app/conda/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 47, in wrapper
    return func(*args, **kwargs)
ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task.
2021-11-02 18:02:52,209 INFO trial_runner.py:1009 -- Trial PPO_MultiMouseAndCheese_31251_00003: Attempting to restore trial state from last checkpoint.
2021-11-02 18:02:52,222 WARNING worker.py:1115 -- Traceback (most recent call last):
  File "/app/conda/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/app/conda/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/app/conda/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Received a label value of 4 which is outside the valid range of [0, 4). Label values: 4 4 4
  [[{{node default_policy/SparseSoftmaxCrossEntropyWithLogits_1/SparseSoftmaxCrossEntropyWithLogits}}]]
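For readers unfamiliar with this error: a sparse softmax cross-entropy op uses an integer label as an index into the logits, so with four logits the only valid labels are 0 through 3. The sketch below is a plain-Python illustration of that range check, not TensorFlow's or RLlib's implementation. The function name `sparse_softmax_cross_entropy` and the example logits are hypothetical, and the thread does not say why a label of 4 was produced in the first place.

```python
import math


def sparse_softmax_cross_entropy(logits, label):
    """Cross-entropy for one example with an integer class label.

    Hypothetical helper for illustration only; it mimics the range check
    that produces the error message quoted in this issue.
    """
    num_classes = len(logits)
    # The label indexes into the logits, so it must lie in [0, num_classes).
    if not 0 <= label < num_classes:
        raise ValueError(
            f"Received a label value of {label} which is outside "
            f"the valid range of [0, {num_classes})."
        )
    # Numerically stable log-sum-exp over the logits.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    # Negative log-probability of the labelled class.
    return log_sum_exp - logits[label]


# Four logits (made-up values): valid labels are 0, 1, 2 and 3.
logits = [0.1, 0.2, 0.3, 0.4]
loss_ok = sparse_softmax_cross_entropy(logits, 3)

# A label of 4 is one past the end and is rejected.
try:
    sparse_softmax_cross_entropy(logits, 4)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

If the same pattern applies here, the policy output had four entries while a sampled or recorded action had the value 4, which would suggest a mismatch between the action space the model was built with and the actions it was trained on. That reading is an assumption; the thread only records that a new conda resolved it.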

slinlee commented 3 years ago

fixed with new conda