An open source package that gives video game creators, AI researchers, and hobbyists the opportunity to learn complex behaviors for their non-player characters or agents
Hello,
I am having another issue: the rewards are always displayed as nan in the console, like this:
== Status ==
Current time: 2022-06-21 15:40:17 (running for 00:04:32.32)
Memory usage on this node: 14.3/31.3 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/16 CPUs, 1.0/1 GPUs, 0.0/13.01 GiB heap, 0.0/6.5 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/ls11det/ray_results/PPO/editor
Number of trials: 1/1 (1 RUNNING)
+-----------------------+----------+-----------------------+--------+------------------+------+----------+----------------------+----------------------+--------------------+
| Trial name | status | loc | iter | total time (s) | ts | reward | episode_reward_max | episode_reward_min | episode_len_mean |
|-----------------------+----------+-----------------------+--------+------------------+------+----------+----------------------+----------------------+--------------------|
| PPO_godot_0dbb4_00000 | RUNNING | 129.217.38.190:865027 | 3 | 208.046 | 3072 | nan | nan | nan | nan |
+-----------------------+----------+-----------------------+--------+------------------+------+----------+----------------------+----------------------+--------------------+
I even tried just returning a constant as the reward, to see if my own code was causing the issue, but it is still displayed as nan:
func get_reward():
    # What behavior do you want to reward? Kills, penalties for death, key waypoints...
    return 0.5
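For reference, my understanding is that metrics like episode_reward_mean are averages over completed episodes, and the mean of an empty collection is nan. This is only my guess at where the nan could come from; the sketch below is illustrative Python, not Ray's actual internals:

```python
import math
import numpy as np

def summarize_rewards(completed_episode_rewards):
    """Hypothetical sketch: average rewards over *completed* episodes only.

    If no episode has finished yet, the list is empty and the mean is nan,
    which a status table would then print as `nan`.
    """
    if len(completed_episode_rewards) == 0:
        # np.mean([]) also yields nan (with a RuntimeWarning)
        return float("nan")
    return float(np.mean(completed_episode_rewards))

print(summarize_rewards([]))          # nan
print(summarize_rewards([1.0, 2.0]))  # 1.5
```

So if my episodes never terminate, a per-step reward of 0.5 could still show up as nan in the summary.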
I also added a print statement in the sync.gd script, where it collects and sends the reward, and it picks up the 0.5 correctly.
Is there anything I am missing here?