TimZaman / dotaclient

distributed RL spaghetti al arabiata

Optimizer ValueError #42

Closed Nostrademous closed 5 years ago

Nostrademous commented 5 years ago

I kicked off a local run on Friday afternoon from the then-current HEAD and let it run over the weekend. On Sunday night I noticed that the optimizer had died about two hours after starting, with the error below:

2019-02-15 15:37:57,825 INFO     steps_per_s=25.76, avg_weight_age=1.0, reward_per_sec=-0.0000, loss=-0.1758, entropy=5.023, advantage=0.018
2019-02-15 15:37:57,878 INFO     iteration 27/10000
2019-02-15 15:40:09,232 INFO      epoch 1/4
2019-02-15 15:40:12,433 INFO      epoch 2/4
2019-02-15 15:40:15,710 INFO      epoch 3/4
Traceback (most recent call last):
  File "optimizer.py", line 782, in <module>
    run_local=args.run_local,
  File "optimizer.py", line 736, in main
    dota_optimizer.run()
  File "optimizer.py", line 461, in run
    loss_d, entropy_d, advantage = self.train(experiences=batch)
  File "optimizer.py", line 646, in train
    loss, policy_loss, entropy_loss, advantage_loss))
ValueError: loss=nan, policy_loss=nan, entropy_loss=nan, advantage_loss=nan

The error above is one issue. The second issue is that the agent (as well as the Dota service) kept running without a hiccup for two more days even though the optimizer had gone down; the agent just kept re-using the same weights version over and over.
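A minimal sketch of how both failure modes could be guarded against. This is not dotaclient's actual code: `validate_losses`, `StalenessGuard`, and all their parameters are hypothetical names invented for illustration. The first function raises early when any loss term goes NaN (mirroring the `ValueError` in the traceback); the second gives the agent a way to notice that the weights version it receives has stopped advancing, instead of silently re-using stale weights forever.

```python
import math

def validate_losses(loss, policy_loss, entropy_loss, advantage_loss):
    """Raise if any loss term is NaN, so the failure surfaces immediately.

    Hypothetical helper; the real optimizer.py raises a similar ValueError.
    """
    terms = {
        'loss': loss,
        'policy_loss': policy_loss,
        'entropy_loss': entropy_loss,
        'advantage_loss': advantage_loss,
    }
    bad = [name for name, value in terms.items() if math.isnan(value)]
    if bad:
        raise ValueError(', '.join(f'{name}=nan' for name in bad))
    return terms

class StalenessGuard:
    """Hypothetical agent-side watchdog: report failure once the weights
    version served by the optimizer stops advancing for too many polls."""

    def __init__(self, max_stale_polls=100):
        self.max_stale_polls = max_stale_polls
        self.last_version = None
        self.stale_polls = 0

    def observe(self, version):
        """Record the latest weights version; return False when the
        optimizer looks dead (version unchanged for max_stale_polls)."""
        if version == self.last_version:
            self.stale_polls += 1
        else:
            self.last_version = version
            self.stale_polls = 0
        return self.stale_polls < self.max_stale_polls
```

With a guard like this, the agent could shut itself down (or alert) instead of training-loop components silently outliving a dead optimizer for days.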

TimZaman commented 5 years ago

I have greatly improved stability since then.