CityBrainChallenge / KDDCup2021-CityBrainChallenge-starter-kit


Resource temporarily unavailable during training phase #48

Closed Nntraveler closed 3 years ago

Nntraveler commented 3 years ago

Our team tried to train about 8 models in parallel. The command for each one is:

python3 train.py --episodes 600 --metric_period 200 --steps 360 --thread 60

After about 30 episodes, an error occurred in some of these processes:

----------------------------------------------------35/600
start_time_epoch = 0
max_time_epoch = 3600
report_log_rate = 10
warning_stop_time_log = 100
Traceback (most recent call last):
  File "train.py", line 968, in <module>
    scores_dir,
  File "train.py", line 512, in train
    observations, rewards, dones, infos = env.step(actions_)
  File "/usr/local/lib/python3.7/dist-packages/CBEngine/envs/CBEngine.py", line 168, in step
    self.eng.next_step()
RuntimeError: Resource temporarily unavailable
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "train.py", line 981, in <module>
    result["success"] = True
AssertionError

We first hit this error with --thread 360. After switching to --thread 60, it still happens. Thanks a lot.
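
One possible cause, not confirmed anywhere in this thread: "Resource temporarily unavailable" is the standard message for EAGAIN, which is what thread creation typically fails with when the per-user process/thread limit (`ulimit -u`) is exhausted. The sketch below compares the number of threads the parallel runs would request against that limit. It assumes that `--thread N` makes each training process start roughly N native threads inside the engine; `thread_budget` is a hypothetical helper written for this illustration, not part of the starter kit or CBEngine.

```python
import errno
import os
import resource


def thread_budget(num_processes, threads_per_process, soft_limit):
    """Return (requested, fits) for a parallel training launch.

    soft_limit is the soft RLIMIT_NPROC value (what `ulimit -u` reports);
    resource.RLIM_INFINITY means the limit is not set.
    Assumption: each process starts about `threads_per_process` native threads.
    """
    requested = num_processes * threads_per_process
    if soft_limit == resource.RLIM_INFINITY:
        return requested, True
    return requested, requested <= soft_limit


if __name__ == "__main__":
    # Soft and hard caps on processes/threads for the current user.
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    print("EAGAIN message on this system:", os.strerror(errno.EAGAIN))
    # 8 parallel trainings, with the two --thread values tried above.
    for threads in (360, 60):
        requested, fits = thread_budget(8, threads, soft)
        print(f"--thread {threads}: {requested} threads requested, "
              f"soft limit {soft}, fits={fits}")
```

The count is only a lower bound, since it ignores threads the same user already runs elsewhere (Python itself, other jobs, the model framework). If the requested total is close to or above the limit, lowering `--thread`, running fewer trainings at once, or raising the limit are the things to try. In a container, a pids cgroup limit can produce the same error even when `ulimit -u` looks generous.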