Zeta36 / connect4-alpha-zero

Connect4 reinforcement learning by AlphaGo Zero methods.
MIT License

JSON Decode Error #3

Open brianprichardson opened 6 years ago

brianprichardson commented 6 years ago

Environment: Ubuntu 16.04 LTS, venv "eraseme", tensorflow-gpu, all 3 workers running.

run self --type mini hit the error below, apparently shortly after 2017-12-09 01:26:06,376. I was able to restart run self and it seems fine. Part of the log follows.

2017-12-09 01:26:05,110@connect4_zero.worker.self_play DEBUG # game 8999 time=0.44211292266845703 sec, turn=12:X XOOOX X XOX O O - Winner:Winner.black
2017-12-09 01:26:06,283@connect4_zero.worker.self_play INFO # save play data to /home/brianr/eraseme/connect4-alpha-zero/data/play_data/play_20171209-012606.282920.json
2017-12-09 01:26:06,375@connect4_zero.worker.self_play DEBUG # game 9000 time=1.2644462585449219 sec, turn=23:XXOXOXXXXXXO OOO O XOO OX XO - Winner:Winner.white
2017-12-09 01:26:06,375@connect4_zero.lib.model_helpler DEBUG # start reload the best model if changed
2017-12-09 01:26:06,376@connect4_zero.agent.model_connect4 DEBUG # loading model from /home/brianr/eraseme/connect4-alpha-zero/data/model/model_best_config.json
Traceback (most recent call last):
  File "src/connect4_zero/run.py", line 17, in <module>
    manager.start()
  File "src/connect4_zero/manager.py", line 42, in start
    return self_play.start(config)
  File "src/connect4_zero/worker/self_play.py", line 19, in start
    return SelfPlayWorker(config, env=Connect4Env()).start()
  File "src/connect4_zero/worker/self_play.py", line 51, in start
    reload_best_model_weight_if_changed(self.model)
  File "src/connect4_zero/lib/model_helpler.py", line 33, in reload_best_model_weight_if_changed
    return load_best_model_weight(model)
  File "src/connect4_zero/lib/model_helpler.py", line 12, in load_best_model_weight
    return model.load(model.config.resource.model_best_config_path, model.config.resource.model_best_weight_path)
  File "src/connect4_zero/agent/model_connect4.py", line 86, in load
    self.model = Model.from_config(json.load(f))
  File "/usr/lib/python3.6/json/__init__.py", line 299, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.6/json/decoder.py", line 355, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting ':' delimiter: line 1 column 8194 (char 8193)

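The traceback above ends in json.load failing on model_best_config.json, and restarting was enough to recover, so one plausible reading (not confirmed by the project) is that the file was read while another worker was still rewriting it. As a minimal sketch of a reader-side workaround under that assumption, the load could simply be retried a few times. The helper name load_json_with_retry and its parameters are hypothetical and are not part of connect4-alpha-zero.

```python
import json
import time


def load_json_with_retry(path, attempts=5, delay=0.5):
    """Hypothetical helper: re-read a JSON file that may be mid-write.

    Retries only on JSONDecodeError, which is what a truncated file
    produces; other errors (missing file, permissions) propagate at once.
    """
    last_error = None
    for _ in range(attempts):
        try:
            with open(path, "rt") as f:
                return json.load(f)
        except json.JSONDecodeError as e:
            # Assumption: the writer has not finished yet, so wait and retry.
            last_error = e
            time.sleep(delay)
    # Every attempt failed: surface the last decode error to the caller.
    raise last_error
```

In the traceback, the call that would be wrapped this way is the json.load(f) in model_connect4.py, line 86. A retry only hides the symptom; it does not stop a reader from seeing a half-written file in the first place.
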
2017-12-09 01:26:05,050@connect4_zero.worker.evaluate DEBUG # game 6: ng_win=1 white_is_best_model=True winning rate 85.7%
2017-12-09 01:26:05,050@connect4_zero.worker.evaluate DEBUG # win count reach 6 so change best model
2017-12-09 01:26:05,051@connect4_zero.worker.evaluate DEBUG # winning rate 85.7%
2017-12-09 01:26:05,051@connect4_zero.worker.evaluate DEBUG # New Model become best model: /home/brianr/eraseme/connect4-alpha-zero/data/model/next_generation/model_20171209-012509.335592
2017-12-09 01:26:05,051@connect4_zero.agent.model_connect4 DEBUG # save model to /home/brianr/eraseme/connect4-alpha-zero/data/model/model_best_config.json
2017-12-09 01:26:05,110@connect4_zero.worker.self_play DEBUG # game 8999 time=0.44211292266845703 sec, turn=12:X XOOOX X XOX O O - Winner:Winner.black
2017-12-09 01:26:05,594@connect4_zero.worker.optimize DEBUG # total step=60939, set learning rate to 0.01
2017-12-09 01:26:06,283@connect4_zero.worker.self_play INFO # save play data to /home/brianr/eraseme/connect4-alpha-zero/data/play_data/play_20171209-012606.282920.json
2017-12-09 01:26:06,375@connect4_zero.worker.self_play DEBUG # game 9000 time=1.2644462585449219 sec, turn=23:XXOXOXXXXXXO OOO O XOO OX XO - Winner:Winner.white
2017-12-09 01:26:06,375@connect4_zero.lib.model_helpler DEBUG # start reload the best model if changed
2017-12-09 01:26:06,376@connect4_zero.agent.model_connect4 DEBUG # loading model from /home/brianr/eraseme/connect4-alpha-zero/data/model/model_best_config.json
2017-12-09 01:26:06,604@connect4_zero.worker.optimize DEBUG # total step=60948, set learning rate to 0.01
2017-12-09 01:26:07,649@connect4_zero.worker.optimize DEBUG # total step=60957, set learning rate to 0.01
2017-12-09 01:26:08,764@connect4_zero.worker.optimize DEBUG # total step=60966, set learning rate to 0.01
2017-12-09 01:26:09,886@connect4_zero.worker.optimize DEBUG # total step=60975, set learning rate to 0.01
2017-12-09 01:26:10,180@connect4_zero.agent.model_connect4 DEBUG # saved model digest 79319e815c4bfee184feae824afe333ed4eaa8533ec1dfd4809c2714d80b0cbb
2017-12-09 01:26:10,180@connect4_zero.worker.evaluate INFO # There is no next generation model to evaluate
2017-12-09 01:26:10,901@connect4_zero.worker.optimize DEBUG # total step=60984, set learning rate to 0.01
2017-12-09 01:26:11,919@connect4_zero.worker.optimize DEBUG # total step=60993, set learning rate to 0.01
2017-12-09 01:26:12,986@connect4_zero.worker.optimize DEBUG # total step=61002, set learning rate to 0.01
2017-12-09 01:26:13,987@connect4_zero.worker.optimize DEBUG # total step=61011, set learning rate to 0.01
2017-12-09 01:26:15,048@connect4_zero.worker.optimize DEBUG # total step=61020, set learning rate to 0.01
2017-12-09 01:26:16,055@connect4_zero.worker.optimize DEBUG # total step=61029, set learning rate to 0.01
2017-12-09 01:26:17,099@connect4_zero.worker.optimize DEBUG # total step=61038, set learning rate to 0.01
2017-12-09 01:26:18,099@connect4_zero.worker.optimize DEBUG # total step=61047, set learning rate to 0.01
2017-12-09 01:26:19,169@connect4_zero.worker.optimize DEBUG # total step=61056, set learning rate to 0.01
2017-12-09 01:26:20,287@connect4_zero.worker.optimize DEBUG # total step=61065, set learning rate to 0.01
2017-12-09 01:26:21,311@connect4_zero.worker.optimize DEBUG # total step=61074, set learning rate to 0.01
2017-12-09 01:26:22,290@connect4_zero.worker.optimize DEBUG # total step=61083, set learning rate to 0.01
2017-12-09 01:26:23,324@connect4_zero.worker.optimize DEBUG # total step=61092, set learning rate to 0.01
2017-12-09 01:26:24,348@connect4_zero.worker.optimize DEBUG # total step=61101, set learning rate to 0.01
2017-12-09 01:26:25,340@connect4_zero.worker.optimize DEBUG # total step=61110, set learning rate to 0.01
2017-12-09 01:26:26,348@connect4_zero.worker.optimize DEBUG # total step=61119, set learning rate to 0.01
2017-12-09 01:26:27,426@connect4_zero.worker.optimize DEBUG # total step=61128, set learning rate to 0.01
2017-12-09 01:26:28,447@connect4_zero.worker.optimize DEBUG # total step=61137, set learning rate to 0.01
2017-12-09 01:26:29,479@connect4_zero.worker.optimize DEBUG # total step=61146, set learning rate to 0.01
2017-12-09 01:26:30,477@connect4_zero.worker.optimize DEBUG # total step=61155, set learning rate to 0.01
2017-12-09 01:26:31,463@connect4_zero.worker.optimize DEBUG # total step=61164, set learning rate to 0.01
2017-12-09 01:26:32,451@connect4_zero.worker.optimize DEBUG # total step=61173, set learning rate to 0.01
2017-12-09 01:26:33,432@connect4_zero.worker.optimize DEBUG # total step=61182, set learning rate to 0.01
2017-12-09 01:26:34,451@connect4_zero.worker.optimize DEBUG # total step=61191, set learning rate to 0.01
2017-12-09 01:26:35,612@connect4_zero.worker.optimize DEBUG # total step=61200, set learning rate to 0.01
2017-12-09 01:26:36,631@connect4_zero.worker.optimize DEBUG # total step=61209, set learning rate to 0.01
2017-12-09 01:26:37,683@connect4_zero.worker.optimize DEBUG # total step=61218, set learning rate to 0.01
2017-12-09 01:26:38,686@connect4_zero.worker.optimize DEBUG # total step=61227, set learning rate to 0.01
2017-12-09 01:26:39,676@connect4_zero.worker.optimize DEBUG # total step=61236, set learning rate to 0.01
2017-12-09 01:26:40,689@connect4_zero.worker.optimize DEBUG # total step=61245, set learning rate to 0.01
2017-12-09 01:26:41,721@connect4_zero.worker.optimize DEBUG # total step=61254, set learning rate to 0.01
2017-12-09 01:26:42,736@connect4_zero.worker.optimize DEBUG # total step=61263, set learning rate to 0.01
2017-12-09 01:26:43,839@connect4_zero.worker.optimize DEBUG # total step=61272, set learning rate to 0.01
2017-12-09 01:26:44,845@connect4_zero.worker.optimize DEBUG # total step=61281, set learning rate to 0.01
2017-12-09 01:26:45,958@connect4_zero.worker.optimize DEBUG # total step=61290, set learning rate to 0.01
2017-12-09 01:26:47,051@connect4_zero.worker.optimize DEBUG # total step=61299, set learning rate to 0.01
2017-12-09 01:26:48,097@connect4_zero.worker.optimize DEBUG # total step=61308, set learning rate to 0.01
2017-12-09 01:26:49,092@connect4_zero.worker.optimize DEBUG # total step=61317, set learning rate to 0.01
2017-12-09 01:26:50,120@connect4_zero.worker.optimize DEBUG # total step=61326, set learning rate to 0.01
2017-12-09 01:26:51,107@connect4_zero.worker.optimize DEBUG # total step=61335, set learning rate to 0.01
2017-12-09 01:26:52,238@connect4_zero.worker.optimize DEBUG # total step=61344, set learning rate to 0.01
2017-12-09 01:26:53,355@connect4_zero.worker.optimize DEBUG # total step=61353, set learning rate to 0.01
2017-12-09 01:26:54,373@connect4_zero.worker.optimize DEBUG # total step=61362, set learning rate to 0.01
2017-12-09 01:26:55,390@connect4_zero.worker.optimize DEBUG # total step=61371, set learning rate to 0.01
2017-12-09 01:26:56,362@connect4_zero.worker.optimize DEBUG # total step=61380, set learning rate to 0.01
2017-12-09 01:26:57,447@connect4_zero.worker.optimize DEBUG # total step=61389, set learning rate to 0.01
2017-12-09 01:26:58,422@connect4_zero.worker.optimize DEBUG # total step=61398, set learning rate to 0.01
2017-12-09 01:26:59,387@connect4_zero.worker.optimize DEBUG # total step=61407, set learning rate to 0.01
2017-12-09 01:27:00,391@connect4_zero.worker.optimize DEBUG # total step=61416, set learning rate to 0.01
2017-12-09 01:27:01,421@connect4_zero.worker.optimize DEBUG # total step=61425, set learning rate to 0.01
2017-12-09 01:27:02,373@connect4_zero.worker.optimize DEBUG # total step=61434, set learning rate to 0.01
2017-12-09 01:27:03,341@connect4_zero.worker.optimize DEBUG # total step=61443, set learning rate to 0.01
2017-12-09 01:27:04,395@connect4_zero.worker.optimize DEBUG # total step=61452, set learning rate to 0.01
2017-12-09 01:27:05,579@connect4_zero.worker.optimize DEBUG # total step=61461, set learning rate to 0.01
2017-12-09 01:27:06,550@connect4_zero.worker.optimize DEBUG # total step=61470, set learning rate to 0.01
2017-12-09 01:27:07,567@connect4_zero.worker.optimize DEBUG # total step=61479, set learning rate to 0.01
2017-12-09 01:27:08,662@connect4_zero.agent.model_connect4 DEBUG # save model to /home/brianr/eraseme/connect4-alpha-zero/data/model/next_generation/model_20171209-012708.662464/model_config.json
2017-12-09 01:27:08,720@connect4_zero.agent.model_connect4 DEBUG # saved model digest 728fd4e27242549e270a7743747e0aedabae246aad2b25050c2a426123312cf3
2017-12-09 01:27:08,720@connect4_zero.worker.optimize DEBUG # loading data from /home/brianr/eraseme/connect4-alpha-zero/data/play_data/play_20171209-012606.282920.json
2017-12-09 01:27:09,196@connect4_zero.worker.optimize DEBUG # removing data about /home/brianr/eraseme/connect4-alpha-zero/data/play_data/play_20171209-011137.511820.json from training set
2017-12-09 01:27:09,196@connect4_zero.worker.optimize DEBUG # updating training dataset
2017-12-09 01:27:09,204@connect4_zero.worker.optimize DEBUG # total step=61488, set learning rate to 0.01
2017-12-09 01:27:10,177@connect4_zero.worker.optimize DEBUG # total step=61497, set learning rate to 0.01
2017-12-09 01:27:10,237@connect4_zero.agent.model_connect4 DEBUG # loading model from /home/brianr/eraseme/connect4-alpha-zero/data/model/next_generation/model_20171209-012708.662464/model_config.json
2017-12-09 01:27:11,272@connect4_zero.worker.optimize DEBUG # total step=61506, set learning rate to 0.01
2017-12-09 01:27:12,259@connect4_zero.worker.optimize DEBUG # total step=61515, set learning rate to 0.01
2017-12-09 01:27:13,047@connect4_zero.agent.model_connect4 DEBUG # loaded model digest = 728fd4e27242549e270a7743747e0aedabae246aad2b25050c2a426123312cf3
2017-12-09 01:27:13,048@connect4_zero.worker.evaluate DEBUG # start evaluate model /home/brianr/eraseme/connect4-alpha-zero/data/model/next_generation/model_20171209-012708.662464
2017-12-09 01:27:13,232@connect4_zero.worker.optimize DEBUG # total step=61524, set learning rate to 0.01
2017-12-09 01:27:14,179@connect4_zero.worker.optimize DEBUG # total step=61533, set learning rate to 0.01
2017-12-09 01:27:15,064@connect4_zero.worker.evaluate DEBUG # game 0: ng_win=0 white_is_best_model=True winning rate 0.0%
2017-12-09 01:27:15,133@connect4_zero.worker.optimize DEBUG # total step=61542, set learning rate to 0.01
2017-12-09 01:27:15,397@connect4_zero.worker.evaluate DEBUG # game 1: ng_win=0 white_is_best_model=True winning rate 0.0%
2017-12-09 01:27:16,323@connect4_zero.worker.optimize DEBUG # total step=61551, set learning rate to 0.01
2017-12-09 01:27:16,367@connect4_zero.worker.evaluate DEBUG # game 2: ng_win=0 white_is_best_model=False winning rate 0.0%
2017-12-09 01:27:16,601@connect4_zero.worker.evaluate DEBUG # game 3: ng_win=0 white_is_best_model=True winning rate 0.0%
2017-12-09 01:27:16,889@connect4_zero.worker.evaluate DEBUG # game 4: ng_win=0 white_is_best_model=True winning rate 0.0%
2017-12-09 01:27:16,889@connect4_zero.worker.evaluate DEBUG # lose count reach 5 so give up challenge
2017-12-09 01:27:16,889@connect4_zero.worker.evaluate DEBUG # winning rate 0.0%
2017-12-09 01:27:16,890@connect4_zero.worker.evaluate INFO # There is no next generation model to evaluate
2017-12-09 01:27:17,410@connect4_zero.worker.optimize DEBUG # total step=61560, set learning rate to 0.01
2017-12-09 01:27:18,420@connect4_zero.worker.optimize DEBUG # total step=61569, set learning rate to 0.01
2017-12-09 01:27:19,393@connect4_zero.worker.optimize DEBUG # total step=61578, set learning rate to 0.01
2017-12-09 01:27:20,416@connect4_zero.worker.optimize DEBUG # total step=61587, set learning rate to 0.01
2017-12-09 01:27:21,600@connect4_zero.worker.optimize DEBUG # total step=61596, set learning rate to 0.01
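
Reading the timestamps in this log: the evaluator logs "save model to .../model_best_config.json" at 01:26:05,051 and "saved model digest" only at 01:26:10,180, while the self-play worker logs "loading model from" the same path at 01:26:06,376, inside that window. That suggests, without proving it, that the reader saw a partially written file. The failing offset (char 8193) sits just past 8 KiB, which would fit a partially flushed file, but this is a guess. Below is a minimal sketch of one common writer-side remedy: write to a temporary file in the same directory, then rename it over the target. The function name save_json_atomically is hypothetical and this is not how the project currently saves its config.

```python
import json
import os
import tempfile


def save_json_atomically(obj, path):
    """Hypothetical helper: write JSON so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem as the target,
    # otherwise the final rename is not atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wt") as f:
            json.dump(obj, f)
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before renaming
        # os.replace swaps the file in one step on POSIX systems, so a
        # concurrent reader sees either the old or the new content.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

For example, save_json_atomically(config_dict, "data/model/model_best_config.json") would replace a direct json.dump into the final path, where config_dict stands for whatever dictionary the model's config produces. The weight file written alongside the config would need the same treatment, since a reader could otherwise pair a new config with old or partial weights.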