Open alatyshe opened 6 years ago
I built go_game with cmake, then tried to run sh train_df.sh and got:
PID: 18580
========== Args ============
Loader: actor_only=False,list_file="/home/yuandong/local/go/go_gogod/train.lst",verbose=False,data_aug=-1,ratio_pre_moves=0,start_ratio_pre_moves=0.5,num_games_per_thread=5,move_cutoff=-1,mode="online",use_mcts=False,gpu=None
ContextArgs: num_games=512,batchsize=128,game_multi=None,T=1,eval=False,wait_per_group=False,num_collectors=0,verbose_comm=False,verbose_collector=False,mcts_threads=0,mcts_rollout_per_thread=1,mcts_verbose=False,mcts_save_tree_filename="",mcts_verbose_time=False,mcts_use_prior=False,mcts_pseudo_games=0,mcts_pick_method="most_visited"
MoreLabels: additional_labels=None
MultiplePrediction: multipred_no_backprop=False
Sampler: sample_policy="epsilon-greedy",greedy=False,epsilon=0.0,sample_nodes="pi,a"
ModelLoader: load=None,onload=None,omit_keys=None,no_bn=False,no_leaky_relu=False,num_layer=39,dim=128
ModelInterface: opt_method="adam",lr=0.001,adam_eps=0.001
Trainer: freq_update=1
Evaluator: keys_in_reply=""
Stats: trainer_stats="rewards"
ModelSaver: record_dir="./record",save_prefix="save",save_dir="./",latest_symlink="latest"
SingleProcessRun: num_minibatch=5000,num_episode=10000,tqdm=True
========== End of Args ============
#Game: 512
#Max_thread: 0
#Collectors: 0
T: 1
Wait per group: False
Maximal #moves (0 = no constraint): 0
#Threads: 0
#Rollout per thread: 1
Verbose: False, Verbose_time: False
Use prior: False
Persistent tree: False
#Pseudo game: 0
Pick method: most_visited
Loading /home/yuandong/local/go/go_gogod/train.lst failed!
Version: 6a769a02dc0ab11e5a7633c337b5d3ce0d0bf511_staged
Num Actions: 361
#recv_thread = 2
Group 0:
  Collector[0] Batchsize: 128 Info: [gid=0][T=1][name="human_actor"]
  Collector[1] Batchsize: 128 Info: [gid=1][T=1][name="human_actor"]
Group 1:
  Collector[2] Batchsize: 128 Info: [gid=2][T=1][name="actor"]
  Collector[3] Batchsize: 128 Info: [gid=3][T=1][name="actor"]
KEY HERE : train
self IDX: defaultdict(<class 'list'>, {'human_actor': [0, 1], 'actor': [2, 3]})
cb : <bound method Trainer.train of <rlpytorch.trainer.trainer.Trainer object at 0x7fe6209eeac8>>
Traceback (most recent call last):
  File "train.py", line 29, in <module>
    GC.reg_callback("train", trainer.train)
  File "/home/ubuntu/arvi_dima/ELF/elf/utils_elf.py", line 325, in reg_callback
    raise ValueError("Callback[%s] is not in the specification" % key)
ValueError: Callback[train] is not in the specification
The key passed in is "train", but the dict only contains: defaultdict(<class 'list'>, {'human_actor': [0, 1], 'actor': [2, 3]}). What is going on, and how can I fix it?
I have the same problem. Any help would be appreciated.
Try adding GC.reg_callback("train", None) before GC.start() in your script.
Also, ELF has been updated, and the codebase is here.
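For anyone wondering why the error appears at all: judging only from the traceback and the printed dict above, reg_callback seems to reject any key that is not among the collector names. The sketch below is a toy reproduction of that check, not ELF's actual code. ToyGameContext, its constructor, and its attribute names are made up for illustration; only the error message and the dict contents come from the log.

```python
from collections import defaultdict


class ToyGameContext:
    """Hypothetical stand-in for the ELF game context; not the real class."""

    def __init__(self, collector_names):
        # Map each collector name to the indices of the collectors using it,
        # mirroring the defaultdict printed in the log above.
        self.idx = defaultdict(list)
        for i, name in enumerate(collector_names):
            self.idx[name].append(i)
        self.callbacks = {}

    def reg_callback(self, key, cb):
        # Same condition and message as the ValueError in the traceback
        # (assumed from the log, not copied from ELF's source).
        if key not in self.idx:
            raise ValueError("Callback[%s] is not in the specification" % key)
        self.callbacks[key] = cb


gc = ToyGameContext(["human_actor", "human_actor", "actor", "actor"])

# "actor" is one of the known collector names, so this is accepted.
gc.reg_callback("actor", lambda batch: None)

# "train" is not among the known names, so this raises.
try:
    gc.reg_callback("train", lambda batch: None)
except ValueError as err:
    message = str(err)
    print(message)
```

In this toy version, only "human_actor" and "actor" can be registered, which matches the dict in the original report: the "train" key is simply absent from whatever specification the run ended up with.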