xbpeng / DeepTerrainRL

terrain-adaptive locomotion skills using deep reinforcement learning
GNU Lesser General Public License v3.0

Run ./TerrainRL_Optimizer -arg_file= args/dog_slopes_mixed_args.txt command #32

Open wenyijiang opened 7 years ago

wenyijiang commented 7 years ago

Hi! When I run the ./TerrainRL_Optimizer -arg_file= args/dog_slopes_mixed_args.txt command, my computer starts computing something and prints values for Episodes, Cycles, and Avg dist. What do these values mean? Thanks a lot! Best wishes!

xbpeng commented 7 years ago

That arg file is mainly intended to be used by the TerrainRL.exe app. It doesn't do any training and just runs a policy. For training, you should use the arg files with "train" in the name.

wenyijiang commented 7 years ago

I want to train. When I run the ./TerrainRL_Optimizer -arg_file= args/opt_args_train_mace.txt command, it shows:

[libprotobuf ERROR google/protobuf/text_format.cc:274] Error parsing text-format caffe.NetParameter: 14:19: Message type "caffe.MemoryDataParameter" has no field named "label_size".
F0522 11:48:02.638092 14423 upgrade_proto.cpp:88] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: data/policies/dog/nets/dog_mace3_train.prototxt

Could you please tell me what is going wrong when training starts? Apart from training, the program runs normally. Thanks a lot!

nwcora commented 7 years ago

Hello, when I run ./TerrainRL_Optimizer -arg_file= args/dog_slopes_mixed_args.txt, the terminal displays output like this: Action: 2 Val: (0.993) 0.989 0.961 0.993
What do those parameters mean? And if it is running a policy, why is no simulation scene shown, as there is when I run ./TerrainRL -arg_file= args/sim_dog_args.txt? How can I see the simulation scenes?

xbpeng commented 7 years ago

Those parameters are the values output by each critic, and they show which actor was selected. In this case, Actor 2 is selected since its corresponding critic has the highest value. TerrainRL_Optimizer does not support any rendering, since it is mainly used for offline training. TerrainRL is the app for visualizing the policies.
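
The selection rule described above can be sketched in a few lines of Python. This is only an illustration, not code from the repository: the helper names `parse_action_line` and `best_actor` are hypothetical, the line format is taken from the sample output quoted earlier in this thread, and the sketch assumes that actor indices are zero-based and that the value in parentheses is the selected critic's value (both inferred from that single sample line).

```python
import re

def parse_action_line(line):
    """Parse a line such as 'Action: 2 Val: (0.993) 0.989 0.961 0.993'.

    Hypothetical helper: the format is inferred from one sample line only.
    """
    match = re.match(r"\s*Action:\s*(\d+)\s+Val:\s*\(([-\d.eE+]+)\)\s*(.*)", line)
    if match is None:
        raise ValueError("unrecognized line: " + line)
    action = int(match.group(1))               # index of the selected actor
    selected_value = float(match.group(2))     # value shown in parentheses
    critic_values = [float(tok) for tok in match.group(3).split()]
    return action, selected_value, critic_values

def best_actor(critic_values):
    """Return the (zero-based) index of the critic with the highest value."""
    return max(range(len(critic_values)), key=lambda i: critic_values[i])

action, selected, values = parse_action_line(
    "Action: 2 Val: (0.993) 0.989 0.961 0.993"
)
# Under the assumptions above, the printed action matches the argmax.
print(action, best_actor(values))
```

For the sample line, the third critic (index 2) has the highest value, 0.993, which matches the reported action.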

nwcora commented 7 years ago

I have run TerrainRL -dog_slopes_mixed.txt, so there is no training process and it is just running a policy? Where is the policy network? What's the difference between these two files: dog_slopes_mixed_args.txt and dog_slopes_mixed.txt?

xbpeng commented 7 years ago

Yes, that is just running a policy. For training, use the opt_args_train_*.txt files. You can find which policy those files are running by looking at the path specified by "-policy_model=".
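
As a rough sketch of that lookup, the snippet below scans arg-file text for the "-policy_model=" entry. It is not part of the repository: `find_arg` is a hypothetical helper, the sample contents (including the model path and the other key) are made up for illustration, and it assumes arguments are whitespace-separated tokens written as `-key= value` or `-key=value`, in the style of the `-arg_file= args/...` commands in this thread.

```python
def find_arg(arg_text, key):
    """Return the value given for '-key=' in arg-file text, or None.

    Hypothetical helper; assumes whitespace-separated '-key= value' tokens.
    """
    tokens = arg_text.split()
    prefix = "-" + key + "="
    for i, tok in enumerate(tokens):
        if tok.startswith(prefix):
            inline = tok[len(prefix):]
            if inline:
                # Form: -key=value (no space after '=')
                return inline
            # Form: -key= value (value is the next token).
            # A next token starting with '-' is treated as another key,
            # so negative numeric values are not handled by this sketch.
            if i + 1 < len(tokens) and not tokens[i + 1].startswith("-"):
                return tokens[i + 1]
            return None
    return None

# Made-up arg-file contents, for illustration only.
sample = """
-scenario= example_scenario
-policy_model= data/policies/dog/models/example_model.h5
"""

print(find_arg(sample, "policy_model"))
```

To use it on a real file, read the file's text first, for example with `open(path).read()`, and pass the result to `find_arg`.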

nwcora commented 7 years ago

Yeah, you are right. I thought the policy must be what you describe, but I don't understand the policy file, which is an .h5 file. What do those binaries mean? In my understanding, the policy is something like target joints, etc., so I am really confused. Maybe they are the network parameters?

xbpeng commented 7 years ago

The policy files contain the network weights. The output actions from the policies consist of joint angles and other parameters for the FSM.

nwcora commented 7 years ago

I see, but in args/sim_dog_args.txt there isn't any policy, so it seems the control is just the default output action, right? In the dog_character file there are three controllers: fast run, slow run, and jump. I guess they are used for forming the initial actions. What do the dog_motion file and state_file represent? The motion file is 24-dimensional and forms a loop, maybe the action loop? And the state_file is 23-dimensional for both pose and velocity; I am not sure what the numbers stand for.

xbpeng commented 7 years ago

Yes, args/sim_dog_args.txt doesn't have a policy; it is just running the default FSM. And yes, the dog_character file specifies the initial actions for the fast run, slow run, and jump. The motion files were used for some other things we were working on, so they are not relevant. The state file represents the initial state of the character at the start of each episode (pose and velocity).