Data interpretation logged by tensorboard_log

uzh-rpg / agile_flight

Developing and Comparing Vision-based Algorithms for Vision-based Agile Flight

MIT License

Data interpretation logged by tensorboard_log #47

Closed HarukiKozukapenguin closed 2 years ago

HarukiKozukapenguin commented 2 years ago

Thank you for the interesting simulator!

I ran run_vision_ppo.py with the following command: python3 -m python.run_vision_ppo --render 0 --train 1. I found that the training output is saved in the envtest/python/saved directory (e.g. PPO_1, PPO_2). Each run contains the policies saved during training (/policy) and the test trajectories (/TestTraj). My questions are as follows.

Where is the reward logged during training?
What does each axis mean in the plots under TestTraj/Plots?
Which code defines the plotting and logging parameters?

yun-long commented 2 years ago

hi,

1) the training reward is logged in TensorBoard. You can go to the saved directory and run

tensorboard --logdir=./

2) the plots in the first row are position [x, y, z] and the plots in the second row are velocity [x, y, z]
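The layout described in 2) can be sketched as a 2x3 grid. This is only an illustration, not the repository's plotting code: `panel_labels`, `plot_trajectory`, and the `pos`/`vel` inputs are hypothetical names, and using the sample index as the horizontal axis is an assumption.

```python
# Layout from the answer above: row 0 = position, row 1 = velocity,
# columns = x, y, z.
ROWS = ("position", "velocity")
COLS = ("x", "y", "z")


def panel_labels():
    """Return a 2x3 grid of labels, one per subplot."""
    return [[f"{row} {col}" for col in COLS] for row in ROWS]


def plot_trajectory(pos, vel, out_path):
    """Plot per-axis position (top row) and velocity (bottom row).

    pos and vel are hypothetical inputs: sequences of (x, y, z) tuples.
    The horizontal axis is the sample index (an assumption).
    """
    # Imported lazily so the label helper works without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    labels = panel_labels()
    fig, axes = plt.subplots(2, 3, figsize=(12, 6))
    for r, series in enumerate((pos, vel)):
        for c in range(3):
            axes[r][c].plot([sample[c] for sample in series])
            axes[r][c].set_title(labels[r][c])
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)
```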

3) the plotting is done here

HarukiKozukapenguin commented 2 years ago

In which directory should I run this command (tensorboard --logdir=./)? Or should I run it before I run python3 -m python.run_vision_ppo --render 0 --train 1?

HarukiKozukapenguin commented 2 years ago

I moved into the saved/PPO_(num) directory and ran tensorboard --logdir=./, and now I can see how each reward changes during training. Thank you!
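For anyone else looking for the right directory, here is a minimal sketch that lists the PPO_(num) run directories and builds the matching command. The helper names `list_ppo_runs` and `tensorboard_command` are hypothetical; only the PPO_1, PPO_2 naming and the tensorboard --logdir flag come from this thread.

```python
import re
from pathlib import Path


def list_ppo_runs(saved_dir):
    """Return the PPO_<n> sub-directories of saved_dir, sorted by run number."""
    runs = []
    for child in Path(saved_dir).iterdir():
        match = re.fullmatch(r"PPO_(\d+)", child.name)
        if child.is_dir() and match:
            runs.append((int(match.group(1)), child))
    # Sort numerically so PPO_10 comes after PPO_2.
    return [path for _, path in sorted(runs)]


def tensorboard_command(run_dir):
    """Build the command that points TensorBoard at one run directory."""
    return f"tensorboard --logdir={run_dir}"


if __name__ == "__main__":
    # Path taken from the thread; adjust it to your checkout.
    saved = Path("envtest/python/saved")
    if saved.is_dir():
        for run in list_ppo_runs(saved):
            print(tensorboard_command(run))
```

TensorBoard searches the log directory recursively, so pointing --logdir at the saved directory itself should show all runs side by side.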

HarukiKozukapenguin commented 2 years ago

@yun-long

I have a question about interpreting TensorBoard. I can see how the rewards change while the agent learns in the simulation, but I do not know what they mean.

I think the reward is the sum of the rewards over each episode. Is that correct?
Are these rewards collected while the agent trains, or while the agent is evaluated?

yun-long commented 2 years ago

Hi,

the reward you see on Tensorboard is from here.

In summary, it contains the summed reward and each individual reward component. The reward is a training reward, not an evaluation reward.
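As a rough illustration of "summed reward plus individual components", the sketch below accumulates per-step reward components and emits one scalar per component and one for their sum. It is not the repository's code: `RewardLogger`, the `record(tag, value)` callback, the component names, the "rewards/" tag prefix, and averaging over steps are all assumptions made for the example.

```python
from collections import defaultdict


class RewardLogger:
    """Accumulate reward components and emit one scalar per component."""

    def __init__(self, record):
        # record is a hypothetical callback with signature record(tag, value);
        # in a real setup it would forward the scalar to a TensorBoard writer.
        self.record = record
        self.totals = defaultdict(float)
        self.steps = 0

    def add_step(self, components):
        """Add one step's reward components, e.g. {"progress": 0.1}."""
        for name, value in components.items():
            self.totals[name] += value
        self.steps += 1

    def flush(self):
        """Log the mean of each component and their sum, then reset."""
        if self.steps == 0:
            return {}
        means = {name: total / self.steps for name, total in self.totals.items()}
        means["total"] = sum(means.values())
        for name, value in means.items():
            self.record(f"rewards/{name}", value)
        self.totals.clear()
        self.steps = 0
        return means


if __name__ == "__main__":
    logger = RewardLogger(lambda tag, value: print(tag, value))
    logger.add_step({"progress": 1.0, "collision": -0.5})
    logger.add_step({"progress": 3.0, "collision": -0.5})
    logger.flush()
```

Whether the logged value is averaged per step or summed per episode depends on the linked code; the averaging here is only a placeholder choice.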

HarukiKozukapenguin commented 2 years ago

@yun-long Which code writes these rewards to TensorBoard?

yun-long commented 2 years ago

the code is here

HarukiKozukapenguin commented 2 years ago

Thank you!