uzh-rpg / agile_flight

Developing and Comparing Vision-based Algorithms for Vision-based Agile Flight
MIT License

Data interpretation logged by tensorboard_log #47

Closed HarukiKozukapenguin closed 2 years ago

HarukiKozukapenguin commented 2 years ago

Thank you for the interesting simulator!

I ran run_vision_ppo.py with the following command: python3 -m python.run_vision_ppo --render 0 --train 1. During training, data was written to the envtest/python/saved directory (e.g. PPO_1, PPO_2). There I found the policies saved during training (/policy) and the test trajectories (/TestTraj). My questions are as follows.

  1. Where is the reward logged over the course of training?
  2. What does each axis mean in the graphs under TestTraj/Plots?
  3. Which code defines the plotting and logging parameters?

yun-long commented 2 years ago

hi,

1) the training reward is logged in TensorBoard. you can go to the saved directory and run

tensorboard --logdir=./

2) the plots in the first row are position [x, y, z] and the plots in the second row are velocity [x, y, z]

3) the plotting is done here
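The layout described in 2) can be sketched as a small lookup. This is only an illustration of the 2x3 grid as described above; the helper name `panel_label` and the label strings are assumptions, not the repository's plotting code.

```python
# Hypothetical helper: maps a (row, col) panel of the 2x3 TestTraj figure
# to the quantity it shows, following the description in the thread.
QUANTITIES = ("position", "velocity")  # first row, second row
AXES = ("x", "y", "z")                 # left, middle, right column


def panel_label(row: int, col: int) -> str:
    """Return a label such as 'position x' for the panel at (row, col)."""
    return f"{QUANTITIES[row]} {AXES[col]}"


if __name__ == "__main__":
    # Print the labels row by row, mirroring the figure layout.
    for r in range(len(QUANTITIES)):
        print([panel_label(r, c) for c in range(len(AXES))])
```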

HarukiKozukapenguin commented 2 years ago

  1. In which directory should I run this command (tensorboard --logdir=./)? Or should I run it before I run python3 -m python.run_vision_ppo --render 0 --train 1?

HarukiKozukapenguin commented 2 years ago

I moved into the saved/PPO_(num) directory and ran tensorboard --logdir=./ and now I can see how each reward changes over training. Thank you!
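For anyone repeating this, a minimal sketch of finding the most recent run directory and printing the matching TensorBoard command is shown here. It assumes the run folders are named PPO_<number> as in the examples above; the helper name `latest_run_dir` is hypothetical, and the script only prints the command rather than launching TensorBoard.

```python
import re
from pathlib import Path


def latest_run_dir(saved_dir):
    """Return the PPO_<n> subdirectory with the largest n, or None if absent."""
    root = Path(saved_dir)
    if not root.is_dir():
        return None
    runs = []
    for child in root.iterdir():
        match = re.fullmatch(r"PPO_(\d+)", child.name)
        if child.is_dir() and match:
            runs.append((int(match.group(1)), child))
    if not runs:
        return None
    # Pick the run with the highest numeric suffix.
    return max(runs, key=lambda item: item[0])[1]


if __name__ == "__main__":
    # Path taken from the thread; adjust it to your checkout.
    run = latest_run_dir("envtest/python/saved")
    if run is None:
        print("no PPO_<n> run directory found")
    else:
        print(f"tensorboard --logdir={run}")
```

Sorting on the numeric suffix rather than the name keeps PPO_10 after PPO_2.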

HarukiKozukapenguin commented 2 years ago

@yun-long

I have a question about interpreting TensorBoard. I can see how the rewards change while the agent learns in the simulation, but I do not know what they mean.

  1. I think the reward is the sum of the rewards over each episode. Is that correct?
  2. Is this the reward from training the agent, or from evaluating it?

yun-long commented 2 years ago

Hi,

the reward you see on Tensorboard is from here.

In summary, it contains the summed reward and each individual reward component. The reward is a training reward, not an evaluation reward.
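The relationship between the summed reward and its components can be sketched as follows. The component names and values are hypothetical, and the functions `total_reward` and `scalars_to_log` are illustrative assumptions; this is not the repository's reward or logging code, only the "components plus their sum" idea described above.

```python
def total_reward(components):
    """Sum the individual reward terms into one scalar reward."""
    return sum(components.values())


def scalars_to_log(components):
    """Return every scalar to record: each component plus their sum."""
    scalars = dict(components)  # copy so the caller's dict is not modified
    scalars["total"] = total_reward(components)
    return scalars


if __name__ == "__main__":
    # Hypothetical component names and values, for illustration only.
    step_rewards = {"progress": 0.5, "collision": -0.25, "command": -0.125}
    for name, value in scalars_to_log(step_rewards).items():
        print(name, value)
```

Each entry would then appear as its own curve in TensorBoard, which is why both the total and the separate components are visible.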

HarukiKozukapenguin commented 2 years ago

@yun-long Which code writes these rewards to TensorBoard?

yun-long commented 2 years ago

hi

the code is here

HarukiKozukapenguin commented 2 years ago

Thank you!