GAIL-4-BARK / bark-ml

Machine Learning Applied to Autonomous Driving
MIT License

Changed the tf2rl implementation to return trajectories during evaluation and added avg step count #56

Closed · MarcelBruckner closed this 4 years ago

MarcelBruckner commented 4 years ago

I wanted to implement the comparison between the expert trajectories and the agent trajectories. While searching for a way to get the actual agent trajectories during evaluation, I found that tf2rl does not expose them.

So I forked tf2rl and implemented some small changes so that the evaluate_policy function now also returns the trajectories generated during evaluation.

You can review the changes to tf2rl and see that it is only a minor change to the evaluate_policy function: instead of clearing the replay buffer after every episode, the generated trajectory is now saved from it.
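
To make the change concrete, here is a minimal sketch of the idea (not the exact diff in my fork): the transitions of each evaluation episode are kept and returned together with the average return, instead of being discarded when the buffer is cleared. `policy`, `env`, and `episodes` are placeholders for the trainer's test setup, and `policy.get_action` stands in for whatever the agent's action interface is.

```python
import numpy as np

def evaluate_policy(policy, env, episodes=10):
    """Run evaluation episodes and keep the generated trajectories."""
    trajectories = []  # one dict of arrays per evaluation episode
    returns = []
    for _ in range(episodes):
        obs = env.reset()
        episode = {"obs": [], "act": [], "rew": []}
        done = False
        while not done:
            act = policy.get_action(obs, test=True)
            next_obs, rew, done, _ = env.step(act)
            episode["obs"].append(obs)
            episode["act"].append(act)
            episode["rew"].append(rew)
            obs = next_obs
        # Keep the whole episode instead of clearing the buffer.
        trajectories.append({k: np.asarray(v) for k, v in episode.items()})
        returns.append(float(np.sum(episode["rew"])))
    return np.mean(returns), trajectories
```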

Finally, the average step count is calculated and written to TensorBoard.
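
For illustration, a sketch of that logging step, assuming a TF2-style summary writer; `summary_writer`, `total_steps`, and the scalar tag are placeholders, not the exact names in the code:

```python
import numpy as np
import tensorflow as tf

# Evaluate and derive the average episode length from the trajectories.
avg_return, trajectories = evaluate_policy(policy, env, episodes=10)
avg_steps = np.mean([len(traj["rew"]) for traj in trajectories])

# Log the value so it shows up next to the other evaluation scalars.
with summary_writer.as_default():
    tf.summary.scalar("Common/average_test_episode_length",
                      avg_steps, step=total_steps)
```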


What we can now do is: