Implement TRPO in Blender Game Engine tensorflow runtime

FTC-8856 / FTC-Robot-Controller

Over the years, we have mastered misusing hardware and software in the best way possible. Here you can follow our coding progress, from where we started to the wildly different place we ended up taking it.


Implement TRPO in Blender Game Engine tensorflow runtime #1

Open ohmahgawditbob opened 3 years ago

ohmahgawditbob commented 3 years ago

This will be an especially interesting task, since I believe the TRPO code was originally made for OpenAI Gym sessions, and I do not think we should try to cobble together a data structure to spoof a Gym.

We will need to:

[ ] Strip the original TRPO program down to the parts that we want
[ ] Take in the score and output which values need to be changed. This means we probably need to:
[ ] Also input the state of the neural network
[ ] Use the TRPO output state as the new neural network state

This list is likely to change once discussion opens, bringing out all of the mindless things I had stated in the original To-do list.

brightly-salty commented 3 years ago

Really, TRPO only needs the following from a gym env:

obs_dim and act_dim (the shapes of the observation space and action space)
the ability to reset the env (env.reset())
the ability to render the env if we need to animate (env.render())
the ability to step the env with an action generated by the policy (env.step(action.flatten()))

We just need to make some functions to replace these function calls.
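As a rough illustration of what such replacement functions could look like, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the names `SimEnvAdapter` and `ToyPoint` are hypothetical, the toy simulation stands in for whatever the Blender side would actually expose, and the `(obs, reward, done, info)` return shape follows the classic Gym `step` convention, which the TRPO code in question may or may not rely on.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Observation = List[float]
StepResult = Tuple[Observation, float, bool]


class SimEnvAdapter:
    """Hypothetical stand-in for the parts of a Gym env that TRPO touches."""

    def __init__(
        self,
        obs_dim: int,
        act_dim: int,
        reset_fn: Callable[[], Sequence[float]],
        step_fn: Callable[[Sequence[float]], StepResult],
        render_fn: Optional[Callable[[], None]] = None,
    ) -> None:
        self.obs_dim = obs_dim      # length of one observation vector
        self.act_dim = act_dim      # length of one action vector
        self._reset_fn = reset_fn   # puts the simulation in its start state
        self._step_fn = step_fn     # advances the simulation by one action
        self._render_fn = render_fn

    @staticmethod
    def _check(values: Sequence[float], expected: int, what: str) -> None:
        if len(values) != expected:
            raise ValueError(f"{what} has length {len(values)}, expected {expected}")

    def reset(self) -> Observation:
        """Replacement for env.reset(): returns the first observation."""
        obs = list(self._reset_fn())
        self._check(obs, self.obs_dim, "observation")
        return obs

    def step(self, action: Sequence[float]):
        """Replacement for env.step(action): expects an already flattened action."""
        action = list(action)
        self._check(action, self.act_dim, "action")
        obs, reward, done = self._step_fn(action)
        obs = list(obs)
        self._check(obs, self.obs_dim, "observation")
        # Classic Gym convention (assumed): observation, reward, done flag, info dict.
        return obs, float(reward), bool(done), {}

    def render(self) -> None:
        """Replacement for env.render(): does nothing if no renderer was given."""
        if self._render_fn is not None:
            self._render_fn()


class ToyPoint:
    """Toy 1-D simulation used only to exercise the adapter."""

    def __init__(self, max_steps: int = 3) -> None:
        self.x = 0.0
        self.t = 0
        self.max_steps = max_steps

    def reset(self) -> Observation:
        self.x, self.t = 1.0, 0
        return [self.x]

    def step(self, action: Sequence[float]) -> StepResult:
        self.x += action[0]
        self.t += 1
        # Reward is higher the closer the point is to the origin.
        return [self.x], -abs(self.x), self.t >= self.max_steps


sim = ToyPoint()
env = SimEnvAdapter(obs_dim=1, act_dim=1, reset_fn=sim.reset, step_fn=sim.step)
first_obs = env.reset()
next_obs, reward, done, info = env.step([-0.5])
```

The adapter takes plain callables so that the simulation side can be swapped out without touching the training loop; whether that split suits the real Blender runtime is an open question.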

ohmahgawditbob commented 3 years ago

Phew! https://github.com/LouisFoucard/gym-blender will allow me to not have to completely demolish the TRPO code. That was a close one.