edbeeching / godot_rl_agents

An open-source package that gives video game creators, AI researchers, and hobbyists the opportunity to learn complex behaviors for their Non-Player Characters or agents
MIT License

Bypass Godot physics timing for fast-as-possible training #41

Closed. jtatusko closed this issue 1 year ago.

jtatusko commented 1 year ago

I think it is possible to train faster by running physics in _process(delta) instead of _physics_process(delta) at training time. According to the docs: "Idle processing allows you to run code that updates a node every frame, as often as possible." We would also have to use a hardcoded delta of 1/60. Is this something you have already considered? I'm happy to investigate further if that would be useful.

Thanks for building and sharing this repo by the way. I'm looking forward to training some agents.

Cheers

edbeeching commented 1 year ago

Thanks, I have considered something similar. You can pass an option speedup=8 that will speed up the env by a factor of 8. Apparently 8 is the maximum due to some internal restrictions. I have not really tested it, though; see more details here.
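For reference, a minimal usage sketch of that option (the env path is a placeholder, and the import path is assumed from the repo layout):

from godot_rl.core.godot_env import GodotEnv  # assumed import path

# Launch the environment with an 8x speedup (the reported internal maximum).
env = GodotEnv(env_path="path/to/env.x86_64", speedup=8)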

The reason I did not use _process is that some observations require raycasts, and it is recommended that those run in the physics loop. We also want a consistent delta, which is not the case with the delta passed to _process.

I am interested to hear more ideas about this as there are definitely some improvements to be made.

jtatusko commented 1 year ago

[Plot: steps per second vs. the speedup parameter for the FlyBy environment]

Here are the steps per second I am getting as a function of the speedup parameter for a single environment. Looks like my CPU tops out around 145.

Here's the script I used to generate it:

import time

from godot_rl.core.godot_env import GodotEnv  # import path for GodotEnv (assumed from the repo layout)


def interactive(speedup: int):
    env = GodotEnv(env_path="../FlyBy.x86_64", speedup=speedup)
    obs = env.reset()
    t0 = time.perf_counter()
    for i in range(1000):
        # Sample one action per agent, then transpose from per-agent
        # tuples to the per-component layout that env.step expects.
        action = [env.action_space.sample() for _ in range(env.num_envs)]
        action = list(zip(*action))
        obs, reward, term, trunc, info = env.step(action)
    t1 = time.perf_counter()
    step_per_s = 1000 / (t1 - t0)
    env.close()
    return step_per_s

if __name__ == "__main__":
    import matplotlib.pyplot as plt

    xs = []
    ys = []
    # Benchmark speedup values 1 through 30.
    for i in range(30):
        xs.append(i + 1)
        ys.append(interactive(i + 1))

    plt.scatter(xs, ys)
    plt.xlabel("Speedup")
    plt.ylabel("Steps per Second")
    plt.show()

This is running on my laptop. Since there are 15 agents, that works out to 15 * 145 = 2175 agent steps/s. Pretty good! I'm doing some more testing tomorrow.

edbeeching commented 1 year ago

Thanks for doing this benchmark. Not so bad. There is also a frame skip that plays a role; I think the default is 8. Perhaps try increasing the number of agents as well?

jtatusko commented 1 year ago

No problem.

On further testing, I think the speedup argument is actually unnecessary. --fixed-fps will run physics as fast as the CPU can process it. This is done here in Godot. The issue is that the --fixed-fps argument is not being parsed correctly.

Changing

        if framerate is not None:
            launch_cmd += f" --fixed-fps={framerate}"

to

        if framerate is not None:
            launch_cmd += f" --fixed-fps {framerate}"

fixes the issue (a space instead of an equals sign).
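For concreteness, a small sketch of the difference: once the launch command string is split into argv, the '=' form keeps the flag and value fused into a single token, while the space form passes them as separate tokens, which is presumably what Godot's argument parser expects given the fix above:

import shlex

# '=' form: flag and value arrive as one argv token.
print(shlex.split("FlyBy.x86_64 --fixed-fps=60"))
# ['FlyBy.x86_64', '--fixed-fps=60']

# Space form: flag and value arrive as separate argv tokens.
print(shlex.split("FlyBy.x86_64 --fixed-fps 60"))
# ['FlyBy.x86_64', '--fixed-fps', '60']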

I'm benchmarking steps per second as a function of the number of agents next.

jtatusko commented 1 year ago

Environment steps per second = Gym environment steps
Agent steps per second = Gym environment steps * Number of agents

[Plot: agent steps per second vs. number of agents]

I used --disable-render-loop --headless --fixed-fps 60 to generate these results. Looks promising. I'm going to implement a replay system to test that real-time and hyper real-time generate the same states from a given action sequence.
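A minimal sketch of what such a replay check could look like (assuming the GodotEnv constructor used above; the rollout and obs_equal helpers are hypothetical, and this only verifies determinism if reset() restores an identical initial state):

import numpy as np

from godot_rl.core.godot_env import GodotEnv  # assumed import path


def rollout(env_path, actions, speedup):
    # Hypothetical helper: replay a fixed action sequence and record observations.
    env = GodotEnv(env_path=env_path, speedup=speedup)
    trace = [env.reset()]
    for action in actions:
        obs, reward, term, trunc, info = env.step(action)
        trace.append(obs)
    env.close()
    return trace


def obs_equal(a, b, atol=1e-6):
    # Recursively compare (possibly nested) observation structures with a float tolerance.
    if isinstance(a, dict):
        return a.keys() == b.keys() and all(obs_equal(a[k], b[k], atol) for k in a)
    if isinstance(a, (list, tuple)):
        return len(a) == len(b) and all(obs_equal(x, y, atol) for x, y in zip(a, b))
    return np.allclose(np.asarray(a, dtype=float), np.asarray(b, dtype=float), atol=atol)


if __name__ == "__main__":
    # Record one fixed action sequence, then replay it at 1x and 8x speed.
    ref = GodotEnv(env_path="../FlyBy.x86_64")
    ref.reset()
    actions = [
        list(zip(*[ref.action_space.sample() for _ in range(ref.num_envs)]))
        for _ in range(100)
    ]
    ref.close()

    real_time = rollout("../FlyBy.x86_64", actions, speedup=1)
    hyper = rollout("../FlyBy.x86_64", actions, speedup=8)
    print(all(obs_equal(a, b) for a, b in zip(real_time, hyper)))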

edbeeching commented 1 year ago

This is great work, thanks for this analysis. I always wondered why --fixed-fps was not working, but I never took the time to debug it. Let me know how it goes, and feel free to make a PR for the --fixed-fps fix.