edbeeching / godot_rl_agents

An Open Source package that allows video game creators, AI researchers and hobbyists the opportunity to learn complex behaviors for their Non Player Characters or agents
MIT License

Add sb3 sac onnx export #198

Closed Ivan-267 closed 1 month ago

Ivan-267 commented 1 month ago

This adds support for exporting SB3 SAC models to ONNX. The support is added to the existing ONNX export code.

To use it, change the model in the SB3 example from PPO to SAC (including the load call, if the resume functionality is used), and use the single obs env with MlpPolicy, e.g. as below:

import os
import pathlib

from stable_baselines3 import SAC

from godot_rl.wrappers.sbg_single_obs_wrapper import SBGSingleObsEnv


def handle_onnx_export():
    # Export the current model to ONNX if an export path was given
    if args.onnx_export_path is not None:
        path_onnx = pathlib.Path(args.onnx_export_path).with_suffix(".onnx")
        print("Exporting onnx to: " + os.path.abspath(path_onnx))
        # Set the last argument to True when using single obs env with MlpPolicy (for both PPO and SAC)
        export_model_as_onnx(model, str(path_onnx), True)


env = SBGSingleObsEnv(
    env_path=args.env_path, show_window=args.viz, seed=args.seed, n_parallel=args.n_parallel, speedup=args.speedup
)

model: SAC = SAC(
    "MlpPolicy",
    env,
    tensorboard_log=args.experiment_dir,
)
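One detail of the export path handling above may be worth illustrating: `pathlib.Path.with_suffix(".onnx")` replaces only the final suffix of the given path, so a name that already contains a dot loses its last dotted part. The sketch below uses only the standard library; the helper name `onnx_export_path` and the example paths are hypothetical and not part of the PR.

```python
import pathlib


def onnx_export_path(raw_path: str) -> pathlib.Path:
    """Return raw_path with its final suffix replaced by (or extended with) .onnx."""
    return pathlib.Path(raw_path).with_suffix(".onnx")


# Hypothetical example paths, chosen only to show the suffix behavior.
examples = ["model", "model.zip", "runs/ball_chase.v2"]
resolved = {p: onnx_export_path(p).as_posix() for p in examples}

for original, exported in resolved.items():
    print(f"{original} -> {exported}")
```

In the last example the `.v2` part is treated as a suffix and dropped, which may be surprising if the export name carries a dotted version tag.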

Example video of a trained onnx model running on the BallChase env. It was trained using SAC for ~1.7 hrs with 2 parallel envs and the settings below (the reward was also slightly changed):

    model: SAC = SAC(
        "MlpPolicy",
        env,
        gradient_steps=32,
        train_freq=32,
        verbose=2,
        learning_starts=250_000,
        tensorboard_log=args.experiment_dir,
    )

https://github.com/user-attachments/assets/de390e73-cef2-403d-8673-7614e4bcab5f
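As a rough aid for reading the settings above, the sketch below works out how `train_freq`, `gradient_steps` and `learning_starts` interact. It assumes SB3's off-policy loop counts `train_freq` in vectorized env steps (each step collecting one transition per env) and compares `learning_starts` against the total number of collected transitions. The function names are hypothetical, and `n_envs=2` is an assumption: the number of envs SB3 sees may differ from the number of parallel Godot instances if each instance holds several agents.

```python
import math


def updates_per_transition(train_freq: int, gradient_steps: int, n_envs: int) -> float:
    """Gradient updates performed per collected transition (update-to-data ratio)."""
    transitions_per_rollout = train_freq * n_envs  # one transition per env per step
    return gradient_steps / transitions_per_rollout


def approx_warmup_vec_steps(learning_starts: int, n_envs: int) -> int:
    """Approximate number of vectorized steps collected before training begins."""
    return math.ceil(learning_starts / n_envs)


# Values from the settings above; n_envs=2 is an assumption (see lead-in).
ratio = updates_per_transition(train_freq=32, gradient_steps=32, n_envs=2)
warmup = approx_warmup_vec_steps(learning_starts=250_000, n_envs=2)

print(f"updates per transition: {ratio}")
print(f"approx. warm-up vectorized steps: {warmup}")
```

Under these assumptions each rollout of 32 steps adds 64 transitions and is followed by 32 gradient updates, and training starts only after roughly 250,000 transitions have been collected.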

Ivan-267 commented 1 month ago

There is still some code polishing to do; I will push an update once it's done.