edbeeching / godot_rl_agents_examples

Example Environments for the Godot RL Agents library
MIT License

[Task] Add a way to store training parameters for envs #20

Open Ivan-267 opened 7 months ago

Ivan-267 commented 7 months ago

Opened so we can track and discuss @edbeeching's idea https://github.com/edbeeching/godot_rl_agents_examples/pull/19#issuecomment-1886444206:

Perhaps we can think of a better way to store the training parameters so that new users can have a rough idea of where to start.

I've thought about adding a copy of the Python training script used for the envs I make (most often the SB3 example; for some future envs I may use the CleanRL example, since a recent PR added ONNX export support to it as well, or Rllib once I start experimenting with it).

However, copying the entire script means the copies will become outdated whenever we change the examples in this repo (though they should still work, and keeping the copies frozen has the advantage that results are less likely to change if the "default" example script changes). In the case of the CleanRL example, most of the parameters can already be adjusted from the command line.

An instruction in the env's readme such as "modify these lines of the SB3 example" could work as an alternative, but it requires more work from the user.

We could perhaps implement a YAML (or any other format) config system to store the hyperparameters for each example, and then modify the SB3 example to parse and apply them (and potentially any other framework example as well, if needed). It doesn't have to support every possible parameter initially, just the ones that are tweaked frequently. Each env could then ship with an adjusted set of YAML hyperparameters.
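To make the idea concrete, here is a minimal sketch of what such a config and loader could look like. Everything in it is hypothetical: the file layout, the key names, the defaults, and the `load_hyperparameters` helper are not an existing convention in this repo. A real implementation would parse the file with PyYAML's `yaml.safe_load`; to keep the snippet dependency-free it handles only the flat `key: value` subset shown.

```python
# Sketch of a per-env hyperparameter config (hypothetical convention).
# A real implementation would use yaml.safe_load from PyYAML instead of
# the flat parser below.
import ast

EXAMPLE_CONFIG = """\
# hyperparameters.yaml shipped next to an env (hypothetical example)
learning_rate: 0.0003
n_steps: 32
ent_coef: 0.001
total_timesteps: 1000000
"""

# Defaults standing in for the values the SB3 example script already uses
# (assumed numbers, for illustration only).
DEFAULTS = {
    "learning_rate": 0.0003,
    "n_steps": 2048,
    "ent_coef": 0.0,
    "total_timesteps": 200000,
}

def parse_flat_yaml(text):
    """Parse the flat 'key: value' subset of YAML used above."""
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments / blanks
        if not line:
            continue
        key, _, value = line.partition(":")
        params[key.strip()] = ast.literal_eval(value.strip())
    return params

def load_hyperparameters(text, defaults):
    """Overlay the env's config values on the example script's defaults."""
    merged = dict(defaults)
    merged.update(parse_flat_yaml(text))
    return merged

params = load_hyperparameters(EXAMPLE_CONFIG, DEFAULTS)
# The SB3 example could then split these into constructor kwargs and
# learn() kwargs, e.g.:
total_timesteps = params.pop("total_timesteps")
# model = PPO("MultiInputPolicy", env, **params)
# model.learn(total_timesteps)
print(params["n_steps"], total_timesteps)  # → 32 1000000
```

Keeping unspecified keys at the script's defaults means a config only needs to list the handful of values that differ per env, which matches the "just the frequently tweaked ones" goal.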