Adding different training modes for different purposes

You may want to train your AI in a variety of situations. For example, many users want to train several genomes at the same time, while others do not know how to read out the necessary inputs and only want to define the outputs. Or maybe you don't know what the fitness function should look like and want to evaluate genomes manually instead.

Several test environments are also required to ensure that everything works as desired. Ping-Pong seems to be a relatively good and simple environment.

It would also be useful to be able to switch modes, for example if you want to let yane play against itself first and then play against yane yourself after training.
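Mode switching could be sketched roughly as follows. This is only an illustration of the idea, not yane's API: `Mode`, `Session`, `switch`, and `is_training` are hypothetical names, and the mode names are taken from the proposals in this issue.

```python
from enum import Enum, auto


class Mode(Enum):
    """Hypothetical mode names, taken from the ideas in this issue."""
    TOURNAMENT = auto()   # yane trains against itself
    SINGLE = auto()       # exactly one instance is trained at a time
    COMPETITIVE = auto()  # no training, a human plays against the best genome


class Session:
    """Hypothetical wrapper that remembers the active mode."""

    def __init__(self, mode: Mode = Mode.TOURNAMENT) -> None:
        self.mode = mode
        self.history = [mode]  # every mode this session has been in

    @property
    def is_training(self) -> bool:
        # Competitive mode only uses the trained genomes, it does not train.
        return self.mode is not Mode.COMPETITIVE

    def switch(self, mode: Mode) -> None:
        """Change the active mode and record the change."""
        self.mode = mode
        self.history.append(mode)


# Train by self-play first, then play against the result.
session = Session(Mode.TOURNAMENT)
session.switch(Mode.COMPETITIVE)
```

Keeping the mode as a plain value on the session means that switching would not require rebuilding the population, but that is a design assumption, not something the issue specifies.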

The following modes are currently only ideas; how usefully they can be implemented would have to be tested first.

[ ] Tournament mode: Yane trains against itself.
[ ] Single mode: Exactly one instance is trained at a time. This can be used for training on data sets if you don't want to worry about multithreading, or if you want to train yane yourself as a single person. Bear in mind, however, that this could take a relatively long time and that there are faster alternatives.
[ ] Competitive mode: Not really training, but more of an "I want to compete against the best genome" mode. You can switch to it once yane has been sufficiently trained and is ready to be used. No training takes place.
[ ] S mode (simulator mode): Useful if you can only train one instance at a time. It works exactly like single mode, but a second population, the simulator population, is trained alongside the main population. The simulator population creates a simulated world for genomes: genomes waiting in the queue are assigned a pseudofitness in advance, and genomes with a higher pseudofitness are prioritised for testing in the real environment. Because the simulation runs in the background, there should be little to no loss of performance. The main drawback is that the simulator takes a relatively long time to train, because you only learn whether a pseudofitness was well chosen once the genome has actually been tested. The simulator population therefore learns to assign each genome a pseudofitness as close as possible to its real fitness.
[ ] OSAE mode (observer, simulator, and evaluator mode): Similar to S mode, but the fitness is determined manually by a human. While a genome is being trained, the human rates how well it did. A third population, the evaluator, is also created; it learns to evaluate genomes the way the human would. Useful if you don't know how to define the fitness function. It might look like this: each output of a genome is rated good, neutral, or bad, and the human can press one of three fixed buttons at any time to rate what the genome just did. The evaluator is trained over time, so the human has to take part in the evaluation less and less.
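The queue prioritisation described for S mode could be sketched as below. This is a minimal sketch under assumptions: `PseudoFitnessQueue` and `simulator_error` are hypothetical names, genomes are represented by plain strings, and the absolute difference is just one possible way to measure how far the pseudofitness was from the real fitness.

```python
import heapq
from itertools import count


class PseudoFitnessQueue:
    """Hypothetical queue: genomes with a higher pseudofitness are tested first."""

    def __init__(self) -> None:
        self._heap = []
        self._tie = count()  # insertion counter, breaks ties between equal scores

    def push(self, genome, pseudofitness: float) -> None:
        # heapq is a min-heap, so the score is negated to pop the highest first.
        heapq.heappush(self._heap, (-pseudofitness, next(self._tie), genome))

    def pop(self):
        """Return (genome, pseudofitness) for the most promising genome."""
        neg_score, _, genome = heapq.heappop(self._heap)
        return genome, -neg_score

    def __len__(self) -> int:
        return len(self._heap)


def simulator_error(pseudofitness: float, real_fitness: float) -> float:
    """Assumed error measure: the simulator population would try to minimise this."""
    return abs(pseudofitness - real_fitness)


queue = PseudoFitnessQueue()
queue.push("genome_a", 0.2)
queue.push("genome_b", 0.9)
queue.push("genome_c", 0.5)

genome, predicted = queue.pop()  # most promising genome goes to the real environment
real_fitness = 0.7               # placeholder for the result of the real test
error = simulator_error(predicted, real_fitness)
```

The error only becomes available after the real test, which is why the simulator would learn slowly, as noted above.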

Resch-Said / yane, issue #26