[ROTKR-65] AI neural network control

hatfield-c commented 4 years ago

We need an architecture to handle the loading and management of AI neural networks for use on enemy ships.

The architecture should be reusable across multiple ships. Each ship should maintain a list of [neural network, behavior name/key] pairs, so that each desired behavior has an associated neural network. This also means there must be a way to swap which neural network the enemy is using, so that we don't waste processing power by running all of them at all times.

Some basic behaviors are:

Patrol
Chase/Combat
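
A minimal sketch of the "one network per behavior, only one running at a time" idea, written in Python rather than the project's engine code. The `BehaviorController` class, its method names, and the lambda "policies" are all hypothetical stand-ins; a real implementation would hold loaded model assets instead of plain functions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Stand-in for a loaded neural network: maps observations to actions.
Policy = Callable[[List[float]], List[float]]


@dataclass
class BehaviorController:
    """Holds one policy per behavior key and evaluates only the active one."""
    policies: Dict[str, Policy] = field(default_factory=dict)
    active_key: Optional[str] = None

    def register(self, key: str, policy: Policy) -> None:
        # Store the [neural network, behavior key] pair.
        self.policies[key] = policy

    def switch_to(self, key: str) -> None:
        # Swap the behavior in use; unknown keys are rejected early.
        if key not in self.policies:
            raise KeyError(f"unknown behavior: {key}")
        self.active_key = key

    def decide(self, observations: List[float]) -> List[float]:
        if self.active_key is None:
            raise RuntimeError("no behavior selected")
        # Only the active policy runs; the others cost nothing this step.
        return self.policies[self.active_key](observations)


# Hypothetical usage with dummy policies for the two basic behaviors.
controller = BehaviorController()
controller.register("Patrol", lambda obs: [0.5, 0.0])
controller.register("Chase/Combat", lambda obs: [1.0, 1.0])
controller.switch_to("Patrol")
```

Keying the policies by behavior name keeps the controller reusable across ships, since each ship only needs to supply its own set of pairs.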

hatfield-c commented 4 years ago

Each ActorShipManager will have a list of "Brains", one for each difficulty.

Each "Brain" will have a list of "Behaviors" we want the ship to exhibit, ranging from "Patrol" to "Combat".

Each "Behavior" will be composed of the following:

string Name : The name of the behavior
NNModel NeuralNetwork : The neural network used to make decisions
InferenceDevice InferenceDevice : The device on which decisions are made (e.g. GPU or CPU)

All of these fields should be configurable in the inspector. At runtime, the director will tell an instantiated ActorShipManager which difficulty to use; the ActorShipManager will pass the appropriate Brain to the ShipActor, which will then be responsible for managing its Behaviors.
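
The Brain/Behavior layout above could be sketched roughly as follows. This is Python used as neutral pseudocode, not the actual engine types: the `InferenceDevice` enum here is a stand-in, the model fields are plain strings in place of NNModel assets, and the difficulty labels and asset names ("easy", "hard", "patrol_easy", and so on) are made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class InferenceDevice(Enum):
    """Stand-in for the engine's device setting; not the real enum."""
    CPU = "cpu"
    GPU = "gpu"


@dataclass
class Behavior:
    name: str                      # e.g. "Patrol" or "Combat"
    neural_network: str            # placeholder for the NNModel asset
    inference_device: InferenceDevice


@dataclass
class Brain:
    difficulty: str
    behaviors: List[Behavior]

    def behavior(self, name: str) -> Behavior:
        # Look up a behavior by name within this brain.
        for candidate in self.behaviors:
            if candidate.name == name:
                return candidate
        raise KeyError(f"no behavior named {name!r}")


class ActorShipManager:
    """Holds one Brain per difficulty and hands out the requested one."""

    def __init__(self, brains: List[Brain]) -> None:
        self._brains: Dict[str, Brain] = {b.difficulty: b for b in brains}

    def brain_for(self, difficulty: str) -> Brain:
        return self._brains[difficulty]


# Hypothetical configuration, as it might be filled in from the inspector.
manager = ActorShipManager([
    Brain("easy", [Behavior("Patrol", "patrol_easy", InferenceDevice.CPU),
                   Behavior("Combat", "combat_easy", InferenceDevice.CPU)]),
    Brain("hard", [Behavior("Patrol", "patrol_hard", InferenceDevice.GPU),
                   Behavior("Combat", "combat_hard", InferenceDevice.GPU)]),
])
```

In this sketch the director would call `manager.brain_for(difficulty)` once at spawn time and pass the result on to the ship.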

hatfield-c commented 4 years ago

Note - the InferenceDevice field may or may not be serializable. If it is not, we can use a boolean named "GPU" instead: when it is true, decisions are made on the GPU; when it is false, the CPU is used.
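
If the boolean fallback is needed, the mapping is a one-liner. A sketch in Python, where the `InferenceDevice` enum and the `device_from_flag` helper are hypothetical stand-ins rather than engine API:

```python
from enum import Enum


class InferenceDevice(Enum):
    """Stand-in for the engine's device setting; not the real enum."""
    CPU = "cpu"
    GPU = "gpu"


def device_from_flag(gpu: bool) -> InferenceDevice:
    # True selects the GPU, False falls back to the CPU.
    return InferenceDevice.GPU if gpu else InferenceDevice.CPU
```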

hatfield-c commented 4 years ago

Enemy ships now select their neural network based on the difficulty they are instantiated with.

hatfield-c commented 4 years ago

The training environments for our behaviors will need to be set up so that a penalty is applied to the agent whenever it collides with the terrain. This requires creating an "agent_terrain" script to attach to the terrain.
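
A rough sketch of what the terrain penalty could look like, in Python rather than engine code. The `TrainingAgent` class, the `on_collision` hook, and the penalty value of -1.0 are all assumptions for illustration; the real script would hook into the engine's collision callbacks and the agent's reward API.

```python
from dataclasses import dataclass

TERRAIN_PENALTY = -1.0  # assumed value; would need tuning during training


@dataclass
class TrainingAgent:
    """Minimal stand-in for an agent that accumulates reward."""
    cumulative_reward: float = 0.0

    def add_reward(self, amount: float) -> None:
        self.cumulative_reward += amount


class AgentTerrain:
    """Attached to terrain; penalizes any agent that collides with it."""

    def __init__(self, penalty: float = TERRAIN_PENALTY) -> None:
        self.penalty = penalty

    def on_collision(self, other: object) -> None:
        # Ignore collisions with anything that is not a training agent.
        if isinstance(other, TrainingAgent):
            other.add_reward(self.penalty)
```

Putting the check on the terrain side means the agent does not need to know which objects count as terrain.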

We should begin by training a neural network to patrol by itself, in search of a target. Once that NN has been trained effectively, we can then reuse it, and have it trained with multiple enemy ships.

hatfield-c commented 4 years ago

A basic training scene and architecture have been set up. This architecture will likely need to be refined, but it serves as a good base for now.

hatfield-c commented 4 years ago

This issue is ready for QA. To QA this issue:

Verify NN architecture
1. Open up the poki_Agent prefab, under the Assets/Prefabs/Actors/Ships directory
2. Verify that it has a list of "brains" in its "ShipManager" component, and that each brain has an entry for a name, a difficulty level, and two NNs (one titled "patrol", the other titled "combat")

Check training scene
1. Open up the scene Assets/Scenes/Training/ShipTraining
2. Run the scene
3. Verify that the poki_Agent prefab is spawned in the scene
4. Verify that the TargetPrefab object is spawned in the scene
5. Verify that the TargetPrefab turns toward the "Destination" prefab, and moves toward the "Destination" prefab when within 30 degrees of it.
6. Verify that the "Destination" prefab chooses a new location once the TargetPrefab object comes close to it
7. Verify that the TargetPrefab orients itself and moves toward the new Destination location
8. If you notice the TargetPrefab randomly changing direction, slowing down suddenly, or missing the destination by a wide margin, this is intended behavior. This will help train the neural network to generalize its behavior, even when the player behaves seemingly irrationally or does something stupid.
9. Repeat this process, but with the "Scale" parameter of the "CollectionChamberWater" material set to 10.
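
For reference while checking steps 5 to 7, the expected target motion can be sketched as below. This is a 2D Python approximation, not the scene's actual script: it reads "within 30 degrees" as the angle between the target's heading and the direction to the Destination, and the turn rate, speed, and time step are made-up values.

```python
import math
from typing import Tuple

Vec2 = Tuple[float, float]


def signed_angle_deg(heading: float, bearing: float) -> float:
    """Smallest signed rotation from heading to bearing, in [-180, 180)."""
    return (bearing - heading + 180.0) % 360.0 - 180.0


def step_target(pos: Vec2, heading: float, dest: Vec2,
                turn_rate: float = 90.0, speed: float = 5.0,
                dt: float = 0.1, cone: float = 30.0) -> Tuple[Vec2, float]:
    """Turn toward dest; advance only when facing within `cone` degrees."""
    bearing = math.degrees(math.atan2(dest[1] - pos[1], dest[0] - pos[0]))
    error = signed_angle_deg(heading, bearing)
    # Rotate by at most turn_rate * dt degrees this step.
    max_turn = turn_rate * dt
    heading += max(-max_turn, min(max_turn, error))
    # Move forward along the current heading only inside the cone.
    if abs(signed_angle_deg(heading, bearing)) <= cone:
        rad = math.radians(heading)
        pos = (pos[0] + math.cos(rad) * speed * dt,
               pos[1] + math.sin(rad) * speed * dt)
    return pos, heading
```

The deliberate noise described in step 8 (random direction changes, sudden slowdowns) is not modeled here.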

hatfield-c / rotkr