Enhanced smart brain - Githubissues

Kavignon commented 7 years ago

In its current state, the advanced brain uses Q-learning with two learning rates. Alpha lets the brain make decisions based on whether or not a strategy is already in memory. Epsilon lets the brain keep exploring the environment and try strategies other than those in memory. This way, a strategy that was bad in the past may turn out to be better now, and the brain continues to learn on its own.
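For reference, here is a minimal sketch of what tabular Q-learning with epsilon-greedy exploration usually looks like. It is not the SmarTac code: the `QBrain` class, its method names, the default values and the discount factor `gamma` are all assumptions made for illustration.

```python
import random
from collections import defaultdict


class QBrain:
    """Illustrative tabular Q-learning agent (hypothetical, not the project's class)."""

    def __init__(self, actions, alpha=0.5, epsilon=0.1, gamma=0.9, seed=None):
        self.actions = list(actions)
        self.alpha = alpha      # how strongly a new outcome overwrites what is in memory
        self.epsilon = epsilon  # probability of trying something other than the remembered best
        self.gamma = gamma      # discount factor (assumed; the issue does not mention one)
        self.q = defaultdict(float)  # memory: (state, action) -> estimated value
        self.rng = random.Random(seed)

    def choose(self, state):
        # Explore with probability epsilon, otherwise exploit the best remembered action.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def learn(self, state, action, reward, next_state):
        # Standard Q-learning update: move the estimate toward reward + discounted best future value.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

With `epsilon` above zero the brain will occasionally replay a strategy it remembers as bad, which is what gives an old strategy the chance to prove itself again.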

Right now, the brain only manages a single character with a move range of 1. In the RPG, the brain will have to make decisions for a whole team of 5 to 8 characters, depending on the level the player is on. So the brain needs a third learning rate that helps it judge whether a team member's action was actually good for the team, and adjust its thinking accordingly. When the level starts, the brain will know what kind of characters the player has and adapt to them, assigning each team member a target that will be quick to eliminate based on personal stats and tactical advantages. For learning purposes, the brain should be tested both when every team member converges on as few targets as possible at the same time and when each member goes after their assigned target, to see which strategy is the most effective.

[ ] Add a new learning rate for the team's decisions (Mu)
[ ] Add a new learning rate to see whether it's better to attack a few targets at the same time or to attack assigned targets (Nu)
[ ] Handle a team and not a single character
[ ] Cross-check the best values for: Alpha, Epsilon, Mu and Nu
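
To make the two strategies being compared concrete, here is a rough sketch of how each could pick targets. Everything in it is an assumption: the dictionary fields (`attack`, `defense`, `hp`), the damage formula and the function names are hypothetical, and tactical advantages are left out entirely.

```python
import math


def turns_to_eliminate(attacker, target):
    # Assumed damage model: attack minus defense, never less than 1 per turn.
    damage = max(1, attacker["attack"] - target["defense"])
    return math.ceil(target["hp"] / damage)


def assign_targets(team, enemies):
    """Assigned-target strategy: each member takes the enemy it can eliminate fastest."""
    return {
        member["name"]: min(enemies, key=lambda e: turns_to_eliminate(member, e))["name"]
        for member in team
    }


def focus_target(team, enemies):
    """Converging strategy: the whole team picks the enemy that falls fastest to combined damage."""
    def team_turns(enemy):
        total = sum(max(1, m["attack"] - enemy["defense"]) for m in team)
        return math.ceil(enemy["hp"] / total)
    return min(enemies, key=team_turns)["name"]
```

Running both on the same starting position and comparing how many turns the fight takes would be one way to feed the Nu item above.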

Kavignon commented 7 years ago

The enhanced brain must be able to use pathfinding to find the best path to its target.
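
The issue doesn't say which algorithm to use. As one possible starting point, here is a sketch of A* on a 4-connected grid with a Manhattan-distance heuristic. The grid encoding (`0` = walkable, `1` = blocked) and the `find_path` name are assumptions.

```python
import heapq


def find_path(grid, start, goal):
    """A* on a 4-connected grid. Returns the list of cells from start to goal, or None."""
    def heuristic(cell):
        # Manhattan distance never overestimates on a 4-connected grid with unit steps.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    open_heap = [(heuristic(start), 0, start)]  # (estimated total, cost so far, cell)
    came_from = {start: None}
    best_cost = {start: 0}

    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:  # walk back to the start
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry, a cheaper route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_cost
                    came_from[(nr, nc)] = cell
                    heapq.heappush(open_heap, (new_cost + heuristic((nr, nc)), new_cost, (nr, nc)))
    return None  # goal is unreachable
```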

Kavignon commented 7 years ago

If there is enough time between now and December 9th, it would be pretty darn cool to add one last learning rate for the brain. That learning rate would be used to judge whether a character is moving safely, using a heatmap with positive values for chests, team members and targets, and negative values when moving closer to a tactical disadvantage, such as a bad job combination with a human character or a trap that is in sight. This way, the brain would adapt its pathfinding to move quickly but also as safely as possible towards a target.
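
One way the heatmap could plug into pathfinding is to turn it into a per-step cost and run Dijkstra over it. This is only a sketch under assumptions: `safety_weight` is a hypothetical stand-in for the proposed learning rate (fixed here, not learned), and the cost formula and the 0.1 floor are made up to keep step costs positive.

```python
import heapq


def safest_path(heat, start, goal, safety_weight=1.0):
    """Dijkstra over a grid where negative heat makes a cell costlier to enter."""
    rows, cols = len(heat), len(heat[0])
    dist = {start: 0.0}
    came_from = {start: None}
    heap = [(0.0, start)]

    while heap:
        d, cell = heapq.heappop(heap)
        if cell == goal:
            path = []
            while cell is not None:  # walk back to the start
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if d > dist[cell]:
            continue  # stale heap entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                # Positive heat (chest, ally, target) lowers the cost, negative heat (trap) raises it.
                step = max(0.1, 1.0 - safety_weight * heat[nr][nc])
                nd = d + step
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    came_from[(nr, nc)] = cell
                    heapq.heappush(heap, (nd, (nr, nc)))
    return None
```

With `safety_weight=0` this falls back to a plain shortest path. Raising it makes the character take longer detours around negative cells, which is the quick-versus-safe trade-off described above.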

GameOfLightAndShadows / SmarTac

Enhanced smart brain #83