stevegrossi opened this issue 6 years ago
I like it
If we only look at the moves available in the current turn, I don't think there will be too many. Going deeper into the future would make the AI smarter, but could also be computationally infeasible.
So, just as a creatively stimulating pipe dream (I'm not proposing we actually do this): to make calculating a ton of moves ahead more feasible, you could branch supervisors into timelines.
Let's say next move, the computer might attack territory 5 or territory 6. You start a supervising process to generate the possible outcomes of the attack on territory 5, and another for the attack on territory 6. Each of those in turn spins up child processes representing the possible outcomes of those decisions. Each child outcome process (or group of them) has a sibling or parent evaluation process it reports to. Those evaluation processes pass only the best outcomes up the chain, and at the top level we have some criterion for picking the timeline with the highest probability of the best outcomes.
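A minimal sketch of that branching idea using plain `Task`s. Everything game-related here is a made-up stub (a "state" is just a number, a move adds to it, the score is the state itself), and a real version would also alternate min/max for opponent turns rather than maximizing at every level:

```elixir
defmodule Timeline do
  # Branch into one task ("timeline") per possible move, recurse a few
  # levels deep, and pass only the best outcome back up the chain.
  def best_outcome(state, 0), do: {score(state), []}

  def best_outcome(state, depth) do
    state
    |> possible_moves()
    |> Task.async_stream(fn move ->
      {value, path} = best_outcome(apply_move(state, move), depth - 1)
      {value, [move | path]}
    end)
    |> Enum.map(fn {:ok, result} -> result end)
    |> Enum.max_by(fn {value, _path} -> value end)
  end

  # Stubs so the sketch runs: a "state" is a number, a move adds to it.
  defp possible_moves(_state), do: [-1, 1, 2]
  defp apply_move(state, move), do: state + move
  defp score(state), do: state
end

Timeline.best_outcome(0, 2)
#=> {4, [2, 2]}
```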
Anyway that might already be what you're thinking and I've only laid it out in broad strokes, but Elixir seems in principle extremely well suited for building AIs.
Also, if the paper already says what I just said but better, I didn't read it yet. On the list :)
"So just as a creatively stimulating pipe dream"
No, I was thinking the same thing! I think you're right that this is the kind of thing OTP would be perfect for. With the Task and Task.Supervisor modules we could pretty easily fan out potential-move evaluation across multiple processes (taking advantage of concurrency) and have them report back, and—who knows—maybe looking a couple moves ahead won't take eons and we'd have a pretty smart AI on our hands. I can't say I think more than a move or two ahead myself 😁
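For a single level of lookahead, that fan-out might look something like this. The move names and scores are hypothetical placeholders; the real version would apply each move to the current game state and score the result:

```elixir
{:ok, sup} = Task.Supervisor.start_link()

# Hypothetical scoring function: hard-coded values stand in for
# "apply this move to the game state and score the outcome".
score_after = fn
  :attack_territory_5 -> 0.7
  :attack_territory_6 -> 0.4
  :pass -> 0.1
end

# Evaluate each candidate move in its own supervised task,
# then keep the move with the highest score.
{best_move, _score} =
  sup
  |> Task.Supervisor.async_stream_nolink(
    [:attack_territory_5, :attack_territory_6, :pass],
    fn move -> {move, score_after.(move)} end
  )
  |> Enum.map(fn {:ok, pair} -> pair end)
  |> Enum.max_by(fn {_move, score} -> score end)

best_move
#=> :attack_territory_5
```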
I think the paper only looked at one level of the currently possible moves, but then again the author wasn't using Elixir 🎉
For reference, these are the "features" (things the AI optimizes for/against) from the paper:
Additional notes:
Currently, the AI is very procedural: if X is true, then do Y, etc. It does an okay job at optimizing for itself, but doesn't attempt to play against other players. This lets other players run away with the game.
What I think would be more interesting/effective is to:
1. build a way to assign an overall score to a game state
2. on an AI turn, loop over every possible action and compare the resulting state's score
3. take the move that results in the highest score
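The loop itself is tiny once a scoring function exists. A sketch with stubbed game functions (a "state" is just a number here, and the stubs are placeholders for the real game engine):

```elixir
# Stubs: a move adds to the state, and the score is the state itself.
possible_moves = [-2, 1, 3]
apply_move = fn state, move -> state + move end
score = fn state -> state end

state = 10

# Steps 2 and 3: try every action, keep the one with the best result.
best_move =
  Enum.max_by(possible_moves, fn move ->
    score.(apply_move.(state, move))
  end)

best_move
#=> 3
```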
It may lead to a less-predictable AI, since the AI is working toward a goal (maximizing its score) rather than just making the same moves in response to the same situations
It can lead to easily-customizable AI "dispositions". Simply by tweaking the value of a strong border or of continents not controlled by opponents, we can make a given AI more aggressive or defensive
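For example (the feature names and weights below are made up; the real features would come from the paper's list above), the same scoring code can produce different dispositions just by swapping the weight map:

```elixir
# Weighted sum of game-state features; the weights are the "disposition".
score = fn state, weights ->
  Enum.reduce(weights, 0.0, fn {feature, weight}, total ->
    total + weight * Map.get(state, feature, 0)
  end)
end

# Hypothetical presets: the aggressive AI hates enemy-held continents,
# the defensive AI prizes strong borders.
aggressive = %{own_armies: 1.0, enemy_continents: -5.0, strong_borders: 0.5}
defensive  = %{own_armies: 1.0, enemy_continents: -1.0, strong_borders: 3.0}

state = %{own_armies: 10, enemy_continents: 2, strong_borders: 4}

score.(state, aggressive)
#=> 2.0
score.(state, defensive)
#=> 20.0
```

The defensive AI rates this border-heavy state far more highly, so it would choose moves that preserve it, while the aggressive AI would rather spend those armies breaking enemy continents.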
Eventually, this can lead into machine learning where a neural network can learn—by playing against human opponents or itself—how to tweak these values to maximum effect.
This was an interesting paper on more-or-less this approach: http://ai.cs.unibas.ch/papers/theses/luetolf-bachelor-13.pdf