Minigame map issues (MAP DESIGN) and Sentry Defense exploitation (THEORY that affects MAP DESIGN)

SoyGema commented 6 years ago

Hi there! I am working on a minimap for exploiting the sentry's defense functions (SentryDefense), and I am having problems with the configuration of the Init triggers.

Are you an experienced game map creator who could help me with this? I am basing this design on the DefeatRoaches minimap. My map design questions are about:
-- The Init triggers visual script
-- Calculating the balance parameters of the melee
-- Changing the reward
https://github.com/SoyGema/Startcraft_pysc2_minigames/tree/master/new_minigames/SentryDefense

Can anyone with DRL theoretical expertise give me guidelines on this?

-- Should the Q-network implementation focus on the sentry unit, or shall the actions just be defined and the agent left to learn them?
-- Shall the map be unbalanced for the Protoss units in order to force the sentry units to use abilities like force field, hallucination and guardian shield?
-- Would it be useful to produce more minimaps with the same goal but with different environments?
-- Shall minimap designers focused on RL research build more general melee maps in order not to produce subgoals that might affect generalization?

Please let me know if I have significantly misunderstood any concepts. I am basing these ideas on the following readings:

- A Brief Survey of Deep Reinforcement Learning
- Reinforcement Learning: An Introduction
- The pysc2 DeepMind paper

Thanks for creating this amazing RL environment!!^^

simonMoisselin commented 6 years ago

I think solving a mini-game where you have only 1 sentry unit, and the goal is to launch a force field close to a given position, is already a good challenge (the closer you are, the higher the reward). We can restrict the action space, for example by not allowing camera movement.
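A minimal sketch of the distance-based reward described above, assuming a linear falloff. The function name, the `max_distance` cutoff and the coordinate units are assumptions for illustration, not part of any existing map or of pysc2.

```python
import math


def force_field_reward(cast_xy, target_xy, max_distance=20.0):
    """Return a reward in [0, 1] that grows as the cast lands closer to the target.

    Hypothetical shaping: 1.0 on the target, falling linearly to 0.0 at
    max_distance (an assumed cutoff) and staying at 0.0 beyond it.
    """
    dx = cast_xy[0] - target_xy[0]
    dy = cast_xy[1] - target_xy[1]
    distance = math.hypot(dx, dy)
    return max(0.0, 1.0 - distance / max_distance)


print(force_field_reward((10, 10), (10, 10)))  # 1.0
print(force_field_reward((10, 10), (13, 14)))  # 0.75
```

A shaped reward like this gives the agent a signal on every cast, at the cost of being harder to express in the map's score triggers than a plain win/loss reward.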

I can design this minigame if you want. The difficult part will be implementing the algorithm!

I really think that your map is too complex for current state-of-the-art RL algorithms.

SoyGema commented 6 years ago

Ey! Thanks for the comments and honest feedback ^^ I understand and somewhat agree with your statement.

I assume then that your answer is that isolating the sentry functions in different minimaps is the correct option for this goal.

Would a 1vs1 imbalanced melee be good for hallucination? For example, a 2 sentries vs 5 zerglings melee.

Could both of us work on this minimap? I would love to collaborate! ^^

SoyGema commented 6 years ago

Ey @Entruv! Hope everything is going awesome. I've given careful thought to the ongoing conversation about the sentry unit.

In RL, the set of actions must be defined in order to compute the probability of choosing each action. If the sentry ability is set only to guardian shield, the agent has no opportunity to choose between force field and hallucination. So in the sentry actions I should include both hallucination and guardian shield.

As far as I can calculate, the total number of actions would be 14, including all hallucinations: https://github.com/SoyGema/Startcraft_pysc2_minigames/blob/master/Agents/SentinelDefense.py

Another option is to use a single random hallucination in order to simplify the options. In that case the actions would be reduced to 6. Does that level of complexity work for you? In Atari games I've seen agents deal with 4 or 5 actions.
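A minimal sketch of the two options, assuming the action set is a flat list of names. All names below are placeholders and are NOT copied from SentinelDefense.py; the lists are only sized so that the totals match the 14 and 6 mentioned above, and the real split in the linked agent may differ.

```python
import random

# Placeholder names (assumption), not the identifiers used in the linked agent.
BASE_ACTIONS = ["no_op", "select_army", "attack", "force_field", "guardian_shield"]
HALLUCINATIONS = [
    "hallucinate_probe", "hallucinate_zealot", "hallucinate_stalker",
    "hallucinate_immortal", "hallucinate_high_templar", "hallucinate_archon",
    "hallucinate_void_ray", "hallucinate_phoenix", "hallucinate_colossus",
]


def full_action_set():
    """Every hallucination is its own action."""
    return BASE_ACTIONS + HALLUCINATIONS


def reduced_action_set():
    """All hallucinations collapse into one macro-action."""
    return BASE_ACTIONS + ["hallucinate_random"]


def resolve(action, rng=random):
    """Turn the macro-action into a concrete hallucination; pass others through."""
    if action == "hallucinate_random":
        return rng.choice(HALLUCINATIONS)
    return action


print(len(full_action_set()), len(reduced_action_set()))  # 14 6
```

With the reduced set the agent only learns when to hallucinate, not which unit to hallucinate, which is the trade-off between the two options.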

OriolVinyals commented 6 years ago

This is a perfect example of what a new mini-game could look like. Creating an imbalanced situation which forces the use of hallucination seems like a great start. Making a special action to select a random hallucination is a good idea, but most current RL algos should be able to deal with deciding which one is best to win that combat.

If you create the minimap I'm happy to give it a try using our codebase : )

simonMoisselin commented 6 years ago

Hello @SoyGema !

At first I was thinking of creating a complicated reward function depending on where you launch the force field. But I think we need to keep the reward function simple.

I will read your code on the sentry agent later when I have time.

I created the 2 sentries vs 5 zerglings minigame (https://github.com/entruv/minigames_pysc2). Do not hesitate to test it and suggest improvements.

Very curious to see the agent's behavior trained from the codebase of @OriolVinyals :)

SoyGema commented 6 years ago

Ey @entruv! As I see some issues in the map, I am going to open an issue in your repo. I also encourage you to read https://guides.github.com

SoyGema commented 6 years ago

You can find an iteration of the ForceField map at https://github.com/SoyGema/Startcraft_pysc2_minigames/tree/master/new_minigames/SentryForceField and in a pull request to your repository.

With that in mind, I'm moving forward to the hallucination map.

OriolVinyals commented 6 years ago

Hi,

Since you seem to be doing pretty cool stuff with the PySC2, I'd recommend checking out and applying to come work with us at Blizzcon in LA (Nov 3/4). Spots limited, and some travel funds available!

Info: http://us.battle.net/sc2/en/blog/21048078/announcing-the-starcraft-ii-ai-workshop-10-4-2017

GL&HF, Oriol

SoyGema commented 6 years ago

Greetings, I am posting the first iteration of the hallucination sentry map, HallucinIce. I worked on the imbalanced melee by taking into account both the protection and the damage produced by the units, and by selecting Terran units that are strong against the sentry according to http://starcraft.wikia.com/wiki/Sentry .

With a second iteration of this I will move on to building an agent. According to the game mechanics, the better unit to hallucinate would be the Stalker. https://github.com/SoyGema/Startcraft_pysc2_minigames/tree/master/new_minigames/SentryHallucination

SoyGema commented 6 years ago

Ey there! I attach some answers I've found and some tips.

Tips for designing new mini-maps

- Even if it is a mini-game, design towards a goal, whether it is to study a given unit or to reach a production milestone.
- You can first study the existing mini-maps and then change the melee, taking unit balance into account (damage per unit, speed, and StarCraft II gameplay balancing) from sources like the Battle.net forums or Liquipedia, to settle the melee or the production milestone.
- The triggers section will carry most of the design load. This visual scripting interface is best understood through an example; I highly recommend DefeatRoaches as a start.
- Make testing the map with a random agent part of the design. The restart-map trigger and the playable size conditions will probably change (among other things).
- Pay special attention to these triggers: Init (unit and map initialization), ScoreUpdatesandVictory (sets the rewards), Defeat, and ResetMap (for the situation after the mini-game is won or lost).
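The trigger roles and the random-agent test from the tips above can be mimicked in plain Python. This is only a toy stand-in, assuming made-up kill odds and a step limit; it does not use the SC2 editor or pysc2, and the class and method names are hypothetical.

```python
import random


class ToyMiniGame:
    """Plain-Python stand-in for a mini-game map (not the SC2 editor or pysc2)."""

    def __init__(self, own_units=2, enemy_units=5, max_steps=50):
        self.start = (own_units, enemy_units)
        self.max_steps = max_steps
        self.init()

    def init(self):
        # Plays the role of the Init trigger: spawn both sides, clear counters.
        self.own, self.enemy = self.start
        self.score = 0
        self.steps = 0

    def step(self, action, rng):
        """Apply one action and return (reward, done). The odds are invented."""
        self.steps += 1
        reward = 0
        kill_chance = 0.6 if action == "ability" else 0.3
        if rng.random() < kill_chance:
            self.enemy -= 1
            self.score += 1
            reward = 1  # role of ScoreUpdatesandVictory: one point per kill
        elif rng.random() < 0.2:
            self.own -= 1
        victory = self.enemy == 0
        # Role of Defeat: no units left, or the step limit ran out.
        defeat = self.own == 0 or self.steps >= self.max_steps
        return reward, victory or defeat

    def reset_map(self):
        # Plays the role of ResetMap: return to the Init state.
        self.init()


def smoke_test(episodes=20, seed=0):
    """Run a random agent and check that every episode ends and resets cleanly."""
    rng = random.Random(seed)
    game = ToyMiniGame()
    scores = []
    for _ in range(episodes):
        done = False
        while not done:
            _, done = game.step(rng.choice(["attack", "ability"]), rng)
        scores.append(game.score)
        game.reset_map()
        assert (game.own, game.enemy) == game.start
    return scores


print(smoke_test())
```

Running a random policy first surfaces map problems, such as episodes that never end or a reset that leaves stale units, before any learning agent is involved.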

Questions regarding theory

-- Should the Q-network implementation focus on the sentry unit, or shall the actions just be defined and the agent left to learn them? I recommend this repository for learning about Q-learning methods and implementations, https://github.com/dennybritz/reinforcement-learning , and the cris-cris pysc2 agent implementation as an example, BEFORE building an agent or in parallel with building the environment. So far I've started with a custom scripted_agent like the one in the pysc2 repo. For now there are common net architectures defined that might be combined with certain actions in order to focus on certain goals.
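As a minimal sketch of the second option (define the actions and let the agent learn their values), here is tabular Q-learning with an epsilon-greedy policy. It uses a lookup table rather than a Q-network to stay self-contained; the action names, state labels and hyperparameters are placeholders, not taken from any of the repositories mentioned.

```python
import random
from collections import defaultdict

# Placeholder action names (assumption), not pysc2 function identifiers.
ACTIONS = ["attack", "force_field", "guardian_shield", "hallucinate"]


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise pick the best-valued action."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """One Q-learning step: move Q(s, a) towards reward + gamma * max_a' Q(s', a')."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])


q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
q_update(q, "s0", "force_field", reward=1.0, next_state="s1")
print(choose_action(q, "s0", epsilon=0.0, rng=random.Random(0)))  # force_field
```

Nothing in the update is specific to the sentry: the unit only enters through which actions are listed, which is why the map design and the action definition carry most of the unit-specific work.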

-- Shall the map be unbalanced for the Protoss units in order to force the sentry units to use abilities like force field, hallucination and guardian shield? Yes, imbalance by design is a good start for learning certain functions in a melee situation.

-- Would it be useful to produce more minimaps with the same goal but with different environments? I am actually developing an agent that might be used in both the ForceField and HallucinIce maps, so at least for now it is worth a try for me, as both mini-maps are focused on the sentry unit. Some tests drove the agent to perform hallucination instead of force field: https://github.com/SoyGema/Startcraft_pysc2_minigames

-- Shall minimap designers focused on RL research build more general melee maps in order not to produce subgoals that might affect generalization? It depends on the goal. In this case there is special attention on the sentry unit, as it might have significant value in learning and in competing against other players in the short, mid and long term. In the short term, the unit provides defense for an army through force field or guardian shield, but in the mid to long term hallucination might trick the opponent into building a strategy around a different unit.

With all this in mind, if there is any misunderstanding please let me know. This issue might be ready to be closed.

google-deepmind / pysc2

Minigame map issues (MAP DESIGN) and Sentry Defense exploitation (THEORY that affects MAP DESIGN) #80
