chauvinfish opened this issue 6 years ago
Yes, I am happy to have help on this project, and I have invited you as a collaborator. The code I have so far is mostly from https://github.com/skjb/pysc2-tutorial. I used the Zerg tutorial (which is the most up-to-date example of the API) and included the Q-learning model he provided in the sparse-reward agent. I would recommend checking it out, as he includes tutorials explaining what he is doing step by step. Also, I'm happy to answer any questions you might have.
This is my first experience with ML (and I am still fairly new to github and programming in general). So, if there is anything I do that you notice is wrong, feel free to correct me!
I am developing a Zerg agent based on skjb's refined_phrase_agent. I chose Zerg because Zerg depends less on building placement (Protoss and Terran need to block their ramps with buildings), and Zerg can produce a great many units at once.
I would appreciate it if you could give me your contact address. For privacy considerations, you can send your email address or Telegram ID to my email address 'chauvinfish@gmail.com'.
I have seen your script. It seems that you abandoned the 'hot square' and 'green square' states in current_state. I don't think the way the original author handles them is very clever, but we need to figure out a way to add enemy positions into current_state, or our agent will always be blind and deaf.
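To make the idea concrete, here is a minimal sketch of one way to encode enemy positions into the state: split the minimap into quadrants and flag each quadrant that contains hostile units. The function name and the `_PLAYER_HOSTILE` value follow the pysc2 `player_relative` layer convention; the grid size is an assumption.

```python
import numpy as np

_PLAYER_HOSTILE = 4  # value for enemy units in the player_relative minimap layer (pysc2 convention)

def enemy_hot_squares(player_relative, grid=2):
    """Divide the minimap into grid x grid squares and mark which contain enemies.

    player_relative: 2-D array, e.g. obs.observation.feature_minimap.player_relative.
    Returns a flat list of 0/1 flags, one per square, row-major.
    """
    h, w = player_relative.shape
    ys, xs = (player_relative == _PLAYER_HOSTILE).nonzero()
    squares = [0] * (grid * grid)
    for y, x in zip(ys, xs):
        row = int(y * grid / h)
        col = int(x * grid / w)
        squares[row * grid + col] = 1
    return squares

# Toy 64x64 minimap with one enemy blob in the bottom-right quadrant:
minimap = np.zeros((64, 64), dtype=int)
minimap[50:55, 50:55] = _PLAYER_HOSTILE
print(enemy_hot_squares(minimap))  # → [0, 0, 0, 1]
```

Because the flags are coarse (one bit per quadrant), the state space stays small, which matters for a tabular Q-learner.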
And our agent needs to find the correct position for its second base. It's hard to locate the center of a mineral field with .mean(), since that gives you the mean coordinates of all the mineral fields on the whole map.
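One way around the .mean() problem is to group the mineral pixel coordinates into separate patches first, and only then take the mean of each patch. A rough sketch (the `max_gap` threshold is an assumption, and a real agent would read the coordinates from the unit_type layer):

```python
def mineral_clusters(xs, ys, max_gap=8):
    """Group mineral pixel coordinates into patches by simple distance-based grouping.

    xs, ys: coordinates of mineral pixels (e.g. taken from a unit_type layer mask).
    max_gap: pixels within this distance of a cluster member join that cluster.
    Returns one (center_x, center_y) per patch.
    """
    points = list(zip(xs, ys))
    clusters = []
    while points:
        cluster = [points.pop()]
        changed = True
        while changed:  # keep absorbing nearby points until the patch stops growing
            changed = False
            for p in points[:]:
                if any(abs(p[0] - q[0]) <= max_gap and abs(p[1] - q[1]) <= max_gap
                       for q in cluster):
                    cluster.append(p)
                    points.remove(p)
                    changed = True
        cx = sum(p[0] for p in cluster) / len(cluster)
        cy = sum(p[1] for p in cluster) / len(cluster)
        clusters.append((cx, cy))
    return clusters

# Two well-separated mineral patches:
xs = [10, 11, 12, 50, 51, 52]
ys = [10, 10, 11, 50, 51, 50]
print(sorted(mineral_clusters(xs, ys)))  # two centers, one near (11, 10.3) and one near (51, 50.3)
```

This is quadratic in the number of pixels, but mineral patches are small enough that it should not matter.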
The current bot is still a scripted bot rather than a true AI. I am spending whole days writing "smart action" descriptions, translating each action into the underlying operations. A true AI should output the underlying operations directly, if it can understand how the game works.
It seems that my Zerg agent has only learnt to create as many Overlords as possible :)
I'm not sure I understand hot_squares and green_squares not being in current_state. I have this code for the hot squares (and another section for the green):
```python
for i in range(0, 4):
    current_state[i + 4] = hot_squares[i]
```
Does this not work? I do agree that this could be handled better so that the agent has fewer states.
I've been trying to work on the agent finding the second base... I think I can have it check the minimap and look for the closest minerals nearby, and then move the camera to those mineral locations.
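To sketch the "closest minerals" part: once you have candidate patch centers on the minimap, picking the nearest unoccupied one is a small distance comparison. The coordinates and the occupied-radius threshold below are made-up illustration values.

```python
def nearest_expansion(base_xy, candidates):
    """Return the candidate minimap coordinate closest to our base (squared distance)."""
    bx, by = base_xy
    return min(candidates, key=lambda c: (c[0] - bx) ** 2 + (c[1] - by) ** 2)

base = (12, 16)
patches = [(12, 16), (20, 24), (44, 48)]
# Drop the patch we already occupy (assumed: within 5 minimap pixels of our base):
free = [p for p in patches if (p[0] - base[0]) ** 2 + (p[1] - base[1]) ** 2 > 25]
target = nearest_expansion(base, free)
print(target)  # → (20, 24)

# With pysc2 this would then become something like (sketch, not verified):
# return actions.FunctionCall(_MOVE_CAMERA, [[target[0], target[1]]])
```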
Do you mean that the bot is scripted since it has the 'sparse_agent_data' file? I'm not sure I understand how the agent would be able to learn through reinforcement without keeping track of a Q-learning table to save all of the states, actions, and rewards.
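For reference, the table the sparse agent keeps boils down to something like this minimal tabular Q-learning sketch; the class name, hyperparameter values, and string states are illustrative assumptions, not the tutorial's exact code.

```python
import random
from collections import defaultdict

class QTable:
    """Minimal tabular Q-learning, in the spirit of the sparse agent's QLearningTable."""

    def __init__(self, actions, lr=0.01, gamma=0.9, epsilon=0.9):
        self.actions = actions          # list of action indices
        self.lr = lr                    # learning rate
        self.gamma = gamma              # discount factor
        self.epsilon = epsilon          # probability of acting greedily
        self.q = defaultdict(float)     # (state, action) -> estimated value

    def choose(self, state):
        if random.random() < self.epsilon:
            return max(self.actions, key=lambda a: self.q[(state, a)])
        return random.choice(self.actions)  # explore

    def learn(self, s, a, reward, s_next):
        best_next = max(self.q[(s_next, a2)] for a2 in self.actions)
        target = reward + self.gamma * best_next
        self.q[(s, a)] += self.lr * (target - self.q[(s, a)])

qt = QTable(actions=[0, 1, 2])
qt.learn('s0', 1, 1.0, 's1')
print(qt.q[('s0', 1)])  # → 0.01
```

The saved 'sparse_agent_data' file is just this table pickled between episodes, so the learning itself is still reinforcement learning; the "scripted" part is only the hand-written mapping from smart actions to raw operations.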
Also, are there any ML books (in English) that you would recommend I read for this project?
```
1/move_camera (1/minimap [64, 64])
2/select_point (6/select_point_act [4]; 0/screen [84, 84])
3/select_rect (7/select_add [2]; 0/screen [84, 84]; 2/screen2 [84, 84])
4/select_control_group (4/control_group_act [5]; 5/control_group_id [10])
5/select_unit (8/select_unit_act [4]; 9/select_unit_id [500])
6/select_idle_worker (10/select_worker [4])
7/select_army (7/select_add [2])
8/select_warp_gates (7/select_add [2])
9/select_larva ()
10/unload (12/unload_id [500])
11/build_queue (11/build_queue_id [10])
12/Attack_screen (3/queued [2]; 0/screen [84, 84])
13/Attack_minimap (3/queued [2]; 1/minimap [64, 64])
14/Attack_Attack_screen (3/queued [2]; 0/screen [84, 84])
15/Attack_Attack_minimap (3/queued [2]; 1/minimap [64, 64])
16/Attack_AttackBuilding_screen (3/queued [2]; 0/screen [84, 84])
17/Attack_AttackBuilding_minimap (3/queued [2]; 1/minimap [64, 64])
18/Attack_Redirect_screen (3/queued [2]; 0/screen [84, 84])
19/Scan_Move_screen (3/queued [2]; 0/screen [84, 84])
20/Scan_Move_minimap (3/queued [2]; 1/minimap [64, 64])
```
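Each entry in that dump is a function id/name followed by its typed argument slots and their sizes (e.g. 13/Attack_minimap takes a queued flag from [2] and a minimap coordinate from [64, 64]). If it helps, the format can be parsed mechanically; this small sketch assumes exactly the line format shown in the dump:

```python
import re

def parse_spec(line):
    """Parse one line of the action-spec dump into (id, name, [(arg_id, arg_name, sizes)])."""
    m = re.match(r"(\d+)/(\w+)\s*\((.*)\)", line)
    fn_id, name, argstr = int(m.group(1)), m.group(2), m.group(3)
    args = []
    for part in filter(None, (p.strip() for p in argstr.split(";"))):
        am = re.match(r"(\d+)/(\w+)\s*\[([\d,\s]*)\]", part)
        sizes = [int(s) for s in am.group(3).split(",")]
        args.append((int(am.group(1)), am.group(2), sizes))
    return fn_id, name, args

print(parse_spec("13/Attack_minimap (3/queued [2]; 1/minimap [64, 64])"))
# → (13, 'Attack_minimap', [(3, 'queued', [2]), (1, 'minimap', [64, 64])])
```

In pysc2 itself the same information is available programmatically via `actions.FUNCTIONS`, so parsing the printed dump is only useful for eyeballing.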
```python
smart_actions = [
    ACTION_DO_NOTHING,
    ACTION_SELECT_PROBE,
    ACTION_BUILD_PYLON,
    ACTION_BUILD_GATEWAY,
    ACTION_SELECT_GATEWAY,
    ACTION_BUILD_ZEALOT,
    ACTION_SELECT_ARMY,
    ACTION_ATTACK,
]
```
There is currently no RL book for SC2, because not many people are using pysc2 as a deep learning environment.
If there is an RL book in Chinese on such an unpopular subject, it must have been translated into Chinese from an English original.
Oh ok, I think I understand. So would something like this be better to replace all of the elif statements at the end?
```python
smart_action = random.choice(actions.FUNCTIONS)
if self.can_do(obs, smart_action.id):
    return actions.FunctionCall(smart_action.id, args)  # args still need to be filled in per function
```
This would seem to pass all the possible functions, so then we would just need to add the correct arguments to each choice.
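One naive way to "add the correct arguments to each choice" is to sample a valid value for every argument slot from its size spec, which is how exploration is often bootstrapped in pysc2 agents. A sketch (the example sizes are those of Attack_minimap from the spec dump):

```python
import random

def random_args(arg_sizes):
    """Sample one structurally valid value per argument slot.

    arg_sizes: one size list per argument, e.g. Attack_minimap -> [[2], [64, 64]]
    (a queued flag, then a minimap coordinate). Each value is drawn below its size,
    so spatial args become a full [x, y] point.
    """
    return [[random.randrange(s) for s in sizes] for sizes in arg_sizes]

args = random_args([[2], [64, 64]])
print(args)  # e.g. [[1], [48, 33]] -- a queued flag plus a minimap point
# actions.FunctionCall(fn_id, args) would then be a structurally valid call.
```

Structurally valid is not the same as sensible, though, which is the point about the elif statements below: random choices throw away the hand-written game logic.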
Yeah, I figured there was nothing for SC2. I was just wondering if there were any books in general that you would recommend. I do have a couple of books that go over RL in Python that I will probably start looking through.
I successfully implemented an RL agent using a fully convolutional network and the A3C algorithm: https://github.com/xhujoy/pysc2-agents. The author is Chinese too, so I can contact him. He stopped working on it after open-sourcing it, so the project can't run with the newest pysc2, but I modified it and made it run.
This agent can mimic a real human player and output direct actions. So far, such agents can only play minigames like MoveToBeacon and DefeatRoaches; they are incapable of playing full 1v1 games. However, this is a more promising future than scripted bots: people expect an agent to play the SC2 game rather than print out a '16 marine drop' timetable.
Still, it would be incredible if an AI learnt to print a '16 marine drop' timetable.
I suggest that SC2 has two kinds of game logic. Macro logic: e.g. choosing an Oracle harassment or a 4-Gateway all-in. Micro logic: how to control groups and single units, cast spells, etc. An agent should learn both.
Those elif statements represent the game logic; if you replace them with random.choice(), the agent will become much more stupid.
I forked the project and uploaded my modified code. You can download it and have fun: https://github.com/chauvinfish/pysc2-agents
I am a university student in China, and I am also interested in ML. I am not very experienced; it's my first time playing with pysc2, but I do have some basic concepts and debugging skills…