histmeisah / Large-Language-Models-play-StarCraftII

TextStarCraft2, a pure-language environment that supports LLMs playing StarCraft II

questions about textsc2 env #14

Closed hjh0119 closed 1 month ago

hjh0119 commented 1 month ago

Awesome work!

I have reviewed the paper and have several questions that I hope can be addressed.

  1. What is the difference between textsc2 and python-sc2? The paper mentions the Observation-to-Text Adapter and the Text-to-Action Adapter, but there is no detailed explanation provided.

  2. What is the difference between a real-time agent and a non-real-time agent? I reviewed the code, and it seems that the distinction lies in whether the real-time agent constructs an L2 prompt. If this is the case, does the paper's claim that the LLM agent can achieve the skill level of a gold-ranked human player refer to the real-time agent?

  3. Following up on the second point, are the experimental results in the paper based on the non-real-time agent?

histmeisah commented 1 month ago

Thank you for your review and questions. Below are the answers to your queries:

  1. The python-sc2 library provides the foundational codebase, along with several examples that helped in building this framework. TextSC2 can therefore be considered an extension of python-sc2. I also received guidance from Burnysc2, the author of python-sc2, who has extensive knowledge in this area.

  2. The Observation-to-Text Adapter converts game metadata into macro information typically used by both language models and humans.

The Text-to-Action Adapter similarly transforms these language descriptions into specific game actions.

O2T process: GameData -> metadata -> language information (StarCraft2 -> python-sc2 -> TextSC2)

T2A process: language description -> python script -> game operation (TextSC2 -> python-sc2 -> StarCraft2)
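
As a rough illustration of the two adapters, here is a minimal sketch in plain Python. It is not the repository's code: the `Observation` fields, the `ACTIONS` table, and both function names are hypothetical stand-ins, and the real adapters work on the data and commands that python-sc2 exposes.

```python
from dataclasses import dataclass


# Hypothetical raw observation, standing in for metadata read through python-sc2.
@dataclass
class Observation:
    minerals: int
    vespene: int
    supply_used: int
    supply_cap: int
    workers: int


def observation_to_text(obs: Observation) -> str:
    """O2T: summarize raw numbers as a macro-level description for the LLM."""
    return (
        f"Minerals: {obs.minerals}, gas: {obs.vespene}. "
        f"Supply: {obs.supply_used}/{obs.supply_cap}. "
        f"Workers: {obs.workers}."
    )


# Hypothetical action vocabulary; the real action set is defined by TextSC2.
ACTIONS = {
    "TRAIN PROBE": "train_probe",
    "BUILD PYLON": "build_pylon",
}


def text_to_action(reply: str) -> list[str]:
    """T2A: pick known action phrases out of the LLM reply, in order of appearance."""
    upper = reply.upper()
    found = [
        (upper.find(phrase), handler)
        for phrase, handler in ACTIONS.items()
        if phrase in upper
    ]
    return [handler for _, handler in sorted(found)]
```

In this sketch the handler names would then be dispatched to scripts that issue the actual game commands, which corresponds to the "python script -> game operation" step of the T2A process.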

  3. There is no inherent difference between the two agents apart from the parameter that controls the real-time setting. If it is set to True, the game runs in real time.

As mentioned in our paper, the Human vs. LLM Agent tests used a fine-tuned model, comparable to qwen2-7b and other open-source LLMs, and we have released these datasets and models. The human tests were therefore conducted under real-time conditions.
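
To make the practical effect of the flag concrete, here is a toy model of the two modes. It is an assumption-laden sketch rather than the repository's loop: `run_episode` and `think_frames` are hypothetical, and the only thing it models is whether the game clock keeps running while the model is producing an answer.

```python
def run_episode(total_frames: int, think_frames: int, realtime: bool) -> list[int]:
    """Return the game frames at which the agent's decisions take effect.

    think_frames is the (hypothetical) number of frames that one LLM call
    would span if the game were running at normal speed.
    """
    decision_frames = []
    frame = 0
    while frame < total_frames:
        if realtime:
            # The game keeps running while the model thinks.
            frame += think_frames
        else:
            # The game waits for the model, then advances a single step.
            frame += 1
        if frame <= total_frames:
            decision_frames.append(frame)
    return decision_frames
```

With `realtime=False` the agent gets a decision on every step, while with `realtime=True` slow inference means fewer decisions per game, which is why a smaller fine-tuned model is a natural fit for the human tests.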

  4. Skill within the gold rank in StarCraft II varies significantly. Some players perform poorly, while others play reasonably well.
  5. Someone has improved my method by enhancing the prompts and expert knowledge.
     Video link: https://www.bilibili.com/video/BV1uz42187EF/?vd_source=0553fe84b5ad759606360b9f2e687a01#reply226628737648
     GitHub link: https://github.com/luchang1113/HEP-LLM-play-StarCraftII

hjh0119 commented 1 month ago

Thank you for your detailed response.