Closed paehal closed 4 months ago
Very glad to hear you're finding Concordia useful!
Here are a couple things to keep in mind:
As for your question about whether it's best to use a python grounded variable for the coordinates versus a component like player_status, it really depends on what you want to do. If the coordinates are the critical part of the simulation, and you can't afford the possibility of hallucination with them, then it's better to store them in a python grounded variable and be very careful how you set their values. Ultimately the difference comes down to whether you get the value of the variable via a multiple choice question or via a free response question. Either way you can include as much reasoning or code as you like to ensure the answer is correct.
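A minimal sketch of what "store them in a python grounded variable and be very careful how you set their values" could look like. Everything here (GridPositions, MOVES, apply_move) is a hypothetical illustration and not part of Concordia. The idea is that the model would only ever select a move from a fixed menu, while plain Python performs the coordinate arithmetic.

```python
from dataclasses import dataclass

# Hypothetical names throughout; none of this is Concordia API.
MOVES = {
    "north": (0, 1),
    "south": (0, -1),
    "east": (1, 0),
    "west": (-1, 0),
    "stay": (0, 0),
}


@dataclass
class GridPositions:
    """Holds agent coordinates as plain Python values (the grounded variable)."""

    width: int
    height: int
    positions: dict[str, tuple[int, int]]

    def apply_move(self, agent: str, move: str) -> tuple[int, int]:
        # Only moves from the fixed menu are accepted, so free-form model
        # output can never write an arbitrary coordinate.
        if move not in MOVES:
            raise ValueError(f"unknown move: {move!r}")
        dx, dy = MOVES[move]
        x, y = self.positions[agent]
        # Clamp to the room so the stored state always stays valid.
        new_position = (
            min(max(x + dx, 0), self.width - 1),
            min(max(y + dy, 0), self.height - 1),
        )
        self.positions[agent] = new_position
        return new_position


grid = GridPositions(width=10, height=10, positions={"Agent 1": (0, 0)})
grid.apply_move("Agent 1", "east")
grid.apply_move("Agent 1", "north")
print(grid.positions["Agent 1"])
```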
Thank you for your response. I appreciate you clarifying the key points I should be mindful of. I will also look into the updates related to the cyberball task in the future.
In response to your comment, I believe the answer is Yes. Specifically, I am interested in having two agents cooperate to pick up two balls, a task known as simple spread, which is a basic task in the MARL (Multi-Agent Reinforcement Learning) field. Therefore, I think it is crucial to accurately describe the position of the agents after they move as the next event.
If the coordinates are the critical part of the simulation, and you can't afford the possibility of hallucination with them, then it's better to store them in a python grounded variable and be very careful how you set their values.
I am currently undecided on whether to implement grounding for coordinate information or to add a component. (I am not yet at a level where I can make that decision and would appreciate some advice.)
1. If I choose to implement grounding for the coordinate information, should I integrate that implementation into the part of game_master.py where the event_statement output is generated?
2. I am considering creating components like ball_status.py, similar to the cyberball task, to organize information such as the current position of the agents. Would this approach be practical?
Additionally, regarding the implementation of option 2, I have added a new component to the definition of the game master, as shown below, but it seems that this module is not updated in env.step(). Why might this be? Are there other points that need to be implemented?
# @title Create the game master object
env = game_master.GameMaster(
    model=model,
    memory=game_master_memory,
    clock=clock,
    players=players,
    update_thought_chain=thought_chain,
    components=[
        instructions,
        general_knowledge_of_premise,
        important_facts,
        rules_of_the_game,
        relevant_events,
        time_display,
        player_status,
        ball_status_component,
        **my new component**,
        convo_externality,
        direct_effect_externality,
    ],
    randomise_initiative=True,
    player_observes_event=False,
    players_act_simultaneously=False,
    verbose=True,
)
I apologize for the inconvenience and thank you in advance for your response.
It sounds like for your use case it might also be useful to look at the election and inventory components. They do things a bit differently from ball_status. It's not totally clear which approach would be best for your case. If you do look at ball_status, make sure you have the latest version of the code since it changed quite recently, and I believe the previous version had a bug.
For grounded variables, they could be implemented by modifying the game master as you point out. However, that's not the recommended way. The way we have intended it to work is for all grounding to be implemented via components. That's what the inventory and elections components do. They ask a series of yes/no and multiple choice questions in order to set the values of grounded variables.
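As a rough illustration of grounding through a component, the sketch below asks one multiple choice question per agent after each event and then updates the coordinates in Python. It is an assumption-laden stand-in: PositionComponent, its method names (name, state, update_after_event), and the Chooser callable are hypothetical and are not taken from the inventory or election components. In a real run the chooser would be backed by the language model; here a toy keyword matcher plays that role so the example runs on its own.

```python
from typing import Callable, Sequence

# A chooser takes a question and a list of options and returns the index of
# the selected option. Hypothetical stand-in for a model-backed call.
Chooser = Callable[[str, Sequence[str]], int]

DIRECTIONS = ("north", "south", "east", "west", "stay")
DELTAS = {
    "north": (0, 1),
    "south": (0, -1),
    "east": (1, 0),
    "west": (-1, 0),
    "stay": (0, 0),
}


class PositionComponent:
    """Hypothetical component; the method names are assumptions."""

    def __init__(self, choose: Chooser, start: dict[str, tuple[int, int]]):
        self._choose = choose
        self._positions = dict(start)

    def name(self) -> str:
        return "Agent positions"

    def state(self) -> str:
        # Text a game master could read when composing the next event.
        return "\n".join(
            f"{agent} is at {position}"
            for agent, position in self._positions.items()
        )

    def update_after_event(self, event: str) -> None:
        for agent, (x, y) in list(self._positions.items()):
            question = f"Event: {event}\nWhich direction did {agent} move?"
            # The chooser only picks an index; Python does the arithmetic.
            index = self._choose(question, DIRECTIONS)
            dx, dy = DELTAS[DIRECTIONS[index]]
            self._positions[agent] = (x + dx, y + dy)


def keyword_chooser(question: str, options: Sequence[str]) -> int:
    """Toy chooser: first option mentioned in the question, else the last."""
    lowered = question.lower()
    for index, option in enumerate(options):
        if option in lowered:
            return index
    return len(options) - 1


component = PositionComponent(keyword_chooser, {"Alice": (2, 3)})
component.update_after_event("Alice walks east toward the ball.")
print(component.state())
```

Because the model's answer is restricted to an index into DIRECTIONS, a wrong answer can move an agent in the wrong direction, but it cannot produce a malformed or out-of-menu coordinate.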
As for why your custom component is not updating, that sounds odd to me. If you are passing it there in the list of GM components then update should be getting called. Here is a link to the line where it happens. Maybe it is getting called but logs are getting swallowed by the multithreading? You could try replacing the multithreading inside the update_components function in the GM with the equivalent for loop calling component.update() on each component in self._components one at a time. That might make it easier to debug anyway.
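The debugging suggestion above can be sketched in isolation. The code below does not reproduce Concordia's update_components function; DemoComponent and both helper functions are hypothetical. It only shows a general property of Python's concurrent.futures: an exception raised inside a task submitted to a ThreadPoolExecutor stays inside its future until result() or exception() is inspected, whereas a plain for loop surfaces it immediately.

```python
import concurrent.futures


class DemoComponent:
    """Hypothetical component whose update() can be made to fail."""

    def __init__(self, name: str, fail: bool = False):
        self._name = name
        self._fail = fail
        self.updated = False

    def name(self) -> str:
        return self._name

    def update(self) -> None:
        if self._fail:
            raise RuntimeError(f"{self._name} failed to update")
        self.updated = True


def update_components_threaded(components):
    """Runs every update() in a thread pool and returns the futures."""
    with concurrent.futures.ThreadPoolExecutor() as pool:
        # An exception raised in a worker is stored on its future and is
        # not re-raised here unless result() is called.
        futures = [pool.submit(c.update) for c in components]
    return futures


def update_components_sequential(components):
    """Runs every update() one at a time so failures surface at once."""
    for component in components:
        print(f"updating {component.name()}")
        component.update()


components = [DemoComponent("positions"), DemoComponent("broken", fail=True)]
futures = update_components_threaded(components)  # returns without raising
for component, future in zip(components, futures):
    print(component.name(), "->", future.exception())
```

Calling update_components_sequential(components) on the same list raises the RuntimeError at the failing component, which is usually easier to trace than a silently failed future.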
@jzleibo
I've been experimenting with various approaches over the past week. I would like to clarify that the error I mentioned earlier was due to a mistake in my implementation. My apologies for any confusion caused.
Regarding your suggestion:
For grounded variables, they could be implemented by modifying the game master as you point out. However, that's not the recommended way. The way we have intended it to work is for all grounding to be implemented via components. That's what the inventory and elections components do. They ask a series of yes/no and multiple choice questions in order to set the values of grounded variables.
Following your advice, I created a new component to manage aspects like the position and orientation of balls and agents. This has allowed me to achieve the desired behavior to some extent.
However, this has led to some questions:
Partial Observations for Each Agent: Are partial observations considered for each agent? It is mentioned in the technical report that the Game Master (GM) returns observations relevant to each agent. Can the GM return results tailored to the state of each agent? For instance, if there is something that agent A is unaware of but agent B knows, would the event (observation) resulting from A's action not presuppose the knowledge that B has?
Setting Agents' Objectives: Should the goal variable in the player configuration be used to set the objectives for the agents?
I would greatly appreciate your feedback on these points.
I am currently working on a task using Concordia where two agents move around a two-dimensional XY coordinate space based on their chosen actions, to pick up a ball located in a room.
I am actually implementing this task based on the example of the cyberball task I've provided. However, I have a question regarding the output of the event_statement following an attempted_action.
In game_master.py, the event_statement is produced as the result of an action attempt:
For this task, instead of using chain_of_thought for output, I want to output the physical movement results in terms of coordinates. (e.g., Agent 1 moved +5 in the x-coordinate and +2 in the y-coordinate, so the current coordinates of Agent 1 are (5,2)). The default output using chain_of_thought seems to trace the thought process of the agents, which I believe does not suit my task. Would it be better to implement this using a component like ball_status.py that holds information about the ball, similar to the cyberball task?
I am not very familiar with Concordia and would appreciate any advice.
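For reference, the kind of coordinate-based event text described above can be produced deterministically once the move is known. The helper below is a hypothetical sketch (describe_move is not a Concordia function) that computes the new position and renders a sentence in the style of the example in this post.

```python
def describe_move(
    agent: str,
    start: tuple[int, int],
    delta: tuple[int, int],
) -> tuple[tuple[int, int], str]:
    """Returns the new position and a sentence describing the move."""
    x, y = start
    dx, dy = delta
    end = (x + dx, y + dy)
    # The "+d" format spec always shows the sign, e.g. +5 or -2.
    text = (
        f"{agent} moved {dx:+d} in the x-coordinate and {dy:+d} in the "
        f"y-coordinate, so the current coordinates of {agent} are {end}."
    )
    return end, text


position, event_text = describe_move("Agent 1", start=(0, 0), delta=(5, 2))
print(event_text)
```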