sotopia-lab / sotopia-pi

Sotopia-π: Interactive Learning of Socially Intelligent Language Agents (ACL 2024)
https://pi.sotopia.world/
Apache License 2.0

[FEAT]: Convert episode log data to standard Sotopia prompts + convert into a general format for the following stage #28

Closed lwaekfjlk closed 1 year ago

lwaekfjlk commented 1 year ago

Description

No response

Additional Information

No response

sharonwx54 commented 1 year ago

This has been implemented successfully, though many of the templates had to be hardcoded. The detailed code lives in https://github.com/XuhuiZhou/sotopia/tree/feature/log-filtering-for-ft, and we are considering moving it into this repo.

The general idea is that a complete dialogue with x turns generates x/2 data points (stored as JSON files). We pick the second speaker as the agent whose next utterance we want to predict, because the first utterance of the dialogue must always be given. Then, for each turn in which the second speaker speaks, we use that utterance as the result and all of the conversation before it as the prompt. The prompt accumulates, growing with each new utterance. The "result", which is the utterance to be predicted, is always by the same speaker. Note that the context shows only the second speaker's secret and goal; the other speaker's goal appears as "Unknown".
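The scheme described above can be sketched roughly as follows. This is a minimal illustration, not the code in the linked branch: the function name `dialogue_to_datapoints`, the `header` argument (standing in for the hardcoded instruction and context template), and the simplified turn formatting are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def dialogue_to_datapoints(
    header: str,
    turns: List[Tuple[str, str]],
    model: str = "gpt-4",
) -> List[Dict[str, str]]:
    """Build one (prompt, result) pair per turn spoken by the second speaker.

    `header` is a hypothetical, pre-rendered context string (scenario,
    backgrounds, and only the second speaker's secret and goal).
    `turns` is an ordered list of (speaker, utterance) pairs.
    """
    if len(turns) < 2:
        return []  # the second speaker never speaks, so there is nothing to predict

    target = turns[1][0]  # the second speaker is the agent to predict
    history: List[str] = []
    datapoints: List[Dict[str, str]] = []

    for idx, (speaker, utterance) in enumerate(turns):
        if speaker == target:
            # Prompt = context + every earlier turn; result = this utterance.
            prompt = header + "".join(history) + f"You are at Turn #{idx}."
            datapoints.append({"model": model, "prompt": prompt, "result": utterance})
        # The history accumulates, so later prompts include this turn as well.
        history.append(f'Turn #{idx}: {speaker} said: "{utterance}"\n\n')

    return datapoints
```

With a four-turn dialogue this yields two data points: the first prompt contains only Turn #0, and the second contains Turns #0 through #2.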

An example JSON data point is shown below:

{ "model": "gpt-4", "prompt": "Prompt after formatting:\nImagine you are William Brown, your task is to act/speak as William Brown would, keeping in mind William Brown's social goal.\nYou can find William Brown's background and goal in the 'Here is the context of the interaction' field.\nNote that William Brown's secret and goal is only visible to you.\nYou should try your best to achieve William Brown's goal in a way that align with their character traits.\nAdditionally, maintaining the conversation's naturalness and realism is essential (e.g., do not repeat what other people has already said before).\n\nHere is the context of this interaction:\nScenario: Two friends who are about to go on their first romantic date\nParticipants: Mia Davis and William Brown\nMia Davis's background: Mia Davis is a 50-year-old female high school principal. She/her pronouns. Mia Davis has two cats. Personality and values description: Mia Davis, an extraverted stickler for routines, values tradition and authority. Her decision-making style is decisive and direct. \nWilliam Brown's background: William Brown is a 35-year-old male chef. He/him pronouns. William Brown loves exploring the food scene in his city and trying out new recipes at home. Personality and values description: William Brown is an ambitious and extraverted individual. Methodical and organized, he values achievement and displays strategic decision-making. William's secrets: Sabotaged a competitor's dish to win a cooking contest\nMia Davis's goal: Unknown\nWilliam Brown's goal: Keep the friendship intact while exploring the possibility of a romantic relationship (Extra information: You value your friendship and you don't want to lose it just in case the romantic relationship does not work out)\nConversation Starts:\n\nTurn #0: Mia Davis said: \"William, I'm really excited about our date tonight. 
It's great to take a break from the daily routine and spend a special moment together, wouldn't you agree?\"\n\nTurn #1: William Brown said: \"Absolutely, Mia. Experiencing something new can spice up our routine. Besides, it'll be really fun experiencing this together, keeping the heart of our friendship alive as we explore these new spaces. I value our bond deeply and wouldn't want anything to change that.\"\n\nTurn #2: Mia Davis said: \"You're right, William. What we have is special and it should never change. But if I'm being honest here, I've felt something more for quite some time now. And today feels like the perfect opportunity to let you know. I value our friendship, but I also cherish the budding love that I have for you. Could we, perhaps, explore that together?\"\n.\nYou are at Turn #3. Your available action types are\nnone action speak non-verbal communication leave.\nNote: You can \"leave\" this conversation if 1. you have achieved your social goals, 2. this conversation makes you uncomfortable, 3. you find it uninteresting/you lose your patience, 4. or for other reasons you want to leave.\n\nPlease only generate a JSON string including the action type and the argument.\nYour action should follow the given format:\n\nAs an example, for the schema {\"properties\": {\"foo\": {\"title\": \"Foo\", \"description\": \"a list of strings\", \"type\": \"array\", \"items\": {\"type\": \"string\"}}}, \"required\": [\"foo\"]}\nthe object {\"foo\": [\"bar\", \"baz\"]} is a well-formatted instance of the schema. 
The object {\"properties\": {\"foo\": [\"bar\", \"baz\"]}} is not well-formatted.\n\nHere is the output schema:\n\n{\"description\": \"An interface for messages.\\nThere is only one required method: to_natural_language\", \"properties\": {\"action_type\": {\"title\": \"Action Type\", \"description\": \"whether to speak at this turn or choose to not do anything\", \"enum\": [\"none\", \"speak\", \"non-verbal communication\", \"action\", \"leave\"], \"type\": \"string\"}, \"argument\": {\"title\": \"Argument\", \"description\": \"the utterance if choose to speak, the expression or gesture if choose non-verbal communication, or the physical action if choose action\", \"type\": \"string\"}}, \"required\": [\"action_type\", \"argument\"]}\n\u001b[0m\n", "result": "{'action_type': 'speak', 'argument': \"Mia, your honesty inspires and overwhelms me, in the best possible way. I really cherish our friendship, that's something I want us to protect. But I also acknowledge the feelings that are more than just friendship. Let's explore this together while keeping our bonds strong. We can navigate this new avenue at our own pace, supporting, and being there for one another. What do you think?\"}" }
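One detail worth noting for the following stage: the "result" value in the example above looks like a Python dict literal (single-quoted keys) rather than strict JSON, so `json.loads` would reject it. A minimal sketch of reading it back with the standard library is shown here; the helper name `parse_result` is hypothetical, and the assumption that every data point stores its result this way is based only on this single example.

```python
import ast
from typing import Dict


def parse_result(result: str) -> Dict[str, str]:
    """Parse a 'result' string written as a Python dict literal."""
    # ast.literal_eval only evaluates literals, so it does not execute code.
    action = ast.literal_eval(result)
    if not isinstance(action, dict):
        raise ValueError("result is not a dict literal")
    # The output schema in the prompt requires both of these fields.
    missing = {"action_type", "argument"} - set(action)
    if missing:
        raise ValueError(f"result is missing fields: {sorted(missing)}")
    return action


# Shortened stand-in for the result string in the example data point.
example = "{'action_type': 'speak', 'argument': \"Let's explore this together.\"}"
parsed = parse_result(example)
```

Here `parsed["action_type"]` is `"speak"` and `parsed["argument"]` holds the utterance text.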

sharonwx54 commented 1 year ago

The reverse-engineering code is done. The next step is to move all of the code from the Sotopia repo to this repo.