Hi,
Thanks for the interest! Feel free to ask any follow-up questions.
In general, we found that summarization helps: it condenses the agents' responses into a more concise message. I've attached the code below; we use ChatGPT as the summarization model.
import openai
import json
import numpy as np
import time
import pickle
from tqdm import tqdm
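def parse_bullets(sentence):
    # Split a bulleted reply into items, dropping any leading bullet markers.
    bullets_preprocess = sentence.split("\n")
    bullets = []
    for bullet in bullets_preprocess:
        try:
            # Position of the first alphabetic character on the line.
            idx = bullet.find(next(filter(str.isalpha, bullet)))
        except StopIteration:
            # The line contains no letters, so skip it.
            continue
        bullet = bullet[idx:]
        if len(bullet) != 0:
            bullets.append(bullet)
    return bullets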
def filter_people(person):
    people = person.split("(")[0]
    return people
def generate_answer(answer_context):
    # Uses the legacy (pre-1.0) openai SDK interface.
    try:
        completion = openai.ChatCompletion.create(
            model="gpt-3.5-turbo-0301",
            messages=answer_context,
            n=1)
    except Exception:
        # Wait, then retry. Note that this retries indefinitely.
        print("retrying due to an error......")
        time.sleep(20)
        return generate_answer(answer_context)
    return completion
def summarize_message(agent_contexts):
    # Build one prompt containing every agent's latest reply and ask for a summary.
    prefix_string = "Here are a list of opinions from different agents: "
    for agent in agent_contexts:
        agent_response = agent[-1]["content"]
        response = "\n\n One agent response: ```{}```".format(agent_response)
        prefix_string = prefix_string + response
    prefix_string = prefix_string + "\n\n Write a summary of the different opinions from each of the individual agent."
    agent_context = [{"role": "user", "content": prefix_string}]
    completion = generate_answer(agent_context)
    content = completion["choices"][0]["message"]["content"]
    return content
def construct_message(summary, question):
    # `question` is accepted for interface compatibility, but the prompt does not use it.
    prefix_string = "Here is a summary of responses from other agents: {}".format(summary)
    prefix_string = prefix_string + "\n\n Use these opinions carefully as additional advice, can you provide an updated answer? Make sure to state your answer at the end of the response."
    # prefix_string = prefix_string + "\n\n Using these opinions, can you provide an updated answer? Make sure to state your answer at the end of the response."
    return {"role": "user", "content": prefix_string}
def construct_assistant_message(completion):
    content = completion["choices"][0]["message"]["content"]
    return {"role": "assistant", "content": content}
def parse_answer(sentence):
    parts = sentence.split(" ")
    # Sequentially parse for the last number in the sentence
    for part in parts[::-1]:
        try:
            answer = float(part)
            return answer
        except ValueError:
            continue
    # Falls through and returns None when no token parses as a number.
def most_frequent(List):
    # Majority vote: return the most common element (raises IndexError on an empty list).
    counter = 0
    num = List[0]
    for i in List:
        current_frequency = List.count(i)
        if current_frequency > counter:
            counter = current_frequency
            num = i
    return num
if __name__ == "__main__":
    # Quick sanity check of parse_answer on a sample reply.
    answer = parse_answer("My answer is the same as the other agents and AI language model: the result of 12+28*19+6 is 550.")

    agents = 4
    rounds = 2
    np.random.seed(0)

    evaluation_round = 100
    scores = []
    generated_description = {}

    for trial in tqdm(range(evaluation_round)):
        # Convert to plain Python ints so the results stay JSON-serializable.
        a, b, c, d, e, f = np.random.randint(0, 30, size=6).tolist()
        answer = a + b * c + d - e * f
        agent_contexts = [[{"role": "user", "content": """What is the result of {}+{}*{}+{}-{}*{}? Make sure to state your answer at the end of the response.""".format(a, b, c, d, e, f)}] for agent in range(agents)]
        content = agent_contexts[0][0]['content']
        question_prompt = "We seek to find the result of {}+{}*{}+{}-{}*{}?".format(a, b, c, d, e, f)

        for debate_round in range(rounds):
            for i, agent_context in enumerate(agent_contexts):
                if debate_round != 0:
                    # Non-summarized alternative: pass the other agents' full contexts.
                    # agent_contexts_other = agent_contexts[:i] + agent_contexts[i+1:]
                    # message = construct_message(agent_contexts_other, question_prompt)
                    summary = summarize_message(agent_contexts)
                    message = construct_message(summary, question_prompt)
                    agent_context.append(message)
                    print("message: ", message)

                completion = generate_answer(agent_context)
                # try:
                #     completion = openai.ChatCompletion.create(
                #         model="gpt-3.5-turbo-0301",
                #         messages=agent_context,
                #         n=1)
                # except:
                #     time.sleep(20)
                #     completion = openai.ChatCompletion.create(
                #         model="gpt-3.5-turbo-0301",
                #         messages=agent_context,
                #         n=1)
                assistant_message = construct_assistant_message(completion)
                agent_context.append(assistant_message)
                print(completion)

        # Collect each agent's final numeric answer.
        text_answers = []
        for agent_context in agent_contexts:
            text_answer = agent_context[-1]['content']
            text_answer = text_answer.replace(",", ".")
            text_answer = parse_answer(text_answer)
            if text_answer is None:
                continue
            text_answers.append(text_answer)
            # print("text_answer: ", text_answer, answer)
            # if text_answer == answer:
            #     scores.append(1)
            # else:
            #     scores.append(0)

        # JSON object keys must be strings, so the operand tuple is stringified.
        generated_description[str((a, b, c, d, e, f))] = (agent_contexts, answer)

        try:
            # Score the majority answer across agents.
            text_answer = most_frequent(text_answers)
            if text_answer == answer:
                scores.append(1)
            else:
                scores.append(0)
        except IndexError:
            # No agent produced a parseable answer for this problem.
            continue

        # Running accuracy and its standard error.
        print("performance:", np.mean(scores), np.std(scores) / (len(scores) ** 0.5))

    with open("summarize_math_{}_{}.json".format(agents, rounds), "w") as out_file:
        json.dump(generated_description, out_file)
    # pickle.dump(generated_description, open("math_short_agents{}_rounds{}.p".format(agents, rounds), "wb"))

    # Drop into the debugger to inspect the final state.
    import pdb
    pdb.set_trace()
    print(answer)
    print(agent_context)
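For anyone who wants to follow the control flow without an API key, here is a minimal offline sketch of the summarize-then-update loop from the script above. It is not the authors' exact code: the `ChatFn` type, the `debate` helper, and `fake_chat` are hypothetical names introduced only for illustration, and the model call is replaced by a stub. The prompt strings are copied from the script.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
# Hypothetical interface: takes a chat history, returns the reply text.
ChatFn = Callable[[List[Message]], str]


def summarize(chat: ChatFn, agent_contexts: List[List[Message]]) -> str:
    """Ask the model for one summary of every agent's latest message."""
    prompt = "Here are a list of opinions from different agents: "
    for context in agent_contexts:
        prompt += "\n\n One agent response: ```{}```".format(context[-1]["content"])
    prompt += "\n\n Write a summary of the different opinions from each of the individual agent."
    return chat([{"role": "user", "content": prompt}])


def debate(chat: ChatFn, question: str, agents: int = 3, rounds: int = 2) -> List[List[Message]]:
    """Run the debate; after round 0 each agent sees a summary before answering."""
    contexts = [[{"role": "user", "content": question}] for _ in range(agents)]
    for round_idx in range(rounds):
        for context in contexts:
            if round_idx != 0:
                # As in the script, the summary is recomputed for every agent.
                summary = summarize(chat, contexts)
                context.append({
                    "role": "user",
                    "content": "Here is a summary of responses from other agents: {}"
                               "\n\n Use these opinions carefully as additional advice, "
                               "can you provide an updated answer? Make sure to state "
                               "your answer at the end of the response.".format(summary),
                })
            context.append({"role": "assistant", "content": chat(context)})
    return contexts


def fake_chat(messages: List[Message]) -> str:
    # Stand-in for a real model call so the sketch runs offline.
    return "stub reply to a {}-message context".format(len(messages))


contexts = debate(fake_chat, "What is 2+3*4?", agents=3, rounds=2)
print(len(contexts), [len(c) for c in contexts])
```

To use a real model, wrap whichever client you have in a function with the same shape as `fake_chat` (messages in, reply text out) and pass it as `chat`; the retry logic from `generate_answer` would live inside that wrapper.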
Hello yilundu! I have good news about the implementation of multi-agent debate with open-source LLMs that I mentioned earlier.
The project was inspired by the question "Is multi-agent debate effective on open-source LLMs?" To answer it, I ran 'LLM Agora', a project that applies multi-agent debate to several open-source LLMs and examines whether it is actually effective for them. The project is now complete, which is why I am reaching out.
A brief summary of the results is as follows:
Through the LLM Agora project, I was able to confirm that the multi-agent debate proposed in "Improving Factuality and Reasoning in Language Models through Multiagent Debate" is effective not only with proprietary models but also with open-source models. The experimental results are documented in detail in my GitHub repository, and LLM Agora is also available as a HuggingFace Space, where the effectiveness of multi-agent debate can be checked with various open-source LLMs.
Your paper's multi-agent debate inspired me to start the LLM Agora project, and I am extremely grateful to you for writing such an outstanding paper. It would be a tremendous honor if you and the other authors could take a look at my project!
Great, thanks!! Looks super cool. I added a reference to your repo in this repo's README 🙂
Hello, I read your paper with great interest and have a question, which is why I'm posting it as an issue.
In Section 3.3 (Analysis), the paper mentions that performance improved when the LM agents' responses were summarized into one, because concatenating the responses leads to excessively long contexts that are hard to manage. Could you please provide more details about the setup for this summarization? I have organized my questions as follows:
I want to express my gratitude once again for conducting such outstanding research and for crafting this paper. I'm currently involved in a project to implement multiagent-debate using various open-source LLMs, and your answers to these questions would be very helpful!
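To make the context-length concern concrete, here is a rough back-of-the-envelope sketch. It rests on assumptions that are not from the paper: every reply has the same length, a summary has a fixed length, and size is counted in characters rather than tokens. The function name `context_chars` and all numbers are hypothetical and purely illustrative.

```python
from typing import Optional


def context_chars(agents: int, rounds: int, reply_chars: int,
                  summary_chars: Optional[int] = None) -> int:
    """Rough size of one agent's context after `rounds` rounds.

    If `summary_chars` is None, each later round pastes in the replies of
    all other agents; otherwise it adds one summary of that assumed length.
    """
    total = reply_chars  # round 0: the agent's own first reply
    for _ in range(1, rounds):
        if summary_chars is None:
            incoming = (agents - 1) * reply_chars
        else:
            incoming = summary_chars
        total += incoming + reply_chars  # incoming message plus the new reply
    return total


# Illustrative numbers only: 4 agents, 3 rounds, 600-character replies.
concatenated = context_chars(agents=4, rounds=3, reply_chars=600)
summarized = context_chars(agents=4, rounds=3, reply_chars=600, summary_chars=400)
print(concatenated, summarized)
```

Under these assumptions the concatenated context grows with the number of agents in every round, while the summarized one grows by a fixed amount per round.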