microsoft / FLAML

A fast library for AutoML and tuning.
MIT License
3.76k stars, 495 forks

group chat for visualization #1213

Closed sonichi closed 9 months ago

sonichi commented 10 months ago

Why are these changes needed?

A group chat example for data visualization.

Related issue number


victordibia commented 10 months ago

@sonichi, please try the following prompts and let me know. I found that they result in more concrete examples of the critic requesting changes. Perhaps you can integrate them into the notebook so we can share it?

from flaml import autogen  # this PR targets FLAML, where autogen ships as a subpackage

# config_list_gpt4 is assumed to be defined earlier in the notebook
llm_config = {"config_list": config_list_gpt4, "request_timeout": 1220}
human_proxy = autogen.UserProxyAgent(
    name="Admin",  # placeholder: the name argument is required but missing from the snippet
    system_message="A human admin.",
    code_execution_config={"last_n_messages": 3, "work_dir": "groupchat"},
)
coder = autogen.AssistantAgent(
    name="Coder",  # the default assistant agent is capable of solving problems with code
    llm_config=llm_config,
)
critic = autogen.AssistantAgent(
    name="Critic",  # placeholder name, taken from the first word of the system message
    system_message="""Critic. You are a helpful assistant highly skilled in evaluating the quality of a given visualization code by providing a score from 1 (bad) - 10 (good) while providing clear rationale. YOU MUST CONSIDER VISUALIZATION BEST PRACTICES for each evaluation. Specifically, you can carefully evaluate the code across the following dimensions
- bugs (bugs): are there bugs, logic errors, syntax errors, or typos? Are there any reasons why the code may fail to compile? How should it be fixed? If ANY bug exists, the bug score MUST be less than 5.
- Data transformation (transformation): Is the data transformed appropriately for the visualization type? E.g., is the dataset appropriately filtered, aggregated, or grouped if needed? If a date field is used, is the date field first converted to a date object, etc.?
- Goal compliance (compliance): how well does the code meet the specified visualization goals?
- Visualization type (type): CONSIDERING BEST PRACTICES, is the visualization type appropriate for the data and intent? Is there a visualization type that would be more effective in conveying insights? If a different visualization type is more appropriate, the score MUST BE LESS THAN 5.
- Data encoding (encoding): Is the data encoded appropriately for the visualization type?
- aesthetics (aesthetics): Are the aesthetics of the visualization appropriate for the visualization type and the data?

YOU MUST PROVIDE A SCORE for each of the above dimensions.
{bugs: 0, transformation: 0, compliance: 0, type: 0, encoding: 0, aesthetics: 0}
Do not suggest code.
Finally, based on the critique above, suggest a concrete list of actions that the coder should take to improve the code.
""",
    llm_config=llm_config,
)

groupchat = autogen.GroupChat(agents=[human_proxy, coder, critic], messages=[], max_round=20)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)

human_proxy.initiate_chat(manager, message="download data from and show me a plot that tells me about the amount of each weather. Save the plot to a file. Print the fields in a dataset before visualizing it.")
# type exit to terminate the chat

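If it helps when checking whether the critic really follows the score format, here is a rough sketch for pulling the scores out of a critic reply. It is not part of the notebook or of autogen: the helper names parse_scores and dimensions_below are hypothetical, and treating 5 as a revision cutoff is an assumption borrowed from the "MUST be less than 5" wording in the prompt.

```python
import re

# Dimension keys as listed in the critic prompt.
DIMENSIONS = ("bugs", "transformation", "compliance", "type", "encoding", "aesthetics")


def parse_scores(reply: str) -> dict:
    """Return {dimension: score} for every 'name: number' pair found in the reply."""
    scores = {}
    for name in DIMENSIONS:
        # Matches e.g. bugs: 3 or "bugs": 3; prose without a number after the colon is ignored.
        match = re.search(rf"\b{name}\b['\"]?\s*:\s*(\d+(?:\.\d+)?)", reply)
        if match:
            scores[name] = float(match.group(1))
    return scores


def dimensions_below(scores: dict, threshold: float = 5) -> list:
    """List the dimensions scored under the threshold, in prompt order."""
    return [name for name in DIMENSIONS if name in scores and scores[name] < threshold]


if __name__ == "__main__":
    # Hypothetical critic reply, written only to exercise the helpers.
    reply = (
        "The chart is readable but a bar chart would suit the data better.\n"
        "{bugs: 3, transformation: 7, compliance: 8, type: 4, encoding: 6, aesthetics: 5}\n"
        "Actions: fix the date parsing and switch to a bar chart."
    )
    print(dimensions_below(parse_scores(reply)))
```

A reply with no score line simply yields an empty dict, so a missing score shows up as a missing key rather than as a zero.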
victordibia commented 10 months ago

@pcdeadeasy for visibility.

sonichi commented 10 months ago

Thanks! I updated the example using your suggestion. Please check "Example 2" in the notebook. @victordibia @ahmed-awadallah @pcdeadeasy