nyanp / chat2plot

chat to visualization with LLM
MIT License

📈 Chat2Plot - interactive & safe text-to-visualization with LLM

This library uses LLMs to generate the following from natural language requests about given data:

  1. Visualizations
  2. High-level chart specifications in json (you can choose between a simple format and vega-lite)
  3. Explanations

Chat2Plot never asks the LLM to produce executable code or SQL, so visualizations can be generated safely.

demo: https://chat2plot-sample.streamlit.app/


Quick Start

pip install chat2plot

import os
import pandas as pd
from chat2plot import chat2plot

# 1. Set your API key
os.environ["OPENAI_API_KEY"] = "..."

df = pd.read_csv(...)

# 2. Pass a dataframe to draw
c2p = chat2plot(df)

# 3. Ask a question about the data
result = c2p("average target over countries")
result.figure.show()  # draw a plot
print(result.config)  # get a config (json / dataclass)
print(result.explanation)  # see the explanation generated by LLM

# You can make a follow-up request to refine the chart
result = c2p("change to horizontal-bar chart")
result.figure.show()

Why Chat2Plot?

Inside Chat2Plot, the LLM does not generate Python code; it generates plot specifications in json.

Chat2Plot transforms the declarative json specification into actual charts using plotly or altair, but users can also consume the json directly in their own applications.

This design limits visualization expressiveness compared to Python code generation (such as ChatGPT's Code Interpreter plugin), but it has practical advantages: no generated code is ever executed, and the json specification is portable, inspectable, and easy to reuse from other applications.

By default, chat2plot uses the OpenAI function calling API.
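The safety property can be illustrated with a minimal sketch (hypothetical names, not chat2plot's internals): because the LLM returns a declarative spec rather than code, the application only ever runs its own whitelisted rendering logic, and a bad or malicious reply can at worst fail validation.

```python
import json

# Hypothetical whitelist of chart types the renderer knows how to draw.
ALLOWED_CHART_TYPES = {"bar", "line", "scatter"}

def validate_spec(raw: str) -> dict:
    """Parse an LLM reply as json and reject anything outside the whitelist.

    Unlike exec()-ing generated Python, the reply itself is never executed.
    """
    spec = json.loads(raw)
    if spec.get("chart_type") not in ALLOWED_CHART_TYPES:
        raise ValueError(f"unsupported chart_type: {spec.get('chart_type')!r}")
    for key in ("x", "y"):
        if not isinstance(spec.get(key), str):
            raise ValueError(f"missing or invalid field: {key}")
    return spec

# A well-formed reply passes...
spec = validate_spec('{"chart_type": "bar", "x": "country", "y": "target"}')
print(spec["chart_type"])  # bar

# ...while a code-like payload is rejected rather than run.
try:
    validate_spec('{"chart_type": "__import__(\'os\')", "x": "a", "y": "b"}')
except ValueError:
    print("rejected")
```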

Examples

Custom language models

gpt-3.5-turbo-0613 is used by default, but you can use other language models.

import pandas as pd
from langchain.chat_models import AzureChatOpenAI
from chat2plot import chat2plot

plot = chat2plot(pd.DataFrame(), chat=AzureChatOpenAI())
ret = plot.query("<your query>")

Vega-lite format

import pandas as pd
from chat2plot import chat2plot

plot = chat2plot(pd.DataFrame(), schema_definition="vega")
ret = plot.query("<your query>")

assert isinstance(ret.config, dict)  # vega-lite format
print(ret.config)
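Because the "vega" option returns a plain vega-lite dict, it can be serialized, stored, or handed to any vega-lite renderer. For reference, a minimal vega-lite bar-chart spec (hand-written here for illustration, not actual chat2plot output) has this shape:

```python
import json

# A hand-written minimal vega-lite spec: a mark plus channel encodings.
spec = {
    "mark": "bar",
    "encoding": {
        "x": {"field": "country", "type": "nominal"},
        "y": {"field": "target", "type": "quantitative", "aggregate": "mean"},
    },
}

# Being plain json, it round-trips losslessly and can be shipped to a
# front-end renderer (e.g. vega-embed) without any Python-side drawing.
assert json.loads(json.dumps(spec)) == spec
print(spec["mark"])  # bar
```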

Custom chart definition

import pydantic
import pandas as pd
from chat2plot import chat2plot

class CustomChartConfig(pydantic.BaseModel):
    chart_type: str
    x_axis_name: str
    y_axis_name: str
    y_axis_aggregate: str

plot = chat2plot(pd.DataFrame(), schema_definition=CustomChartConfig)
ret = plot.query("<your query>")

# chat2plot uses the type you pass as the schema for chart settings
assert isinstance(ret.config, CustomChartConfig)
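The idea behind a custom schema can be sketched with stdlib dataclasses standing in for pydantic (hypothetical code, not chat2plot's internals): the LLM's json reply is parsed into the user-supplied type, so a reply that does not fit the schema fails fast instead of silently producing an arbitrary chart.

```python
import json
from dataclasses import dataclass, fields

@dataclass
class CustomChartConfig:
    chart_type: str
    x_axis_name: str
    y_axis_name: str
    y_axis_aggregate: str

def deserialize(raw: str) -> CustomChartConfig:
    """Parse an LLM json reply into the schema, rejecting mismatched keys."""
    data = json.loads(raw)
    expected = {f.name for f in fields(CustomChartConfig)}
    if set(data) != expected:
        raise ValueError(f"keys {set(data) ^ expected} do not match the schema")
    return CustomChartConfig(**data)

cfg = deserialize(
    '{"chart_type": "bar", "x_axis_name": "country",'
    ' "y_axis_name": "target", "y_axis_aggregate": "mean"}'
)
print(cfg.chart_type)  # bar
```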

Specifying output language

You can specify the language in which chart explanations are output. If not specified, Chat2Plot tries to answer in the same language as the user's question, but this option is useful when you always want output in a specific language.

import pandas as pd
from chat2plot import chat2plot

plot = chat2plot(pd.DataFrame(), language="Chinese")
ret = plot.query("<your query>")

print(ret.explanation)  # explanation in Chinese

Privacy preserving

When description_strategy="dtypes" is specified, chat2plot sends only the column names, not the data content, to the LLM.

import pandas as pd
from chat2plot import chat2plot

plot = chat2plot(pd.DataFrame(), description_strategy="dtypes")
ret = plot.query("<your query>")
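The difference between the strategies can be pictured with a simplified sketch (the exact prompt chat2plot builds may differ): with "dtypes" only the schema reaches the LLM, while the default "head" strategy also includes sample values.

```python
# Toy stand-in for a dataframe: column name -> (dtype, sample values).
columns = {
    "country": ("object", ["US", "JP", "DE"]),
    "target": ("float64", [1.2, 3.4, 5.6]),
}

def describe(strategy: str) -> str:
    """Build the data description sent to the LLM (simplified sketch)."""
    if strategy == "dtypes":
        # Only column names and dtypes -- no cell values leave the machine.
        return "\n".join(f"{name}: {dtype}" for name, (dtype, _) in columns.items())
    # Default "head"-style strategy: dtypes plus the first few values.
    return "\n".join(
        f"{name}: {dtype}, sample={values}"
        for name, (dtype, values) in columns.items()
    )

print(describe("dtypes"))
```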

API

A Chat2Plot instance can be created using the chat2plot function.

def chat2plot(
    df: pd.DataFrame,
    schema_definition: Literal["simple", "vega"] | Type[pydantic.BaseModel] = "simple",
    chat: BaseChatModel | None = None,
    function_call: bool | Literal["auto"] = "auto",
    language: str | None = None,
    description_strategy: str = "head",
    custom_deserializer: ModelDeserializer | None = None,
    verbose: bool = False,
) -> Chat2PlotBase:

Once an instance is created, a chart generation request can be made by calling the query method, or by simply calling the instance itself (__call__) with the same arguments.

def query(self, q: str, config_only: bool = False, show_plot: bool = False) -> Plot:

The default behavior is config_only=False and show_plot=False, i.e. Chat2Plot generates the figure as well as the configuration, but does not display it.

The query method returns a Plot object, which has the following properties:

@dataclass(frozen=True)
class Plot:
    figure: alt.Chart | Figure | None
    config: PlotConfig | dict[str, Any] | None
    response_type: ResponseType
    explanation: str
    conversation_history: list[langchain.schema.BaseMessage] | None