Chainlit / chainlit

Build Conversational AI in minutes ⚡️
https://docs.chainlit.io
Apache License 2.0

Add support for displaying pandas DataFrame as an interactive table #1373

Open desertproject opened 4 days ago

desertproject commented 4 days ago

This is an attempt to address issue #1350, and it may serve as a basis for future improvements and better implementations.

To use this functionality, follow these steps:

  1. Convert the DataFrame to JSON: Use the to_json() method with the "split" orientation to convert the DataFrame into a JSON string, formatting the data for display.
  2. Create a DataFrame Element: Pass the JSON string as the content parameter when creating the DataFrame element.
  3. Append and Send: Append the DataFrame element to a message and send it for display.

import chainlit as cl
import pandas as pd

@cl.on_chat_start
async def start():
    # Load the iris sample dataset into a DataFrame.
    iris = pd.read_csv(
        "https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv"
    )

    # Serialize with the "split" orientation expected by the Dataframe element.
    json_dataframe = iris.to_json(orient="split")

    # Wrap the JSON string in a Dataframe element rendered inline in the message.
    elements = [
        cl.Dataframe(content=json_dataframe, display="inline", name="Dataframe")
    ]

    await cl.Message(content="This message has a Dataframe", elements=elements).send()
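For reference, the "split" orientation from step 1 yields a JSON object with three keys: columns, index, and data. The sketch below uses a hand-written payload of that shape to show how a consumer could rebuild one record per row. The values are illustrative (not taken from the iris dataset), and split_json_to_rows is a hypothetical helper, not part of Chainlit or this PR.

```python
import json
from typing import Any, Dict, List

# Hand-written payload in the shape pandas emits for to_json(orient="split").
# The values are illustrative only.
payload = (
    '{"columns":["species","petal_length"],'
    '"index":[0,1],'
    '"data":[["setosa",1.4],["virginica",5.1]]}'
)


def split_json_to_rows(raw: str) -> List[Dict[str, Any]]:
    """Hypothetical helper: turn split-oriented JSON into one dict per row."""
    parsed = json.loads(raw)
    columns = parsed["columns"]
    # Each entry of "data" is one row, ordered like "columns".
    return [dict(zip(columns, row)) for row in parsed["data"]]


rows = split_json_to_rows(payload)
print(rows[0])  # {'species': 'setosa', 'petal_length': 1.4}
```

Column names are stored once rather than repeated in every row, which keeps the payload compact for wide tables.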
hadarsharon commented 3 days ago

@desertproject Nice one :)

One question though: Instead of expecting (forcing) the user to serialize the dataframe to JSON in a specific orientation, why don't we just accept the DataFrame element and call to_json with orient="split" ourselves behind the scenes, for example in the __post_init__ method?

This way people wouldn't have to serialize/deserialize their dataframes throughout and could just pass a DataFrame at any point without having to worry about the underlying implementation.
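A minimal sketch of that idea, assuming the element is a dataclass: DataframeElement below is a hypothetical stand-in, not Chainlit's actual class, and its field names are assumptions. It accepts either an already-serialized JSON string or any object exposing to_json(orient=...), which a pandas DataFrame does, and normalizes it in __post_init__.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class DataframeElement:
    """Hypothetical stand-in for the element; not Chainlit's actual class."""

    name: str
    content: Any  # a JSON string, or an object exposing to_json(orient=...)
    display: str = "inline"

    def __post_init__(self) -> None:
        # Strings are assumed to be JSON the caller serialized already.
        if isinstance(self.content, str):
            return
        # Duck typing: pandas.DataFrame provides to_json(orient="split"),
        # so this module does not need to import pandas itself.
        to_json = getattr(self.content, "to_json", None)
        if not callable(to_json):
            raise TypeError(
                "content must be a JSON string or an object with to_json()"
            )
        self.content = to_json(orient="split")
```

With this in place, a caller could write DataframeElement(name="Dataframe", content=iris) and never touch the orientation. Keeping the string path preserves the usage shown in the PR description.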

dokterbob commented 3 days ago

@desertproject Cool feature! Could you please create an E2E test for it, so we can support it towards the future? I'd also love some screenshots/screengrab of how it is supposed to work and/or a cookbook entry. 🙏🏼 🥺

(Forget the screenshot, I just saw it in the issue.)

desertproject commented 3 days ago

> Instead of expecting (forcing) the user to serialize the dataframe to JSON in a specific orientation, why don't we just accept the DataFrame element and call to_json with orient="split" ourselves behind the scenes, for example in the __post_init__ method?

@hadarsharon That's a great idea! I had thought of something similar but wasn’t sure exactly where it would fit. Using the __post_init__ method for this makes perfect sense. Thank you for the suggestion!

desertproject commented 3 days ago

> Could you please create an E2E test for it, so we can support it towards the future? I'd also love a cookbook entry. 🙏🏼 🥺

@dokterbob I don’t have any experience with E2E testing or Cypress, but I’d be happy to give it a try! Could you provide some guidance on the specific tests you’d like to see? I can also take care of the cookbook entry without any problem.

dokterbob commented 2 days ago

> > Could you please create an E2E test for it, so we can support it towards the future? I'd also love a cookbook entry. 🙏🏼 🥺
>
> @dokterbob I don’t have any experience with E2E testing or Cypress, but I’d be happy to give it a try! Could you provide some guidance on the specific tests you’d like to see? I can also take care of the cookbook entry without any problem.

Basically, you'd create a test suite similar to the ones already there: https://github.com/Chainlit/chainlit/tree/main/cypress/e2e

Create a test project set up to use your feature, then test everything you'd normally test for it. E.g. return a dataframe and ensure it's properly rendered. If there's interaction possible, simulate it.

I highly recommend using something like Claude if you haven't written tests before (but of course, check the work for common slip-ups!).

Similarly, unit tests for the Python code would be great as well. Note how the unit tests are laid out to match the Python modules.
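As a rough idea of the shape such a unit test could take, here is a self-contained sketch. to_split_json is a hypothetical helper defined inline so the example runs on its own; in the real repository the code under test would be imported from the module the test file mirrors. A Mock stands in for a pandas DataFrame so the test does not depend on pandas.

```python
import unittest
from unittest.mock import Mock


def to_split_json(content):
    """Hypothetical helper under test: return split-oriented JSON."""
    if isinstance(content, str):
        return content  # assumed to be serialized already
    to_json = getattr(content, "to_json", None)
    if not callable(to_json):
        raise TypeError("content must be a JSON string or expose to_json()")
    return to_json(orient="split")


class ToSplitJsonTests(unittest.TestCase):
    def test_string_is_passed_through(self):
        raw = '{"columns": [], "index": [], "data": []}'
        self.assertEqual(to_split_json(raw), raw)

    def test_frame_is_serialized_with_split_orientation(self):
        frame = Mock()  # stands in for a pandas DataFrame
        frame.to_json.return_value = (
            '{"columns": ["a"], "index": [0], "data": [[1]]}'
        )
        self.assertEqual(to_split_json(frame), frame.to_json.return_value)
        # The orientation is the contract the frontend relies on.
        frame.to_json.assert_called_once_with(orient="split")

    def test_unsupported_type_is_rejected(self):
        with self.assertRaises(TypeError):
            to_split_json(42)
```

pytest collects unittest.TestCase classes as well, so the same file can be run with either runner.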