expectedparrot / edsl

Design, conduct and analyze results of AI-powered surveys and experiments. Simulate social science and market research with large numbers of AI agents and LLMs.
https://docs.expectedparrot.com
MIT License

Results `to_pandas()` method is turning a list into a string #468

Open rbyh opened 1 month ago

rbyh commented 1 month ago

In this example I have responses to a QuestionCheckBox question, each of which is a list of strings. When I convert the results to a dataframe, the lists of selected options are converted into strings that look like lists.

johnjosephhorton commented 1 month ago

@rbyh Can you investigate best practices with pandas here? Pandas is meant to be a 'flat' format, so I don't know what we should be doing.

rbyh commented 1 month ago

Pandas should preserve the format. E.g., here a column holding lists of strings keeps them as lists:

import pandas as pd

# Example DataFrame with a list in a column
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Interests': [['reading', 'cycling'], ['painting'], ['writing', 'cooking']]}
df = pd.DataFrame(data)

type(df['Interests'][0])

Will return:

<class 'list'>

I think the issue is the intermediate CSV conversion steps in to_pandas(). We should be able to skip them with this fix:

import pandas as pd

    # Method on the Results class
    def to_pandas(self, remove_prefix: bool = False) -> pd.DataFrame:
        """Convert the results to a pandas DataFrame, ensuring that lists remain as lists.

        :param remove_prefix: Whether to remove the prefix from the column names.
        """
        # Build the DataFrame directly from the data, skipping the CSV round trip
        df = pd.DataFrame(self.data)

        if remove_prefix:
            # Optionally remove prefixes from column names
            df.columns = [col.split('.')[-1] for col in df.columns]

        # Sort columns alphabetically
        return df.sort_index(axis=1)
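
To illustrate why I suspect the CSV step, here is a minimal standalone sketch (not the actual EDSL code path). The helper `roundtrip_via_csv` and the column name `answer.interests` are made up for illustration; the point is only that writing a list cell to CSV and reading it back yields a string.

```python
import io

import pandas as pd


def roundtrip_via_csv(df: pd.DataFrame) -> pd.DataFrame:
    """Write a DataFrame to CSV text and read it back (hypothetical helper)."""
    buffer = io.StringIO()
    df.to_csv(buffer, index=False)
    buffer.seek(0)
    return pd.read_csv(buffer)


# Hypothetical column name; each cell is a list of selected options
direct = pd.DataFrame({"answer.interests": [["reading", "cycling"], ["painting"]]})
via_csv = roundtrip_via_csv(direct)

print(type(direct["answer.interests"][0]))   # <class 'list'>
print(type(via_csv["answer.interests"][0]))  # <class 'str'>
print(via_csv["answer.interests"][0])        # ['reading', 'cycling']
```

CSV has no list type, so the list is written out as its string representation and comes back as plain text. Building the DataFrame directly avoids that.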

johnjosephhorton commented 1 month ago

It's a good fix but it broke some other tests in a complicated way, so I'm not quite ready to implement.

rbyh commented 1 month ago

Bumping this. While working on examples for extracting themes and turning them into checkbox question options, I frequently need to add a step that transforms the list-as-string back into a true list.
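
For reference, the extra step looks roughly like this. It is a sketch of the workaround, not something in EDSL: `parse_list_strings` and the column name `answer.themes` are hypothetical, and it assumes the strings are valid Python list literals.

```python
import ast

import pandas as pd


def parse_list_strings(series: pd.Series) -> pd.Series:
    """Turn strings like "['a', 'b']" back into lists; leave other values as-is."""

    def parse(value):
        # Only attempt to parse strings that look like list literals
        if isinstance(value, str) and value.strip().startswith("["):
            try:
                parsed = ast.literal_eval(value)
            except (ValueError, SyntaxError):
                return value
            return parsed if isinstance(parsed, list) else value
        return value

    return series.apply(parse)


# Hypothetical column as it comes back from to_pandas(): lists stored as strings
df = pd.DataFrame({"answer.themes": ["['pricing', 'support']", "['onboarding']"]})
df["answer.themes"] = parse_list_strings(df["answer.themes"])

print(type(df["answer.themes"][0]))  # <class 'list'>
```

`ast.literal_eval` only evaluates literals, so it is safer than `eval` here, but it is still a patch on the caller's side rather than a fix for the conversion itself.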
