Sinaptik-AI / pandas-ai

Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
https://pandas-ai.com

Examples of logger supported by SmartDataLake #542

Closed krishnashed closed 3 months ago

krishnashed commented 1 year ago

🐛 Describe the bug

Could you provide some examples of the logger and memory context supported by SmartDataLake?

nautics889 commented 1 year ago

Logs can be accessed through the .logs property of a SmartDatalake instance.

Based on the example of using SmartDatalake, you can add something like the following:

import pprint  # needed for the pformat call at the end

...

llm = OpenAI()
dl = SmartDatalake(
    [employees_df, salaries_df],
    config={
        "llm": llm,
        "verbose": True,
        "enable_cache": False
    },
)
response = dl.chat("Who gets paid the most?")
print(response)

print(f"Logs: {pprint.pformat(dl.logs)}")  # display logs

The output of the code above is:

...

The employee who gets paid the most is Olivia.
Logs: [{'msg': 'Question: Who gets paid the most?', 'level': 20}, {'msg': 'Running PandasAI with openai LLM...', 'level': 20}, {'msg': 'Prompt ID: ***', 'level': 20}, {'msg': '\n 

The .logs property is a list of all the messages that have been added to the logger at runtime. The list has the following schema:

[
    {"msg": "<content_of_log_message>", "level": <integer_of_log_level>}  # each log is stored as a separate dict in this list
]
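
Since each entry is a plain dict with an integer level, the list can be filtered with the standard logging constants. Here is a minimal sketch, assuming the schema above; filter_logs and format_entry are hypothetical helper names, not part of PandasAI, and the last sample entry is made up for illustration:

```python
import logging


def filter_logs(logs, min_level=logging.WARNING):
    """Keep only the entries whose level is at least min_level."""
    return [entry for entry in logs if entry["level"] >= min_level]


def format_entry(entry):
    """Render one entry as '<LEVEL NAME>: <message>'."""
    return f"{logging.getLevelName(entry['level'])}: {entry['msg']}"


# Stand-in for dl.logs: the first two entries come from the output above,
# the third is a hypothetical error entry added for the demo.
sample_logs = [
    {"msg": "Question: Who gets paid the most?", "level": 20},
    {"msg": "Running PandasAI with openai LLM...", "level": 20},
    {"msg": "Something went wrong", "level": 40},
]

for entry in filter_logs(sample_logs, min_level=logging.INFO):
    print(format_entry(entry))
```

With a real instance you would pass dl.logs instead of sample_logs. Level 20 corresponds to logging.INFO and 40 to logging.ERROR.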

As for "memory context", I didn't quite get what you mean. If you want the latest code generated for a question, you can access it with:

>>> dl.last_code_generated
'# TODO import all the dependencies required\nimport pandas as pd\n\n# Analyze the data\n# 1. Prepare: Preprocessing and cleaning data if necessary\n# 2. Process: Manipulating data for analysis (grouping, filtering ...'

Same for errors:

>>> dl.last_error
None

And here are all of these attributes:

>>> pprint.pprint([attr for attr in dir(dl) if attr.startswith('last')])
['last_code_executed',
 'last_code_generated',
 'last_error',
 'last_prompt',
 'last_prompt_id',
 'last_result']

The SmartDatalake object exposes the attributes above, and together they give you the complete context of the previous code execution.
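
If you want that context in one place, the same dir() filter can build a dict. This is only a sketch: collect_last_state is a hypothetical helper, and fake_dl is a stand-in object so the snippet runs without an LLM; with a real instance you would pass dl instead.

```python
from types import SimpleNamespace


def collect_last_state(obj):
    """Gather every attribute whose name starts with 'last' into a dict."""
    return {
        name: getattr(obj, name)
        for name in dir(obj)
        if name.startswith("last")
    }


# Stand-in for a SmartDatalake instance (assumption for the demo only).
fake_dl = SimpleNamespace(
    last_code_generated="import pandas as pd",
    last_error=None,
    last_prompt_id="***",
    verbose=True,  # ignored: the name does not start with 'last'
)

state = collect_last_state(fake_dl)
for name, value in state.items():
    print(f"{name}: {value!r}")
```

Note that getattr evaluates properties, so on a real object an attribute that raises would need its own try/except.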