Canner / WrenAI

🚀 Open-source SQL AI Agent for Text-to-SQL. Make Text2SQL Easy! 🙌
https://getwren.ai/oss

feature(wren-ai-service): integrate Langfuse SDK to represent the evaluation result #395

Closed · paopa closed this 3 months ago

paopa commented 3 months ago

This PR integrates the decorator-based Langfuse SDK to collect data from the evaluation process. Langfuse is disabled by default in the environment setup. To enable it, follow these steps:

# .env.dev file
# Langfuse configuration
LANGFUSE_ENABLE=True
LANGFUSE_SECRET_KEY=
LANGFUSE_PUBLIC_KEY=
LANGFUSE_HOST=

Fill in the environment variables. You can obtain the secret key, public key, and host by following the instructions in the Langfuse documentation.
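
For reference, here is a minimal sketch of how the flag could be consumed on the service side. The helper name and exact wiring are illustrative assumptions, not the actual wren-ai-service code; the SDK itself reads LANGFUSE_SECRET_KEY, LANGFUSE_PUBLIC_KEY, and LANGFUSE_HOST from the environment.

import os

from langfuse.decorators import langfuse_context

def init_langfuse():
    # Hypothetical helper: turn tracing on only when LANGFUSE_ENABLE is truthy.
    # Keys and host are picked up by the SDK from the environment variables above.
    enabled = os.getenv("LANGFUSE_ENABLE", "false").lower() == "true"
    langfuse_context.configure(enabled=enabled)

init_langfuse()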

Currently, we don't set a session ID. When implementing the main evaluation process, refer to the following code to set one up.

import asyncio
import uuid

from langfuse.decorators import langfuse_context, observe

@observe()
async def story():
    # Placeholder for the actual evaluation work, traced as a nested span.
    return "evaluation result"

@observe()
async def main(user_id: str):
    # Group this run's spans under a per-user, per-run session.
    langfuse_context.update_current_trace(
        user_id=user_id,
        session_id=f"{user_id}_{uuid.uuid4()}",
    )
    return await story()

async def run():
    await asyncio.gather(main("foo"), main("bar"))
    # Flush buffered events before the process exits.
    langfuse_context.flush()

asyncio.run(run())

Screenshots

cyyeh commented 3 months ago

Overall LGTM. I'd like to discuss the capture_input part with you.

paopa commented 3 months ago

After discussing with @cyyeh, we decided not to capture the span input for all steps. The other suggestions, such as Langfuse debug mode and an authentication check, are interesting features, but we won't include them in this PR since it is focused on the evaluation framework.
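
For context, skipping input capture on a given step with the decorator-based SDK looks roughly like this (a minimal sketch; generate_sql is a hypothetical step name, not one of the actual pipeline functions):

from langfuse.decorators import observe

# capture_input=False keeps the span (name, timing, output) but does not store
# the potentially large input payload in Langfuse.
@observe(capture_input=False)
async def generate_sql(question: str) -> str:
    # Hypothetical pipeline step standing in for the real evaluation steps.
    return f"SELECT 1 -- {question}"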