Azure-Samples / azure-search-openai-demo

A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
https://azure.microsoft.com/products/search
MIT License

LLM Observability: Add optional Langfuse support #1300

Open aiakubovich opened 9 months ago

aiakubovich commented 9 months ago

Hi! We use the app extensively, but it lacks LLM observability functionality. There should be an easy way to gather user interactions within the AOAI app. Langfuse is an open-source LLM observability platform that supports prompt logging. It offers native integration with the OpenAI library, which can be enabled by adding just a few lines of code to the underlying Python script. I am wondering whether it would be feasible to add optional Langfuse support to the app, since only minimal changes seem to be required (a few variables for the Azure Web App and a few lines in the AOAI app code). If this is OK, I can attempt to create a pull request for it.

pamelafox commented 9 months ago

Very interesting, I hadn't seen that before, and there's a lot of interest in observability. I think we'd want to get Langfuse deployed on Azure to make sure it was viable for devs to host it on our infrastructure. It looks like that'd be fairly straightforward using Azure Container Apps and the Postgres ACA Add-on, for example. Are you hosting Langfuse on Azure or using their cloud-based version?

aiakubovich commented 9 months ago

@pamelafox, we are hosting Langfuse on Azure as an Azure Web App and using Azure PostgreSQL - Flexible Server. Azure Container Apps would be another option. It was very simple to deploy; here is the guide.

tonybaloney commented 9 months ago

Have a look at https://github.com/Azure-Samples/azure-search-openai-demo/issues/1318 as well, we're considering adding this to improve the Application Insights traces for OpenAI

marcklingen commented 9 months ago

Langfuse maintainer here. There are many teams running Langfuse on Azure in prod, mostly on (1) Azure Container Instances + Azure PostgreSQL or (2) everything in Kubernetes. We also support Azure AD for SSO.

Self-hosting guide here: https://langfuse.com/docs/deployment/self-host

Let me know if I can help with anything here @aiakubovich @pamelafox

pamelafox commented 9 months ago

Thanks @marcklingen , I started experimenting with langfuse yesterday, it's a very helpful UI. From my experiment, the changes to this repo are pretty minimal - just setting the env variables and changing the import in app.py - and we can just document those in docs/.

What I'm more interested in is making it super easy to deploy langfuse to Azure using the Azure Developer CLI (azd up). I've started that at https://github.com/pamelafox/langfuse-azure but have some TODOs. One thing that would help is if we could specify the database password separately from the URL, so that we can store those with secret-specific mechanisms. I could also store the entire database URI as a secret, but it's nice when you can view the rest in plaintext.

marcklingen commented 9 months ago

> From my experiment, the changes to this repo are pretty minimal - just setting the env variables and changing the import in app.py - and we can just document those in docs/.

Let me know if there's something I can help with.

> What I'm more interested in is making it super easy to deploy langfuse to Azure using the Azure Developer CLI (azd up). I've started that at https://github.com/pamelafox/langfuse-azure but have some TODOs. One thing that would help is if we could specify the database password separately from the URL, so that we can store those with secret-specific mechanisms. I could also store the entire database URI as a secret, but it's nice when you can view the rest in plaintext.

That's awesome, I'd love to reference https://github.com/pamelafox/langfuse-azure in our self-hosting docs to make it easier for others to get started on Azure (once you are done with it) – if that's ok for you.

Instead of DATABASE_URL, you can also use DATABASE_HOST, DATABASE_USERNAME, DATABASE_PASSWORD, and DATABASE_NAME (see the self-hosting docs).

This was added to Langfuse specifically to support external secret management, e.g. to rotate secrets easily on Kubernetes. Does this cover your Azure use case of specifying DATABASE_PASSWORD independently?
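As a hedged sketch (all values below are placeholders, not from this thread), the split-variable configuration could look like this, with only the password injected from a secret mechanism such as a Key Vault reference:

```shell
# Hypothetical env configuration for a self-hosted Langfuse instance.
# Host, user, and database names are placeholders; only DATABASE_PASSWORD
# needs to come from a secret store, the rest can stay in plaintext config.
export DATABASE_HOST="my-langfuse-pg.postgres.database.azure.com"
export DATABASE_USERNAME="langfuse_admin"
export DATABASE_PASSWORD="${LANGFUSE_DB_SECRET}"  # injected secret value
export DATABASE_NAME="langfuse"
```

This keeps everything except the password viewable in plaintext, which is the preference described above.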

pamelafox commented 9 months ago

Thank you! I've got the basic deployment working now. Next step is to add in authentication via AAD, as Azure customers will prefer that, I think.

@aiakubovich Did you all go with AAD for your deployment? Just curious.

aiakubovich commented 9 months ago

@pamelafox, yes, we added a few env variables from Azure Active Directory to the Azure Web App config page based on the Langfuse self-host guide, and now we are able to see the Azure button in the Langfuse UI. It works well for authentication via AD.

Also, a question regarding this repo - any plans to change the import in app.py to support Langfuse by default?

pamelafox commented 9 months ago

Here are some screenshots from the langfuse on ACA deployment:

[Three screenshots of the Langfuse UI from the ACA deployment, captured 2024-02-29]

That's from this repo: https://github.com/pamelafox/langfuse-azure Still WIP as you can see from the TODOs, but basic deployment works.

Question for @marcklingen: Is it expected that the traces show null for input/output/metadata?

@aiakubovich We wouldn't make that change across the board, as we want all add-ons like this to be optional (even Azure Monitor is optional, though we now default it to true since its costs are negligible). We could potentially add it as an option based on an env variable, which would look something like this code inside setup_clients:

    if os.getenv("LANGFUSE_HOST"):
        from langfuse.openai import AsyncOpenAI, AsyncAzureOpenAI
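One hedged way to keep that env check out of module scope is to put the decision in a tiny helper (the function name here is made up for illustration):

```python
import os


def openai_module_name() -> str:
    """Hypothetical helper: decide which module the OpenAI client
    classes should be imported from, based on the opt-in env variable."""
    return "langfuse.openai" if os.getenv("LANGFUSE_HOST") else "openai"
```

Because the lookup happens at call time rather than import time, tests can toggle LANGFUSE_HOST without reloading the module.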

For now I'd like to get langfuse-azure template done, document that, and see what feedback we get.

pamelafox commented 9 months ago

@marcklingen Also wondering why "Time to first token" would be empty?

pamelafox commented 9 months ago

FYI, I showed Langfuse on Azure in a livestream today: https://youtube.com/live/Q169GFD_9PI?feature=share The repo is here: https://github.com/Azure-Samples/langfuse-on-azure

I'll add documentation here about that repo shortly.

marcklingen commented 9 months ago

@pamelafox Just checked out the livestream, thanks for featuring and integrating https://github.com/langfuse!

> Question for @marcklingen: Is it expected that the traces show null for input/output/metadata?

Yes, the OpenAI monkeypatch integration is the easiest way to get started; it logs single generations (LLM calls, embeddings) to Langfuse. Most users then expand to complete traces with more spans (e.g. for retrieval) or additional events (e.g. other non-LLM API calls). See the full guide here: https://langfuse.com/docs/tracing

> Also wondering why "Time to first token" would be empty?

Good one, I just created an issue to add this to our OpenAI integration when streaming: https://github.com/langfuse/langfuse/issues/1330

marcklingen commented 9 months ago

By the way, I just added a link to Azure-Samples/langfuse-on-azure to our self-hosting guide to make deploying to Azure easier for others as well. Thanks again!

https://langfuse.com/docs/deployment/self-host#azure Change: https://github.com/langfuse/langfuse-docs/commit/d81a69e8d647f4fef0f607eba0c53b0756f783a9

pamelafox commented 9 months ago

Thanks for adding langfuse-on-azure to the docs! I'll be adding it to various samples galleries on Azure soon to help with discoverability.

Here's my current branch that adds langfuse support to this repo: https://github.com/Azure-Samples/azure-search-openai-demo/compare/main...pamelafox:azure-search-openai-demo:langfuse?expand=1

I don't love using os.getenv() to optionally bring in langfuse based on an env variable, since:

  1. I don't like using os.getenv at module scope; it makes testing harder.
  2. I worry about the impact on IntelliSense for the OpenAI classes, since it's unclear where they're coming from.

I saw your new decorators approach is in the works. I presume we could wrap those around our entire routes, and they would pick up any generations called inside them? That would resolve the IntelliSense issue, but I'd still have to figure out how to make them optional based on getenv. I think I'd need a wrapper decorator.

I do really like the way we bring in the OTel instrumentations; they're just optional middleware:

    if os.getenv("APPLICATIONINSIGHTS_CONNECTION_STRING"):
        configure_azure_monitor()
        # This tracks HTTP requests made by aiohttp:
        AioHttpClientInstrumentor().instrument()
        # This tracks HTTP requests made by httpx:
        HTTPXClientInstrumentor().instrument()
        # This tracks OpenAI SDK requests
        OpenAIInstrumentor().instrument()
        # This middleware tracks app route requests:
        app.asgi_app = OpenTelemetryMiddleware(app.asgi_app)

I'd love it if I could bring in langfuse similarly. I don't need it to actually be an OTel provider; I just like how I can optionally sprinkle those into the app.

If I could bring in langfuse like that, then I think I could potentially make the change in the main branch. Let me know if you have ideas on that.
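One hedged sketch of such an opt-in hook, mirroring the OTel block above: since the thread describes the integration as a monkeypatch, simply importing langfuse.openai inside a guarded function could serve as the "instrument" step. The function name is made up, and whether the patch reaches already-imported client names would need verifying.

```python
import os


def maybe_enable_langfuse() -> bool:
    """Hypothetical opt-in hook in the style of the OTel instrumentors:
    activate Langfuse tracing only when its host env variable is set."""
    if not os.getenv("LANGFUSE_HOST"):
        return False
    # Importing langfuse.openai is described in this thread as a
    # monkeypatch integration for the OpenAI client classes.
    import langfuse.openai  # noqa: F401  (side-effect import, assumed)
    return True
```

Called next to configure_azure_monitor(), this would keep langfuse as one more optional sprinkle rather than a hard dependency.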

In the meantime, I'll work on some documentation/blog post.

bsonnek commented 3 weeks ago

Hi @marcklingen and @pamelafox - this looks like an older thread, but has anyone been able to get the new Langfuse observe decorators working with the LLM 'generation' in the backend app of this repo?

I'm mainly looking at these lines: https://github.com/Azure-Samples/azure-search-openai-demo/blob/009d5e13ba5f9f14bcca684d7a47e1d28fd9023e/app/backend/approaches/chatreadretrieveread.py#L236 Maybe this piece needs to be converted into a method, and then the observe decorator could be applied to it. The new method could then be called from run_until_final_call?

I'm new to decorators and the methods used to call the Azure OpenAI chat and embeddings in this enterprise chat app. I'm struggling to get the observe decorator to show token usage in this backend app. Thanks for all that you both do!

marcklingen commented 3 weeks ago

@bsonnek do you have a WIP PR for this? Happy to have a look.

bsonnek commented 3 weeks ago

Hi @marcklingen I ended up finding a very simple solution after reading through the Langfuse documentation: https://langfuse.com/docs/integrations/openai/python/structured-outputs

All I had to do is change this line: https://github.com/Azure-Samples/azure-search-openai-demo/blob/009d5e13ba5f9f14bcca684d7a47e1d28fd9023e/app/backend/app.py#L31

From:

    from openai import AsyncAzureOpenAI, AsyncOpenAI

To:

    from langfuse.openai import AsyncAzureOpenAI, AsyncOpenAI

Now I am seeing the generations in the langfuse dashboard. Thanks again for your quick response.

marcklingen commented 3 weeks ago

> Hi @marcklingen I ended up finding a very simple solution after reading through the Langfuse documentation: https://langfuse.com/docs/integrations/openai/python/structured-outputs
>
> All I had to do is change this line:
>
> https://github.com/Azure-Samples/azure-search-openai-demo/blob/009d5e13ba5f9f14bcca684d7a47e1d28fd9023e/app/backend/app.py#L31
>
> From: from openai import AsyncAzureOpenAI, AsyncOpenAI To: from langfuse.openai import AsyncAzureOpenAI, AsyncOpenAI
>
> Now I am seeing the generations in the langfuse dashboard. Thanks again for your quick response.

Awesome, thanks for sharing!