Azure / azure-sdk-for-python

This repository is for active development of the Azure SDK for Python. For consumers of the SDK we recommend visiting our public developer docs at https://learn.microsoft.com/python/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-python.

OpenTelemetryExtension for Azure Functions - working POC #29672

Open macieyng opened 1 year ago

macieyng commented 1 year ago

Is your feature request related to a problem? Please describe. We started integrating OpenTelemetry with Azure Functions. It was hard for me to find all the necessary resources and a clear description of how to do that. A lot of the resources I found seem to be outdated or contradict each other, or I'm just bad at googling. The key point: there is no clear description of how to implement and integrate OpenTelemetry with Azure Functions.

I liked the way OpenCensus could be easily set up as an Azure Functions Extension with just:

OpenCensusExtension.configure()

def main(req: func.HttpRequest, context: func.Context):
    with context.tracer.span():
        return function()

Describe the solution you'd like: A simple and elegant way to integrate OpenTelemetry with Azure Functions in just a couple of lines of code.

Describe alternatives you've considered: None.

Additional context: I took the OpenCensusExtension approach and implemented an OpenTelemetryExtension that is currently focused on traces and logging (implementation below). It also allows instrumentation extensions to be initialized correctly. I think the solution is clear and elegant.

OpenTelemetryExtension.configure(instrumentors=[
    HTTPXClientInstrumentor,
])

def main(req: func.HttpRequest, context: func.Context):
    with context.span():
        return _main(req)

It seems that this could be a drop-in solution for those who have already implemented tracing and logging with OpenCensus.
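
The `with context.span():` call works because the extension attaches a deferred span factory to the invocation context before the function runs, and the function body decides when to start the span. Below is a minimal, stdlib-only sketch of that pattern. The `start_as_current_span` function here is a hypothetical stand-in for a real tracer method, and `SimpleNamespace` stands in for `func.Context`; neither is the Azure Functions or OpenTelemetry API.

```python
from contextlib import contextmanager
from functools import partial
from types import SimpleNamespace

events = []  # records the span lifecycle so the ordering is observable


@contextmanager
def start_as_current_span(name, context=None):
    """Hypothetical stand-in for a tracer's span context manager."""
    events.append(f"start {name}")
    try:
        yield name
    finally:
        events.append(f"end {name}")


def pre_invocation(context):
    # Bind the span arguments now, but do not start the span yet.
    context.span = partial(
        start_as_current_span, context.function_name, context=None
    )


def post_invocation(context):
    # Drop the factory so it does not outlive the invocation.
    if getattr(context, "span", None):
        del context.span


def main(context):
    # The span starts only when the function body opts in.
    with context.span():
        events.append("handler body")
    return "ok"


ctx = SimpleNamespace(function_name="HttpTrigger")  # fake invocation context
pre_invocation(ctx)
result = main(ctx)
post_invocation(ctx)
```

Binding the arguments with `partial` keeps the function body down to a single `with` statement while leaving the parent context and span name under the extension's control.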

I don't know how Azure Functions Extensions are currently perceived, so I would like to know whether you think it's still okay to use them or whether the recommended approach has changed.

I would love to hear your feedback on this solution.

import logging
from functools import partial
import os
from typing import List

from azure.functions import AppExtensionBase
from azure.monitor.opentelemetry.exporter import (
    AzureMonitorLogExporter,
    AzureMonitorTraceExporter,
    ApplicationInsightsSampler,
)
from opentelemetry import trace
from opentelemetry._logs import set_logger_provider
from opentelemetry.sdk._logs import LoggerProvider, LoggingHandler
from opentelemetry.sdk._logs.export import BatchLogRecordProcessor
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.trace.propagation.tracecontext import \
    TraceContextTextMapPropagator
from opentelemetry.instrumentation.instrumentor import BaseInstrumentor

"""
SOURCES:

select tracing implementation for azure core
https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/core/azure-core-tracing-opentelemetry#key-concepts
"""

class OpenTelemetryExtension(AppExtensionBase):
    """
    Extension for Azure Functions integration to export traces into Azure
    Monitor. Ensure the following requirements are met:
    1. Azure Functions version is greater than or equal to v3.0.15584
    2. App setting PYTHON_ENABLE_WORKER_EXTENSIONS is set to 1
    """

    @classmethod
    def init(cls):
        cls._trace_exporter = None
        cls._instrumentors: List[BaseInstrumentor] = []
        cls._is_otel_ext_configured = False

    @classmethod
    def configure(cls, **options):
        """
        Configure libraries for integrating into OpenTelemetry extension.
        Create Resource object that allows correlation of telemetry with instance 
        and instance group (previously known as cloud role and cloud role instance).
        Initialize tracing related objects that will write traces to Application Insights.
        Initialize logging related objects that will write logs to Application Insights.

        :param options: Options to configure the extension.
        """

        if not os.getenv("APPLICATIONINSIGHTS_CONNECTION_STRING") and not os.getenv("APPINSIGHTS_INSTRUMENTATIONKEY"):
            logging.warning(
                "Application Insights connection string or instrumentation key not found. "
                "OpenTelemetryExtension cannot be configured and operate. "
                "Skipping OpenTelemetry configuration."
            )
            return 

        # Added April 3rd
        if cls._is_otel_ext_configured is True:
            # To avoid multiple instances of exporters, processors, logger handlers, etc.
            # This could cause classes to be initialized as many times as there are functions in the app.
            logging.debug("OpenTelemetryExtension is already configured. Skipping configuration.")
            return

        cls._is_otel_ext_configured = True

        # Configure resource
        resource = Resource.create(
            {
                "service.name": os.getenv("WEBSITE_SITE_NAME"),
                "service.instance.id": os.getenv("WEBSITE_INSTANCE_ID"), 
            }
        )

        # Configure tracing 
        cls._trace_exporter = AzureMonitorTraceExporter()
        processor = BatchSpanProcessor(cls._trace_exporter)
        sampler = ApplicationInsightsSampler(1.0)
        provider = TracerProvider(resource=resource, sampler=sampler)
        provider.add_span_processor(processor)
        trace.set_tracer_provider(provider)

        # Configure logging
        logger_provider = LoggerProvider(resource=resource)
        set_logger_provider(logger_provider)
        log_exporter = AzureMonitorLogExporter()
        logger_provider.add_log_record_processor(BatchLogRecordProcessor(log_exporter))
        handler = LoggingHandler(level=logging.NOTSET, logger_provider=logger_provider)
        logging.getLogger().addHandler(handler)

        # Configure instrumentors
        cls._instrumentors = options.get("instrumentors", [])

    @classmethod
    def pre_invocation_app_level(cls, logger, context, func_args={}, *args, **kwargs):
        """
        An implementation of pre invocation hooks on Function App's level.
        The Python Worker Extension Interface is defined in
        https://github.com/Azure/azure-functions-python-library/
        blob/dev/azure/functions/extension/app_extension_base.py
        """

        if not cls._is_otel_ext_configured:
            logger.warning(
                "Please call OpenTelemetryExtension.configure() after the import "
                "statement to ensure everything is setup correctly."
            )
            return

        # Extracts the context from the incoming request
        ctx = TraceContextTextMapPropagator().extract(
            carrier={"traceparent": context.trace_context.trace_parent}
        )

        # Set up the span
        tracer = trace.get_tracer(__name__)

        # Partial initialization of the span to be initialized later
        span = partial(
            tracer.start_as_current_span,
            context.function_name,
            context=ctx,
        )

        # Add span to context so it can be initialized on a function level
        setattr(context, "span", span)

        #  Instrument libraries
        for instrumentor in cls._instrumentors:
            instrumentor().instrument()

    @classmethod
    def post_invocation_app_level(
        cls, logger, context, func_args, func_ret, *args, **kwargs
    ):
        """
        An implementation of post invocation hooks on Function App's level.
        The Python Worker Extension Interface is defined in
        https://github.com/Azure/azure-functions-python-library/
        blob/dev/azure/functions/extension/app_extension_base.py
        """
        if not cls._is_otel_ext_configured:
            logging.debug("OpenTelemetryExtension was not configured. Post invocation hooks will not be executed.")
            return 

        # Span needs to be cleaned up after the function is executed to avoid memory leaks
        if getattr(context, "span", None):
            del context.span

        # Uninstrument libraries
        for instrumentor in cls._instrumentors:
            instrumentor().uninstrument()
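
For readers unfamiliar with what `pre_invocation_app_level` hands to the propagator: `context.trace_context.trace_parent` carries a W3C `traceparent` header value. The sketch below is a stdlib-only illustration of the version-00 header layout, not the propagator's actual implementation. `parse_traceparent` and `ParentContext` are hypothetical names introduced only for this example.

```python
import re
from typing import NamedTuple, Optional

# W3C Trace Context, version-00 layout:
# version-traceid-parentid-flags, all lowercase hex.
_TRACEPARENT_RE = re.compile(
    r"^(?P<version>[0-9a-f]{2})-(?P<trace_id>[0-9a-f]{32})-"
    r"(?P<parent_id>[0-9a-f]{16})-(?P<flags>[0-9a-f]{2})$"
)


class ParentContext(NamedTuple):
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(header: Optional[str]) -> Optional[ParentContext]:
    """Return the parent context, or None if the header is absent or invalid."""
    if not header:
        return None
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    # Version "ff" and all-zero IDs are invalid per the specification.
    if (
        match["version"] == "ff"
        or set(match["trace_id"]) == {"0"}
        or set(match["parent_id"]) == {"0"}
    ):
        return None
    # The lowest bit of the flags byte is the "sampled" flag.
    sampled = bool(int(match["flags"], 16) & 0x01)
    return ParentContext(match["trace_id"], match["parent_id"], sampled)


parsed = parse_traceparent(
    "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
)
```

When the header is missing or malformed, this sketch returns `None` instead of raising, so the caller can start a new trace rather than fail the invocation.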

Mentioning @lzchen because our paths have crossed a few times with OpenCensus 😅

lzchen commented 1 year ago

@jeremydvoss

akarray commented 1 year ago

@macieyng It's very interesting as an extension. Did you finally find a solution to this problem?

Thank you

macieyng commented 1 year ago

Well, we've been using it in production, with some minor changes, for almost two months now. It works as expected. Some issues were discovered because we had implemented distributed tracing (we also have internal tracing for finding more specific issues).

We have an internal library, and I've wrapped this and OpenCensusExtension in a configure_tracing() function that uses the Azure SDK environment variable AZURE_SDK_TRACING_IMPLEMENTATION to detect which tracing library should be used. This was done mainly for a smooth transition from OpenCensus to OpenTelemetry.
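
The internal `configure_tracing()` wrapper was not shared in this thread, so the following is only a sketch of how such a dispatch could look. The signature, the `backends` mapping, and the `"opencensus"` default are all assumptions. In a real app the mapping values would be something like `OpenCensusExtension.configure` and `OpenTelemetryExtension.configure`; here they are stubs so the example runs with the standard library alone.

```python
import os
from typing import Callable, List, Mapping, Optional


def configure_tracing(
    backends: Mapping[str, Callable[[], None]],
    default: str = "opencensus",  # assumed default, not confirmed by the thread
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Configure the backend named by AZURE_SDK_TRACING_IMPLEMENTATION."""
    env = os.environ if environ is None else environ
    choice = env.get("AZURE_SDK_TRACING_IMPLEMENTATION", default).strip().lower()
    try:
        configure = backends[choice]
    except KeyError:
        raise ValueError(
            f"Unsupported tracing implementation: {choice!r}"
        ) from None
    configure()
    return choice


# Stub backends that only record which one was selected.
calls: List[str] = []
backends = {
    "opencensus": lambda: calls.append("oc"),
    "opentelemetry": lambda: calls.append("otel"),
}

selected = configure_tracing(
    backends, environ={"AZURE_SDK_TRACING_IMPLEMENTATION": "opentelemetry"}
)
```

Passing `environ` explicitly keeps the function easy to test. Leaving it out falls back to the process environment.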

If you need something that just works, you can take this example as is (minor changes will probably be required) and slap it into your codebase.

If you have any specific questions, feel free to ask.

marvinbuss commented 10 months ago

In case someone looks for an e2e sample, I have implemented this in a similar manner in this repository with opentelemetry and fastapi: https://github.com/PerfectThymeTech/AzureFunctionPython/blob/main/code/function/wrapper/__init__.py

The difference is that I have included the opentelemetry setup in the fastapi app startup instead of using an Azure Function Extension. I prefer that approach since some instrumentations require specific configurations (tracer_provider, metrics_provider, etc.). [Image: the data captured in Application Insights]

BigMorty commented 10 months ago

@macieyng - Hello, I am a PM on the Azure Functions team and would like to chat with you about using OTel in your Azure Functions. If you are willing to chat, please email me at mikemort at microsoft dot com.

macieyng commented 10 months ago

☝️ FYI We got in touch.

lzchen commented 10 months ago

@BigMorty

We have plans to improve our OpenTelemetry-based monitoring solution in Azure Functions this upcoming semester. Feel free to reach out if you would like more context about this.