Azure / azure-functions-python-worker

Python worker for Azure Functions.
http://aka.ms/azurefunctions
MIT License

[BUG] Open Telemetry integration with Azure Functions - unnecessary telemetry sent #1342

karpikpl opened this issue 8 months ago

karpikpl commented 8 months ago

When the function is configured with OpenTelemetry using:

from azure.monitor.opentelemetry import configure_azure_monitor
configure_azure_monitor()

duplicated (4x) logs are sent to Application Insights. It seems like using the APPLICATIONINSIGHTS_CONNECTION_STRING environment variable is not supported, because it configures both Application Insights and OpenTelemetry?

When configured with:

configure_azure_monitor(connection_string=os.environ.get("APPINSIGHTS"))

it seems better, but hundreds of extra logs are sent to Application Insights (see the attached screenshot), where every call the exporter makes to Application Insights is itself instrumented and logged back into Application Insights.
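For context, a minimal local.settings.json sketch of where the two settings referenced above would live. Only the setting names APPLICATIONINSIGHTS_CONNECTION_STRING and APPINSIGHTS come from the snippets above; FUNCTIONS_WORKER_RUNTIME and the placeholder values are assumptions for illustration.

{
  "IsEncrypted": false,
  "Values": {
    "FUNCTIONS_WORKER_RUNTIME": "python",
    "APPLICATIONINSIGHTS_CONNECTION_STRING": "InstrumentationKey=<placeholder>;IngestionEndpoint=<placeholder>",
    "APPINSIGHTS": "InstrumentationKey=<placeholder>;IngestionEndpoint=<placeholder>"
  }
}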


Repro steps

from logging import getLogger
import os

import azure.functions as func
from azure.monitor.opentelemetry import configure_azure_monitor
from opentelemetry import trace
from opentelemetry.trace import SpanKind
import requests

from shared_code import helper

# Set up logging / telemetry export to Application Insights
configure_azure_monitor(connection_string=os.environ.get("APPINSIGHTS"))

# Get a tracer for the current module.
tracer = trace.get_tracer(__name__)
logger = getLogger(__name__)

def main(req: func.HttpRequest) -> func.HttpResponse:
    """Http trigger function example using OpenTelemetry with Azure Monitor."""
    logger.info("logger.info: Python HTTP trigger function processed a request.")

    name = req.params.get("name")
    if not name:
        try:
            req_body = req.get_json()
        except ValueError:
            pass
        else:
            name = req_body.get("name")

    if not name:
        name = "no-name"

    # Start a new span with the name "hello".
    # This also sets this created span as the current span in this context.
    # This span will be exported to Azure Monitor as part of the trace.
    with tracer.start_as_current_span(name, kind=SpanKind.SERVER) as span:
        # log a message before making an HTTP call
        span.add_event("Making an HTTP call event")
        logger.info("Making an HTTP call")
        number = helper.double(2)
        url = os.environ["URL"]
        logger.warning("got a number!", extra={"number": number})

        response = requests.get(url, timeout=5)
        json_data = response.json()
        # log a message after making an HTTP call with URL in extra properties
        logger.info("HTTP call complete", extra={"url": url})

        title = json_data["title"]

        return func.HttpResponse(f"number: {number}. Title: {title}")

host.json:

{
  "version": "2.0",
  "logging": {
    // "applicationInsights": {
    //   "samplingSettings": {
    //     "isEnabled": false,
    //     "excludedTypes": "Request"
    //   }
    // },
    "logLevel": {
      "default": "Warning",
      "Host.Results": "Information",
      "Function": "Information",
      "Host": "Error",
      "Microsoft": "Error"
    }
  },
  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle",
    "version": "[3.*, 4.0.0)"
  },
  "extensions": {
    "serviceBus": {
      "messageHandlerOptions": {
        "maxConcurrentCalls": 16
      }
    }
  },
  "concurrency": {
    "dynamicConcurrencyEnabled": true,
    "snapshotPersistenceEnabled": true
  }
}

Expected behavior


Each log message is logged ONCE in Application Insights. No logs related to Application Insights uploads are visible, only application logs.

Actual behavior


Tons of unnecessary logs in Application Insights (see screenshot above).

Known workarounds


Not aware of any.

Contents of the requirements.txt file:

azure-functions==1.17.0
azure-identity==1.14.1
azure-mgmt-resource==23.0.1
azure-monitor-opentelemetry==1.0.0
msrestazure==0.6.4
#opentelemetry-instrumentation-requests==0.41b0
requests==2.31.0

lzchen commented 8 months ago

@karpikpl

Thanks for trying out the Azure Monitor OpenTelemetry Distro. You should only be using the APPLICATIONINSIGHTS_CONNECTION_STRING env var to configure your connection string. I will try to explain the 4x logs you are most likely seeing in the traces table.

The first thing to point out is that, for monitoring, the Functions worker already sends logging telemetry on its own, without the Azure Monitor SDK (the call to configure_azure_monitor()). There is currently a feature gap between the monitoring SDK and the Functions worker, so instrumenting with the SDK will send duplicate logs. Our recommendation to customers is to use the SDK alone, as it provides much richer telemetry and features than the built-in logging of the Functions worker. You can turn off the Azure Functions telemetry logger by clearing the list of handlers on the root logger.

import logging

root_logger = logging.getLogger()
for handler in root_logger.handlers[:]:
    root_logger.removeHandler(handler)

Be sure to call the above BEFORE any loggers are set up and before the call to configure_azure_monitor(). Eventually, when our monitoring story with Functions becomes more developed, we will disable the Azure Functions worker logger automatically, but for now this is the workaround to avoid duplicate logs.
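Putting that together, a minimal sketch of the module-level setup order this implies. It assumes APPLICATIONINSIGHTS_CONNECTION_STRING is set in the environment; the logger name is illustrative.

import logging

from azure.monitor.opentelemetry import configure_azure_monitor

# 1. Remove the handlers the Functions worker attached to the root logger,
#    so its built-in telemetry path no longer duplicates exported logs.
root_logger = logging.getLogger()
for handler in root_logger.handlers[:]:
    root_logger.removeHandler(handler)

# 2. Only then configure the Azure Monitor distro; with no arguments it reads
#    APPLICATIONINSIGHTS_CONNECTION_STRING from the environment.
configure_azure_monitor()

# 3. Create named loggers for application code after that.
logger = logging.getLogger(__name__)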

The second thing causing the other unwanted logs is that configure_azure_monitor() ALSO adds a telemetry handler to the ROOT logger by default. This means that logs sent with any logger will send telemetry to your Application Insights instance, including the INFO-level entries in your screenshot that come from the SDK itself (which we don't want to see because they are unrelated to your application). The simple solution is to use named loggers in your application (which you are already doing). We have implemented a way to specify which logger to configure when calling configure_azure_monitor() instead of the ROOT logger. This feature will be available in the next release (sometime next week). For now, a workaround is to set your logger level to WARNING and make your application-level calls a severity higher than INFO, so that the INFO logs emitted by the SDK itself won't be collected.
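A small sketch of that interim workaround, assuming the handler cleanup above has already run: raising the root level to WARNING keeps the SDK's own INFO logs out of Application Insights, while application messages at WARNING or above are still exported.

import logging

# Filter out INFO-level records (including the SDK's own export chatter)
# before they reach the root handler added by configure_azure_monitor().
logging.getLogger().setLevel(logging.WARNING)

logger = logging.getLogger(__name__)

# Application messages at WARNING or above are still collected.
logger.warning("HTTP call complete")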

karpikpl commented 8 months ago

The comment from Maciej about setting log levels also helped:

logging.getLogger("azure.core.pipeline.policies.http_logging_policy").setLevel(logging.WARNING)
logging.getLogger("azure.monitor.opentelemetry.exporter.export").setLevel(logging.WARNING)
macieyng commented 6 months ago

@karpikpl 🙇

greatvovan commented 2 months ago

I had the same issue and fixed it exactly as the author posted above.

To be honest, I don't think there is anything unexpected here, because the behaviour is consistent with Python's standard logging framework. It is not uncommon that you add a library to your project and suddenly get some alien logs from the library code. On the contrary, it would be unexpected if those logs were not sent to the log handler's destination; that would have made me really puzzled and willing to figure out why 😃

I think this issue might require documenting, and probably a parameter for disabling the internal logging or setting it to WARNING and above.

lzchen commented 2 months ago

@greatvovan

We have this documented here. Also, there is ongoing work to handle this automatically, without the user having to add this code.