census-instrumentation / opencensus-python

A stats collection and distributed tracing framework
Apache License 2.0

AzureLogHandler does not seem to work from Celery tasks #900

Open Bastien-Brd opened 4 years ago

Bastien-Brd commented 4 years ago
Python 3.5.2
Django 2.2.10
celery==4.2.2
opencensus==0.7.7
opencensus-context==0.1.1
opencensus-ext-azure==1.0.2
opencensus-ext-django==0.7.2
opencensus-ext-logging==0.1.0

After a long investigation and many experiments with various ways of configuring logging in Celery tasks and Django, I am fairly confident that, for some reason, the AzureLogHandler does not send any logs to Azure from a task running in a Celery worker.

To verify this, I set up Celery logging in my Django project by discarding the default Celery logging behaviour (the default is misleading because it overrides the root logger) and using the Django logging configuration (from settings.py) instead. In addition, I manually ensure that the celery logger has both a local file handler (so I capture every log message locally while debugging this issue) and an AzureLogHandler. You can see how this is configured in the code snippet below:

import os
import logging

from celery import Celery
from celery.signals import setup_logging
from opencensus.ext.azure.log_exporter import AzureLogHandler

from django.conf import settings

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "edgefolio.settings")

# Create the Celery app (called "api")
app = Celery('api')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)

def add_azure_log_handler_to_logger(logger, propagate=True):
    """
    Given a logger, add a AzureLogHandler to it
    :param logger:
    :param propagate:
    :return:
    """
    if settings.AZURE_APP_INSIGHTS_INSTR_KEY:
        formatter = logging.Formatter("[Celery/%(processName)s] %(message)s")
        # Azure Handler:
        azure_log_handler = AzureLogHandler(
            connection_string='InstrumentationKey=%s' % settings.AZURE_APP_INSIGHTS_INSTR_KEY)
        azure_log_handler.setFormatter(formatter)
        azure_log_handler.setLevel(logging.INFO)
        logger.addHandler(azure_log_handler)
        logger.setLevel(logging.INFO)
        logger.propagate = propagate
    return logger

@setup_logging.connect
def setup_loggers(*args, **kwargs):
    """
    Using the celery "setup_logging" signal to override and fully define the logging configuration for Celery
    :param args:
    :param kwargs:
    :return:
    """
    # Configure Celery logging from the Django settings' logging configuration
    from logging.config import dictConfig
    from django.conf import settings
    dictConfig(settings.LOGGING)

    # Test the root logger (configured in django settings to log to Azure as well)
    logger = logging.getLogger('root')
    logger.warning('TRYING LOGGING FROM [%s]' % logger.name)

    # Configure the Celery top level logger
    logger = logging.getLogger('celery')
    # Add a local file log handler to make sure we capture every message locally
    logger.addHandler(logging.FileHandler("/data/log/worker/importer.log"))
    # In addition, also manually add an AzureLogHandler to it (duplicating the root's handler)
    logger = add_azure_log_handler_to_logger(logger, propagate=False)
    # Log a test warning message
    logger.warning('TRYING LOGGING FROM [%s]' % logger.name)

    # Log a test warning message from a lower-level celery logger
    logger = logging.getLogger('celery.task')
    logger.warning('TRYING LOGGING FROM [%s]' % logger.name)

    # Log a test warning message from a specific django app task logger
    logger = logging.getLogger('etl.tasks')
    logger.warning('TRYING LOGGING FROM [%s]' % logger.name)

The result of this setup: log messages logged from the main Celery process are all sent to both the local file and to Azure Monitor without problems. However, log messages generated from inside the Celery tasks are written to the local file (proving that the logging configuration above is used in the tasks as well) but NOT to Azure Monitor. It appears only logs from the main Celery process are sent to Azure, and none from the Celery worker processes.

I wonder whether this has to do with the way AzureLogHandler is implemented (a background worker thread and similar machinery) and the way Celery spawns its worker processes.
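
If that turns out to be the cause, one workaround I have in mind (a sketch I have not verified yet, reusing the add_azure_log_handler_to_logger helper above) would be to re-attach the handler inside every worker process after it forks, via Celery's worker_process_init signal:

from celery.signals import worker_process_init

@worker_process_init.connect
def configure_worker_logging(**kwargs):
    # Runs inside each forked worker process (prefork pool), so the
    # AzureLogHandler and its background export thread are created after
    # the fork instead of being inherited from the parent process.
    add_azure_log_handler_to_logger(logging.getLogger('celery'), propagate=False)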

SanthoshMedide commented 4 years ago

The program runs on a Kubernetes cluster. Logs work locally but are not sent to Azure Application Insights.

----------------------------------logging.yaml--------------------------------------

version: 1
disable_existing_loggers: False
formatters:
  standard:
    format: "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
  error:
    format: "%(asctime)s- %(levelname)s <PID %(process)d:%(processName)s> %(name)s.%(funcName)s(): %(message)s"

handlers:
  console:
    class: logging.StreamHandler
    level: DEBUG
    formatter: standard
    stream: ext://sys.stdout

  null_handler:
    class: logging.NullHandler
    level: INFO

  info_file_handler:
    class: logging.handlers.RotatingFileHandler
    level: INFO
    formatter: standard
    filename: info.log
    maxBytes: 10485760 # 10MB
    backupCount: 20
    encoding: utf8

  error_file_handler:
    class: logging.handlers.RotatingFileHandler
    level: ERROR
    formatter: error
    filename: logs/errors.log
    maxBytes: 10485760 # 10MB
    backupCount: 20
    encoding: utf8

  azure_handler:
    class: opencensus.ext.azure.log_exporter.AzureLogHandler
    level: INFO
    formatter: standard
    backupCount: 20
    encoding: utf8
    connection_string: 'instrumentation_key_here'
    timeout: 30

loggers:
  data_repo:
    level: DEBUG
    handlers: [console, info_file_handler, error_file_handler, azure_handler]
    propagate: no
  data_controller:
    level: DEBUG
    handlers: [console, info_file_handler, error_file_handler, azure_handler]
    propagate: no

root:
  level: DEBUG
  handlers: [console, info_file_handler, error_file_handler, azure_handler]

-------------------------------setting the logging(set_logging.py)----------------------------

import os
import logging
import logging.config

import yaml

import cfg  # project settings module holding APP_INSIGHTS_CONNECTION

def setup_logging(
    default_path='logging.yaml',
    default_level=logging.DEBUG
):
    path = default_path
    print("set logging")
    if os.path.exists(path):
        with open(path, 'rt') as f:
            config = yaml.safe_load(f.read())
            # Inject the real connection string at runtime
            config['handlers']['azure_handler']['connection_string'] = cfg.APP_INSIGHTS_CONNECTION
        logging.config.dictConfig(config)
    else:
        logging.basicConfig(level=default_level)
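
For context, this is invoked once per process before any logging happens (the entry-point module name here is illustrative):

# main.py (illustrative entry point)
import logging
from set_logging import setup_logging

setup_logging()
logger = logging.getLogger('data_repo')
logger.info('logging configured')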

----------------------------my program python file---------------------------------------

import traceback
import logging

logger = logging.getLogger(__name__)

def run():
    logger.info("Run started")
    try:
        data = fetch_data()  # pseudocode in the original: "access data"
    except Exception as e:
        params = {}  # some contextual information in dict format (elided in the original)
        logger.error("Failed to pull the data required " + str(e), exc_info=True, extra=params)

------------------------Here is the possible ERROR-----------------------------------------


2020-06-14 02:00:43,118 - data_repo - ERROR - Failed to pull the data required 'NoneType' object is not subscriptable
2020-06-14 02:43:40,986 - azure.storage.common.storageclient - INFO - Client-Request-ID=c2393a78-ae0c-11ea-9f4c-a683e7151156 
Outgoing request: Method=HEAD, Path=, Query={'snapshot': None, 'timeout': None, 'restype': None}, Headers={'x-ms-version': '2018-03-28', 
'User-Agent': 'Azure-Storage/1.4.2-1.5.0 (Python CPython 3.6.0; Darwin 18.7.0)',
 'x-ms-client-request-id': 'c2393a78-ae0c-11ea-9f4c-a683e7151156', 'x-ms-date': 'Sun, 14 Jun 2020 07:00:40 GMT', 'Authorization': 'REDACTED'}.

All logs work when I run the application locally.
I am running the application on Kubernetes.
Logs from the main file are going to Application Insights (the main file runs on its own pod),
while logs from the rest of the files are not going to App Insights (those files are on different pods).
There is no config difference between these pods.

lzchen commented 4 years ago

@SanthoshMedide Can you open up a new issue for this? It seems to be unrelated to this thread.

lzchen commented 4 years ago

@SanthoshMedide Tracked by: https://github.com/census-instrumentation/opencensus-python/issues/908

CBurini commented 3 years ago

@lzchen I'm not sure why this issue was closed. We've encountered the same bug as @Bastien-Brd in two different projects.

We use the latest version of the SDK (1.0.4). Logs emitted by Celery tasks are not sent to Azure (but are correctly handled by the other log handlers). Logs emitted elsewhere in the project, or by the main Celery process, are correctly sent to Azure.

trevorphillipscoding commented 3 years ago

AzureLogHandler doesn't work with celery for me either. Any update on this?

ubikusss commented 3 years ago

@Bastien-Brd did you find a work-around at least?

jesseinit commented 3 years ago

Good to know I wasn't the only one facing this. Seems there is no work-around yet.

GuerreiroLeonardo commented 3 years ago

Having the same issue

Nikash commented 2 years ago

Running into a similar issue with a child process spawned using multiprocessing.Process.

Our logger is configured to log to a StreamHandler and an AzureLogHandler. Logs are correctly sent to both handlers (and show up in Azure Monitor) from the main process. But logs emitted through the same logger from within the child process only show up on STDOUT and not in Azure Monitor.

This leads me to suspect the issue is generic to multiprocessing and not specific to Celery tasks.
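
A minimal sketch of the setup I mean (the connection string and names are illustrative):

import logging
import multiprocessing

from opencensus.ext.azure.log_exporter import AzureLogHandler

logger = logging.getLogger('app')
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler())
logger.addHandler(AzureLogHandler(connection_string='InstrumentationKey=...'))

def work():
    # Shows up on STDOUT but never reaches Azure Monitor
    logger.info('hello from the child process')

if __name__ == '__main__':
    # Shows up on STDOUT and in Azure Monitor
    logger.info('hello from the main process')
    p = multiprocessing.Process(target=work)
    p.start()
    p.join()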

br0cky commented 2 years ago

Also having same issue, did anyone find a solution? Thanks

KonradAdamczyk commented 2 years ago

I'm also facing this issue. Any update?

rockTA commented 2 years ago

Having the same problem outside Celery, but with a child process spawned using multiprocessing.Process. Could also be related to this issue: https://github.com/census-instrumentation/opencensus-python/issues/928

rockTA commented 2 years ago

For anyone trying to get this to work, a slightly remodeled version of this did it for me as a workaround (obviously using AzureLogHandler instead of RotatingFileHandler).
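
In essence, the idea looks roughly like this (a sketch of the cookbook's QueueListener pattern with AzureLogHandler swapped in; the connection string and levels are illustrative, and details will differ per project):

import logging
import multiprocessing
from logging.handlers import QueueHandler, QueueListener

from opencensus.ext.azure.log_exporter import AzureLogHandler

queue = multiprocessing.Queue(-1)

# Only the main process talks to Azure: a QueueListener drains the queue
# and hands every record to the AzureLogHandler.
azure_handler = AzureLogHandler(connection_string='InstrumentationKey=...')
listener = QueueListener(queue, azure_handler)
listener.start()

# All processes (main and children) log through a QueueHandler that just
# puts records onto the shared queue.
root = logging.getLogger()
root.setLevel(logging.INFO)
root.addHandler(QueueHandler(queue))

# Call listener.stop() at shutdown to flush pending records.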

ManashJKonwar commented 2 years ago

@rockTA Would you mind sharing the remodeled version of AzureLogHandler? I tried exploring the setup example you provided, however the logs only get printed to the console and are not sent to Azure App Insights. I have tried using AzureLogHandler and AzureEventHandler as well. I'm hitting a roadblock here. Can you please share your solution? It would surely help a lot.

Ceglowa commented 2 years ago

I have the same problem with AzureLogHandler. The Celery workers are outputting the logs, but they are not sent to Azure Application Insights. Did anyone find a solution? Thanks in advance.

greatvovan commented 1 year ago

Probably caused by #1190. I would appreciate it if someone tried the patch and reported back.

ManashJKonwar commented 1 year ago

For me @Ceglowa, we solved it by pinning a previous version of opencensus-ext-azure (opencensus-ext-azure==1.1.3, https://pypi.org/project/opencensus-ext-azure/1.1.3/). We ran into the same problem with version 1.1.4, where the logs were written by the handler but never transferred to Azure App Insights. When we dug into the issue, we found it was caused by improper thread closure on the logger side. The patch referred to by @greatvovan would be the substantial, long-term solution, but it's up to the opencensus team to take it into account, and our work cannot wait that long. You may also try the latest version; I have not experimented with it yet.
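
In other words, pinning it in requirements.txt:

opencensus-ext-azure==1.1.3  # 1.1.4 exhibited the missing-logs problem for us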

jackeverydayzero commented 1 year ago

Also facing this issue.

kevinco26 commented 10 months ago

Probably caused by #1190. I would appreciate it if someone tried the patch and reported back.

I tried your solution but it didn't work :( I still don't see the logs in Application Insights, @greatvovan.

Saugat168 commented 1 week ago

For anyone trying to get this to work, a slightly remodeled version of this did it for me as a workaround (obviously using AzureLogHandler instead of RotatingFileHandler).

This worked for me (as it did for @rockTA), using the latest opencensus package! This method is also described in the Python logging cookbook: https://docs.python.org/3/howto/logging-cookbook.html#logging-to-a-single-file-from-multiple-processes. Basically, you start the listener and then configure the worker on Celery's setup_logging signal:

@setup_logging.connect
def on_setup_logging(**kwargs):
    worker_configurer(queue=queue)
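
Here worker_configurer follows the cookbook pattern and simply points the worker's root logger at the shared queue (a sketch; the queue object, handler cleanup and level are assumptions from my setup):

import logging
from logging.handlers import QueueHandler

def worker_configurer(queue):
    # Worker processes only get a QueueHandler; the QueueListener in the
    # main process owns the AzureLogHandler and does the actual sending.
    root = logging.getLogger()
    root.handlers = []  # drop any handlers inherited from the parent process
    root.addHandler(QueueHandler(queue))
    root.setLevel(logging.INFO)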