census-instrumentation / opencensus-python

A stats collection and distributed tracing framework
Apache License 2.0
669 stars 250 forks source link

Django server is not running with AzureLogHandler #904

Closed RomainDLS closed 3 years ago

RomainDLS commented 4 years ago

Describe your environment.

Python 3.7.6
Django==2.2.13
django-cors-headers==3.2.0
django-extensions==2.2.5
django-filter==2.2.0
django-rest-swagger==0.3.10
django-sslserver==0.22
django-treebeard==4.3
djangorestframework==3.10.3
djangorestframework-csv==2.1.0
djangorestframework-jwt==1.11.0

opencensus==0.7.7
opencensus-context==0.1.1
opencensus-ext-azure==1.0.2
opencensus-ext-django==0.7.2

Steps to reproduce. Add AzureLogHandler to Django LOGGING config in order to export all logs to Azure : Django LOGGING config :


LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'verbose': {
            'format': '%(asctime)s %(levelname).3s %(process)d %(name)s : %(message)s',
        },
        'simple': {
            'format': '%(asctime)s %(levelname)-7s : %(message)s',
        },
    },
    'filters': {
        'require_debug_false': {
            '()': 'django.utils.log.RequireDebugFalse',
        }
    },
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
            'formatter': 'verbose',
        },
        'azure': {
            'formatter': 'simple',
            'class': 'opencensus.ext.azure.log_exporter.AzureLogHandler',
            'connection_string': 'InstrumentationKey=XXXX-XXXX-XXXX-XXXX'
        },
        'mail_admins': {
            'level': 'ERROR',
            'filters': ['require_debug_false'],
            'class': 'django.utils.log.AdminEmailHandler'
        }
    },
    'loggers': {
        '': {
            'level': os.environ.get('LOGLEVEL', 'INFO'),
            'handlers': ['console', 'azure'],
        },
        'devdebug': {
            'handlers': ['console'],
            'level': 'INFO',
            'propagate': False,
        },
        'django': {
            'handlers': ['console', 'mail_admins'],
            'level': os.environ.get('LOGLEVEL', 'INFO'),
            'propagate': False,
        }
    },
}

What is the expected behavior? When running manage.py runsslserver localhost:53215, the server should run and must be accessible :

curl -v http://127.0.0.1:53215/
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 53215 (#0)
> GET / HTTP/1.1
> Host: 127.0.0.1:53215
> User-Agent: curl/7.58.0
> Accept: */*
> 
* Recv failure: Connection reset by peer
* stopped the pause stream!
* Closing connection 0
curl: (56) Recv failure: Connection reset by peer

What is the actual behavior? When running manage.py runsslserver localhost:53215, the server run (without error log) but is not accessible :

curl -v http://127.0.0.1:53215/
*   Trying 127.0.0.1...
* TCP_NODELAY set
* connect to 127.0.0.1 port 53215 failed: Connection refused
* Failed to connect to 127.0.0.1 port 53215: Connection refused
* Closing connection 0
curl: (7) Failed to connect to 127.0.0.1 port 53215: Connection refused
lzchen commented 4 years ago

@RomainDLS If you remove the AzureLogHandler do you still get the same errors?

RomainDLS commented 4 years ago

No, without AzureHandler the server works.

lzchen commented 4 years ago

@RomainDLS I ran your code exactly with a very simple django app and it works for me. I am able to hit localhost:53215 and it displays the page correctly, and sends the related telemetry. Are you able to share your django app?

EDIT: I am also hitting the endpoint using the browser.

RomainDLS commented 4 years ago

@Izchen Thanks for your answer. Unfortunately I am not able to share you my django app as it isn't mine. However I can give more information about my environnement : We are using cloudant (NoSQL database), celery (message queue system). Here is my full requirements :

# Base Django
dj-database-url==0.5.0
django-extensions==2.2.5
django-filter==2.2.0
django-treebeard==4.3.0
django==2.2.13
djangorestframework-csv==2.1.0
djangorestframework==3.10.*

# Services (celery & cloudant)
celery==v4.4.0
cloudant==2.12.0
kombu==4.6.7
paho-mqtt==1.5.0
paramiko==2.7.1

# Utils
adal==1.2.2
construct==2.8.12
dicttoxml==1.7.4
gevent==1.4.0
isodate==0.6.0
jsonschema==3.2.0
numpy==1.18.0
pandas==0.25.3
pdfkit==0.6.1
psycopg2-binary==2.8.4
PyJWT==1.7.1
python-dateutil==2.8.1
pytz==2019.3
requests-oauthlib==1.3.0
requests==2.22.0
schema==0.7.1
seaborn==0.9.0
simplejson==3.17.0
twilio==6.35.0
Unidecode==1.1.1
zeep==3.4.0
croniter==0.3.23
cryptography==2.8
python3-saml==1.9.0
whitenoise==4.1.4

# Server
django-cors-headers==3.2.0
drf-yasg==1.17.0
drf_tweaks==0.9.3
djangorestframework_jwt==1.*
gunicorn==20.0.4
social-auth-app-django==3.1.0
social-auth-core==3.2.0
lzchen commented 4 years ago

@RomainDLS It will be difficult for me to help you without being able to see the actual executable code. My suggestion is to either start with the most basic project and gradually add the libraries that you are using, or iteratively remove the libraries you are using from your application and hit the endpoint consistently to see which library is the culprit.

EDIT: Also it could also just be some configuration in your Django app itself. It is hard to tell without having access to it.

kstreith commented 4 years ago

I'm having the same issue. The problem occurs on Python 3.7, specifically using the following Docker image (python:3.7-stretch). If we go back to the Python 3.6 version of the same image we don't have the problem. If we run either version of Python on Windows, we also don't have the problem. Hopefully this helps @lzchen . @RomainDLS can you provide operation system details for your project?

CBurini commented 4 years ago

@lzchen I can confirm there's an issue with the AzureLogHandler when using Django's runserver. We had the same problem within a project. I've been able to reproduce the issue with the most basic django project (./manage.py startproject) with:

requirements.txt:

Django==2.2.14
opencensus-ext-azure==1.0.4

logging config in project settings:

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        'azure': {
            'level': 'INFO',
            'class': 'opencensus.ext.azure.log_exporter.AzureLogHandler',
            'instrumentation_key': '....',
        },
    },
    'loggers': {
        'django': {
            'handlers': ['azure'],
            'level': 'INFO',
        },
    }
}

With that, if you launch the runserver (./manage.py runserver), the runserver will hang forever, and not respond to anything.

I've tried debugging a bit, it seems it's stuck when trying to acquire a lock in the logging lib. Here's the stacktrace:

File "/usr/lib/python3.8/threading.py", line 890, in _bootstrap
  self._bootstrap_inner()
File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
  self.run()
File "/home/cesar/Documents/Dev/Test/env/lib/python3.8/site-packages/opencensus/ext/azure/log_exporter/__init__.py", line 118, in run
  dst._export(batch[:-1], event=batch[-1])
File "/home/cesar/Documents/Dev/Test/env/lib/python3.8/site-packages/opencensus/ext/azure/log_exporter/__init__.py", line 67, in _export
  result = self._transmit(envelopes)
File "/home/cesar/Documents/Dev/Test/env/lib/python3.8/site-packages/opencensus/ext/azure/common/transport.py", line 47, in _transmit
  response = requests.post(
File "/home/cesar/Documents/Dev/Test/env/lib/python3.8/site-packages/requests/api.py", line 119, in post
  return request('post', url, data=data, json=json, **kwargs)
File "/home/cesar/Documents/Dev/Test/env/lib/python3.8/site-packages/requests/api.py", line 61, in request
  return session.request(method=method, url=url, **kwargs)
File "/home/cesar/Documents/Dev/Test/env/lib/python3.8/site-packages/requests/sessions.py", line 530, in request
  resp = self.send(prep, **send_kwargs)
File "/home/cesar/Documents/Dev/Test/env/lib/python3.8/site-packages/requests/sessions.py", line 643, in send
  r = adapter.send(request, **kwargs)
File "/home/cesar/Documents/Dev/Test/env/lib/python3.8/site-packages/requests/adapters.py", line 439, in send
  resp = conn.urlopen(
File "/home/cesar/Documents/Dev/Test/env/lib/python3.8/site-packages/urllib3/connectionpool.py", line 659, in urlopen
  conn = self._get_conn(timeout=pool_timeout)
File "/home/cesar/Documents/Dev/Test/env/lib/python3.8/site-packages/urllib3/connectionpool.py", line 279, in _get_conn
  return conn or self._new_conn()
File "/home/cesar/Documents/Dev/Test/env/lib/python3.8/site-packages/urllib3/connectionpool.py", line 939, in _new_conn
  log.debug(
File "/usr/lib/python3.8/logging/__init__.py", line 1428, in debug
  if self.isEnabledFor(DEBUG):
File "/usr/lib/python3.8/logging/__init__.py", line 1694, in isEnabledFor
  _acquireLock()

A few notes:

    'loggers': {
        'test': {
            'handlers': ['azure'],
            'level': 'INFO',
        },
    }

It will work again. Probably because Django's 'early' INFO logs in the runserver were not send to azure. But if you add a basic app, like polls in the Django tutorials, and put the following code in the init.py of the polls folder:

import logging
logging.getLogger('test').warning('hello')

The runserver will get stuck again. If you remove this, and put 'regular' logs in the polls app, it will be ok. It really looks like it's due to emitting logs to Azure during the initialization of Django's runserver.

I hope this helps !

lzchen commented 4 years ago

@CBurini Thanks for looking into this! Curious, when you say runserver hangs, does that mean after executing python manage.py runserver the command line does not display anything?

Also I'm wondering if this is also the cause of the original issue. @RomainDLS are you getting your django server hanging as well? Or are you simply not able to hit the endpoint with manage.py runsslserver localhost:53215 ? If different, it might be a a different issue.

lzchen commented 4 years ago

@CBurini I am also curious if this issue is still present for you if you instrument with the AzureLogHandler directly in your code, instead of through django's settings?

lzchen commented 4 years ago

@CBurini I was able to reproduce the bug, thanks to your help. It does seem to be caused by the Django server using the root logger to log messages such as Watching for file changes with StatReloader and the reloading prompts. With this being the case, may I suggest using a custom logger created that can act as a "root" logger? Simply configure it with a specific name and always use that when accessing a logger in your python scripts.

    'loggers': {
        'test': {
            'handlers': ['azure'],
            'level': 'INFO',
        },
    }

And then in your code

logger = logging.getLogger("test")

CBurini commented 4 years ago

Thanks for looking at it! Indeed, that's the kind of the temporary trick I had to use to avoid the bug (= preventing Django's runserver's logs to be handled by Azure's handler). I actually used Django's filter django.utils.log.RequireDebugFalse on the Azure handler:

The logging dict looks like this:

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'filters': {
        'require_debug_false': {
            '()': 'django.utils.log.RequireDebugFalse',
        },
    },
    'handlers': {
        'console': {
            'level': 'DEBUG',
            'class': 'logging.StreamHandler',
        },
        'azure': {
            'level': 'INFO',
            'class': 'opencensus.ext.azure.log_exporter.AzureLogHandler',
            'instrumentation_key': <...>,
            'filters': ['require_debug_false']
        },
    },
    'root': {
        'handlers': ['console', 'azure'],
        'level': 'WARNING',
    },
    'loggers': {
        'project': {
            'handlers': ['console', 'azure'],
            'level': 'DEBUG',
            'propagate': False,
        },
        'django': {
            'handlers': ['console', 'azure'],
            'level': 'INFO',
            'propagate': False,
        },
    }
}
Shivappa commented 3 years ago

Even I am facing same issue. Any update on this ticket ?

If I remove 'azure' from root logger then logging works but my application logs are not exporting to Azure monitor. If I use 'log_exporter' in logging.getLogger('log_exporter'), only I get that module logs but other thirdparty logs will not be dumped.


   'loggers': {
        '': {
            'handlers': [
                'azure', 
                'file',
            ],
        },
        'log_exporter': { # <-- log_exporter  logger
            "handlers": [
                'azure',
                ],
            'propagate': False,
        },
lzchen commented 3 years ago

@Shivappa Have you tried setting django.utils.log.RequireDebugFalse as mentioned above?

Shivappa commented 3 years ago

@lzchen
Thanks for the suggestion. Setting django.utils.log.RequireDebugFalse makes django server runs fine. But my logs are not exporting to azure app insights. Any other settings I need to make.

    'handlers': {
        'azure':{
            'level': 'INFO',
            'class': 'opencensus.ext.azure.log_exporter.AzureLogHandler',
            "instrumentation_key": INSTRUMENTATION_KEY,
            'formatter': 'verbose',
            'filters': ['require_debug_false']
        },
    },

    'loggers': {
        '': {
            'handlers': [
                'azure', 
                'file',
            ],
            'propagate': False,
        },
lzchen commented 3 years ago

@Shivappa Do you have a code snippet you can share?

Shivappa commented 3 years ago

@lzchen Here is my code snippet. ---settings.py---

MIDDLEWARE = [
 ...........
    'opencensus.ext.django.middleware.OpencensusMiddleware',
]
LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'filters': {
        "require_debug_false": {
            '()': 'django.utils.log.RequireDebugFalse',
        },
    },
    'formatters': {
        'verbose': {
            'format': '[%(asctime)s] %(levelname)s [%(name)s.%(funcName)s:%(lineno)d] %(message)s',
        'datefmt': '%Y-%m-%d %H:%M:%S'
        },
    },
    'handlers': {
        'file': {
            'level': 'INFO',
            'class': 'logging.FileHandler',
            'filename': os.path.join(BASE_DIR, 'logs', 'app.log'),
            'formatter': 'verbose'
        },
        'azure':{
            'level': 'INFO',
            'class': 'opencensus.ext.azure.log_exporter.AzureLogHandler',
            "instrumentation_key": INSTRUMENTATION_KEY,
            'formatter': 'verbose',
            'filters': ['require_debug_false']
        },
    },
    'loggers': {
        '': {
            'handlers': [
                'azure', 
                'file',
            ],
            'propagate': False,
        },
        'exporter': {
            "handlers": [
                'azure',
                ],
            'propagate': False,
        },
        'django': { # <-- django app logger
            'handlers': [
                'file',
                'azure',
                ],
            # 'propagate': False,
        },
        'py.warnings': {
            'handlers': [
                'file',
                'azure',
                ],
            'propagate': False,
        },
    },
}
sampler = 'opencensus.trace.samplers.ProbabilitySampler(rate=1.0)'
OPENCENSUS = {
    'TRACE': {
        'SAMPLER': sampler,
        'EXPORTER': "opencensus.ext.azure.trace_exporter.AzureExporter(connection_string='"
        + APPLICATIONINSIGHTS_CONNECTION_STRING + "')",
    }
}

---views.py---

logger = logging.getLogger(__name__)
logger.info('test app insight logging')
lzchen commented 3 years ago

You are able to see the logs resulting from your FileHandler?

Can we try to isolate the issue? Please remove everything not related to logging with the AzureLogHandler in settings.py and see if that works.

Shivappa commented 3 years ago

@lzchen Yes, I am able to see the logs in app.log after removing the AzureLogHandler in settings.py Below is the snippet for file logging.

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'filters': {
        "require_debug_false": {
            '()': 'django.utils.log.RequireDebugFalse',
        },
        'require_debug_true': {
            '()': 'django.utils.log.RequireDebugTrue',
        },
    },
    'formatters': {
        'simple': {
            'format': '[%(asctime)s] %(levelname)s %(message)s',
        'datefmt': '%Y-%m-%d %H:%M:%S'
        },
        'verbose': {
            'format': '[%(asctime)s] %(levelname)s [%(name)s.%(funcName)s:%(lineno)d] %(message)s',
        'datefmt': '%Y-%m-%d %H:%M:%S'
        },
    },

    'handlers': {
        'file': {
            'level': 'INFO',
            'class': 'logging.FileHandler',
            'filename': os.path.join(BASE_DIR, 'logs', 'app.log'),
            'formatter': 'verbose'
        },
    },
    'loggers': {
        '': {
            'handlers': [
                'file',
            ],
            'level': 'INFO',
        },
    },
}
lzchen commented 3 years ago

@Shivappa I've been doing some testing and the require_debug_false filters have not been working for me. Try removing the filters and see what happens.

sdementen commented 3 years ago

any update on this ? also facing the issue of django runserver blocked

lzchen commented 3 years ago

@sdementen Are you facing the exact same symptoms as above?

sdementen commented 3 years ago

Yes, the server is not responding to requests from the browser (python 3.8 on windows with Django runserver)

lzchen commented 3 years ago

@sdementen Have you tried some of the workarounds listed above?

sdementen commented 3 years ago

I want my local server to send logs to azure too. I haven't seen a workaround for that...

lzchen commented 3 years ago

@sdementen I believe the above workarounds are run using a local server.

sdementen commented 3 years ago

I only see a fix with the filter require_debug_false which would then avoid sending logs in debug mode while I want to send them in debug mode. but I may miss another workaround...

lzchen commented 3 years ago

@sdementen It's difficult to know exactly what your use case is without a full description. Can you open up another issue?

lzchen commented 3 years ago

Closing this thread due to inactivity from previous conversations.

sdementen commented 3 years ago

I have the exact same issue as the one described here and for which there is no solution. I have switched back to the old appinsights SDK which works without issues. Anyway, thank you for your support on this problem

tibnor commented 3 years ago

My workaround is to set 'disable_existing_loggers': True,

There seems to be a deadlock. Where azureloghandler sends an message with requests, which uses urllib3. Urllib3 tries to log this event, but when checking logging level it waits on the lock of logging to be released. At the same time the main thread is configuring logging and it locks during the close of previous loggers.