aio-libs / aiohttp

Asynchronous HTTP client/server framework for asyncio and Python
https://docs.aiohttp.org
Other
15.05k stars 2.01k forks source link

Gunicorn and StatsD metrics #3777

Open Gr1N opened 5 years ago

Gr1N commented 5 years ago

Long story short

Hi,

Gunicorn provides an optional instrumentation of the arbiter and workers using the StatsD protocol. Details can be found at https://docs.gunicorn.org/en/latest/instrumentation.html

But there is no way to get metrics because we need to use worker class from AIOHTTP (aiohttp.GunicornWebWorker) and it does not use access method of the logger (https://github.com/benoitc/gunicorn/blob/master/gunicorn/instrument/statsd.py#L90). It's uses info (https://github.com/aio-libs/aiohttp/blob/master/aiohttp/web_log.py#L233) for access logs.

I'm very interested in that feature and I like to help with development. I see two ways of implementation:

  1. Somehow refactor current code-base to use access method of the logger.
  2. Provide optional instrumentation logic using the StatsD protocol in aiohttp.GunicornWebWorker, something like Gunicorn already have.

Expected behaviour

Run application using Gunicorn and specify StatsD configuration parameters:

$ gunicorn -w 4 -b "0.0.0.0:4000" --worker-class=aiohttp.GunicornWebWorker --statsd-host="0.0.0.0:9125" --statsd-prefix=aiohttp app_aiohttp:get_application

Check StastD logs and see metrics with names gunicorn.requests and gunicorn.request.duration.

Actual behaviour

Run application using Gunicorn and specify StatsD configuration parameters:

$ gunicorn -w 4 -b "0.0.0.0:4000" --worker-class=aiohttp.GunicornWebWorker --statsd-host="0.0.0.0:9125" --statsd-prefix=aiohttp app_aiohttp:get_application

Check StastD logs. You can see metric with name gunicorn.workers, but not with names gunicorn.requests and gunicorn.request.duration.

Your environment

MacOS 10.14.4 aiohttp 3.5.4 gunicorn 19.9.0

asvetlov commented 5 years ago

Sorry, I don't know gunicorn instrumentation details deep enough.

What method is better?

Gr1N commented 5 years ago

To my mind, a huge refactoring is always a bad idea, so option one sounds bad to me. Also, I don't think that AIOHTTP wants to have StatsD related logic inside the project.

But I did more research on code-base and I think that a better solution for me is to create my own worker class based on aiohttp.GunicornWebWorker and pass extended logger class with all StatsD logic I need. And it looks simple, only I need to do is to pass access_logger_class into AppRunner. But I don't want to copy whole _run method just for passing access_logger_class and if it's possible to create a new method named _create_app_runner it will solve all my problems I think.

@asvetlov what do you think?

asvetlov commented 5 years ago

If what we need is calling gunicorn logger.access() for access log -- let's create own version of AbstractAccessLogger in worker.py and use it for gunicorn workers. Can it work?

Gr1N commented 5 years ago

I guess yes. I'll try this option.

Gr1N commented 5 years ago

I tried and I don't like it, at least because I can't find a way how to make a backward compatible change.

I still thinking about a small change, something like:

diff --git a/aiohttp/worker.py b/aiohttp/worker.py
index 73ba6e38..ee96a483 100644
--- a/aiohttp/worker.py
+++ b/aiohttp/worker.py
@@ -6,13 +6,14 @@ import re
 import signal
 import sys
 from types import FrameType
-from typing import Any, Awaitable, Callable, Optional, Union  # noqa
+from typing import Any, Awaitable, Callable, Optional, Type, Union  # noqa

 from gunicorn.config import AccessLogFormat as GunicornAccessLogFormat
 from gunicorn.workers import base

 from aiohttp import web

+from .abc import AbstractAccessLogger
 from .helpers import set_result
 from .web_app import Application
 from .web_log import AccessLogger
@@ -34,6 +35,7 @@ class GunicornWebWorker(base.Worker):

     DEFAULT_AIOHTTP_LOG_FORMAT = AccessLogger.LOG_FORMAT
     DEFAULT_GUNICORN_LOG_FORMAT = GunicornAccessLogFormat.default
+    DEFAULT_ACCESS_LOGGER_CLASS = AccessLogger  # type: Type[AbstractAccessLogger]

     def __init__(self, *args: Any, **kw: Any) -> None:  # pragma: no cover
         super().__init__(*args, **kw)
@@ -78,6 +80,7 @@ class GunicornWebWorker(base.Worker):
                                logger=self.log,
                                keepalive_timeout=self.cfg.keepalive,
                                access_log=access_log,
+                               access_log_class=self.DEFAULT_ACCESS_LOGGER_CLASS,
                                access_log_format=self._get_valid_log_format(
                                    self.cfg.access_log_format))
         await runner.setup()

With that change we can easylly create our own worker class and pass any compatible logger class with all logic we need. @asvetlov what do you think?

asvetlov commented 5 years ago

I think we have a bad design decision in access logging support :) Take a look at #3795 for the problem description and proposed solution

Gr1N commented 5 years ago

Look interesting, but what about cases with Gunicorn workers? As I see there is no simple way to pass custom logger class or somehow override access logger in case of Gunicorn workers, for now, and I can't see anything about that in the proposed solution. Or maybe I don't know or don't understand something?

asvetlov commented 5 years ago

gunicorn workers use web.AppRunner that can be configured to pass a custom access logger class.

3795 is about adding the ability to pass not a class but the class instance.

Gr1N commented 5 years ago

Yes, I can pass logger class into web.AppRunner, but in the case of Gunicorn workers I can do it only by copying a huge amount of code because web.AppRunner initialized in the middle of _run method. And earlier I proposed a simple (maybe bad) solution on how to fix that.

LeartS commented 4 years ago

Hey people, any news on this? I'm having the issue reported here of statsd requests metric not being sent when using aiohttp.GunicornWebWorker

Gr1N commented 4 years ago

@LeartS maybe I can help you in some way. I wrote a simple instrumentation library (https://github.com/Gr1N/aiodogstatsd) and middleware for AIOHTTP for metrics collection on the application level. It's not the same functionality but pretty similar.