bentoml / BentoML

The easiest way to serve AI apps and models - Build reliable Inference APIs, LLM apps, Multi-model chains, RAG service, and much more!
https://bentoml.com
Apache License 2.0
7.07k stars 783 forks source link

feature: Allow "backupCount" to be specified in environment variable for the default monitoring log's TimedRotatingFileHandler #4384

Open rcorujo opened 9 months ago

rcorujo commented 9 months ago

Feature request

The default log configuration file in ./src/bentoml/_internal/monitoring/default.py uses a TimedRotatingFileHandler which specifies a daily rotation (i.e., when: "D").

DEFAULT_CONFIG_YAML = """
version: 1
disable_existing_loggers: false
loggers:
  bentoml_monitor_data:
    level: INFO
    handlers: [bentoml_monitor_data]
    propagate: false
  bentoml_monitor_schema:
    level: INFO
    handlers: [bentoml_monitor_schema]
    propagate: false
handlers:
  bentoml_monitor_data:
    class: logging.handlers.TimedRotatingFileHandler
    level: INFO
    formatter: bentoml_json
    filename: '{data_filename}'
    when: "D"
  bentoml_monitor_schema:
    class: logging.handlers.RotatingFileHandler
    level: INFO
    formatter: bentoml_json
    filename: '{schema_filename}'
formatters:
  bentoml_json:
    class: pythonjsonlogger.jsonlogger.JsonFormatter
    format: "()"
    validate: false
"""

While this will rotate the log file daily, it will keep the old rotated files forever, which may eventually fill up the disk.

Since the BentoML default log configuration does not specify a backupCount, then, according to the Python logging documentation for TimedRotatingFileHandler (logging.handlers — Logging handlers ), the default value for backupCount is 0.

The Python logging documentation for RotatingFileHandler (logging.handlers — Logging handlers ) says that when backupCount is 0, then the rollover never occurs.

You can use the maxBytes and backupCount values to allow the file to rollover at a predetermined size. When the size is about to be exceeded, the file is closed and a new file is silently opened for output. Rollover occurs whenever the current log file is nearly maxBytes in length; but if either of maxBytes or backupCount is zero, rollover never occurs, so you generally want to set backupCount to at least 1, and have a non-zero maxBytes. When backupCount is non-zero, the system will save old log files by appending the extensions ‘.1’, ‘.2’ etc., to the filename. For example, with a backupCount of 5 and a base file name of app.log, you would get app.log, app.log.1, app.log.2, up to app.log.5. The file being written to is always app.log. When this file is filled, it is closed and renamed to app.log.1, and if files app.log.1, app.log.2, etc. exist, then they are renamed to app.log.2, app.log.3 etc. respectively.

To illustrate, consider this Python script, which I'll call rotate_test.py. It rotates the file every second.

import logging
from logging.handlers import TimedRotatingFileHandler

import time

logHandler = TimedRotatingFileHandler("logfile",when="S",interval=1)
logFormatter = logging.Formatter('%(asctime)s %(message)s')
logHandler.setFormatter( logFormatter )
logger = logging.getLogger( 'MyLogger' )
logger.addHandler( logHandler )
logger.setLevel( logging.INFO )

for k in range(10):
    logger.info("Line %d" %  k)
    time.sleep(1)

We can see that when we run the script for 10 seconds, there are 10 log files.

$ ls
rotate_test.py

$ python rotate_test.py

$ ls -l
total 88
-rw-r--r--  1 rcorujo  staff   31 Jan  3 10:36 logfile
-rw-r--r--  1 rcorujo  staff   31 Jan  3 10:36 logfile.2024-01-03_10-36-29
-rw-r--r--  1 rcorujo  staff   31 Jan  3 10:36 logfile.2024-01-03_10-36-30
-rw-r--r--  1 rcorujo  staff   31 Jan  3 10:36 logfile.2024-01-03_10-36-31
-rw-r--r--  1 rcorujo  staff   31 Jan  3 10:36 logfile.2024-01-03_10-36-32
-rw-r--r--  1 rcorujo  staff   31 Jan  3 10:36 logfile.2024-01-03_10-36-33
-rw-r--r--  1 rcorujo  staff   31 Jan  3 10:36 logfile.2024-01-03_10-36-34
-rw-r--r--  1 rcorujo  staff   31 Jan  3 10:36 logfile.2024-01-03_10-36-35
-rw-r--r--  1 rcorujo  staff   31 Jan  3 10:36 logfile.2024-01-03_10-36-36
-rw-r--r--  1 rcorujo  staff   31 Jan  3 10:36 logfile.2024-01-03_10-36-37
-rw-r--r--  1 rcorujo  staff  428 Jan  3 10:34 rotate_test.py

Now let's add a backupCount=2 to TimedRotatingFileHandler.

import logging
from logging.handlers import TimedRotatingFileHandler

import time

logHandler = TimedRotatingFileHandler("logfile",when="S",interval=1,backupCount=2)
logFormatter = logging.Formatter('%(asctime)s %(message)s')
logHandler.setFormatter( logFormatter )
logger = logging.getLogger( 'MyLogger' )
logger.addHandler( logHandler )
logger.setLevel( logging.INFO )

for k in range(10):
    logger.info("Line %d" %  k)
    time.sleep(1)

We can see that when we run the script for 10 seconds, there are now only 3 log files.

$ rm logfile*
$ python rotate_test.py
$ ls -l
total 32
-rw-r--r--  1 rcorujo  staff   31 Jan  3 10:38 logfile
-rw-r--r--  1 rcorujo  staff   31 Jan  3 10:38 logfile.2024-01-03_10-38-53
-rw-r--r--  1 rcorujo  staff   31 Jan  3 10:38 logfile.2024-01-03_10-38-54
-rw-r--r--  1 rcorujo  staff  442 Jan  3 10:37 rotate_test.py

It would be nice to be able to specify the backupCount in an environment variable so that when the bento runs in a container, we can pass the environment variable to the container.

Motivation

When running a Kserve inference service using BentoML, it is desirable to not have to create a separate log configuration file to use in the container, and simply use the default values. Having the ability to specify additional parameters to TimedRotatingFileHandler in BentoML environment variables would only require Kserve to set the environment variable, and would eliminate the need for each user to create a custom configuration file to set additional parameters to TimedRotatingFileHandler.

The motivation is really not to run out of disk space in the container because the BentoML monitoring logs never get cleaned up. Hopefully, the log file rotation will play well with the promtail that is scraping the logs and sending them to loki.

Other

No response

aarnphm commented 9 months ago

@bojiang on this one