python / cpython

The Python programming language
https://www.python.org
Other
62.27k stars 29.92k forks source link

TimedRotatingFileHandler rotating on use not time #84649

Open c8859649-c2df-4077-a463-b240f3c1d016 opened 4 years ago

c8859649-c2df-4077-a463-b240f3c1d016 commented 4 years ago
BPO 40469
Nosy @vsajip, @stevendaprano, @martonivan
PRs
  • python/cpython#24660
  • Files
  • logging_config.ymal
  • app.py: Eg code to use the ymal file
  • config.json: JSON version of test config
  • log_40469.py: Modified test script
  • log_40469_new_instance.py
  • log_40469_single.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = None closed_at = None created_at = labels = ['3.8'] title = 'TimedRotatingFileHandler rotating on use not time' updated_at = user = 'https://bugs.python.org/markhallett' ``` bugs.python.org fields: ```python activity = actor = 'steven.daprano' assignee = 'none' closed = False closed_date = None closer = None components = [] creation = creator = 'markhallett' dependencies = [] files = ['49111', '49112', '50196', '50197', '50208', '50209'] hgrepos = [] issue_num = 40469 keywords = [] message_count = 11.0 messages = ['367849', '367922', '398684', '399306', '399481', '399496', '399499', '399500', '399506', '399531', '416458'] nosy_count = 4.0 nosy_names = ['vinay.sajip', 'steven.daprano', 'markhallett', 'martonivan'] pr_nums = ['24660'] priority = 'normal' resolution = None stage = 'patch review' status = 'open' superseder = None type = None url = 'https://bugs.python.org/issue40469' versions = ['Python 3.8'] ```

    c8859649-c2df-4077-a463-b240f3c1d016 commented 4 years ago

    Using the attached confing ymal file, the log file only rolls if the log file is not written to for over a minute, logging regulary (eg every 30s) prevents the correct rolling of the log.

    c8859649-c2df-4077-a463-b240f3c1d016 commented 4 years ago

    Eg code to run is now attached

    vsajip commented 3 years ago

    With this slightly modified test script and the JSON configuration based on the YAML, I got the following results after running for a while:

    $ more info.log*
    ::::::::::::::
    info.log
    ::::::::::::::
    2021-08-01 07:13:45 INFO Message no. 9
    2021-08-01 07:14:15 INFO Message no. 10
    ::::::::::::::
    info.log.2021-08-01_07-09
    ::::::::::::::
    2021-08-01 07:09:45 INFO Message no. 1
    2021-08-01 07:10:15 INFO Message no. 2
    ::::::::::::::
    info.log.2021-08-01_07-10
    ::::::::::::::
    2021-08-01 07:10:45 INFO Message no. 3
    2021-08-01 07:11:15 INFO Message no. 4
    ::::::::::::::
    info.log.2021-08-01_07-11
    ::::::::::::::
    2021-08-01 07:11:45 INFO Message no. 5
    2021-08-01 07:12:15 INFO Message no. 6
    ::::::::::::::
    info.log.2021-08-01_07-12
    ::::::::::::::
    2021-08-01 07:12:45 INFO Message no. 7
    2021-08-01 07:13:15 INFO Message no. 8

    The test script logs every 30 seconds, so I don't see what the problem is. Can you describe it more clearly?

    6ab1e8c7-7c37-41d4-b37d-f2ad1328ad90 commented 3 years ago

    The assumed behaviour of TimedRotatingFileHandler is to rotate log files older than configured. Even when the script is executed multiple times.

    self.rolloverAt (the variable that defines when the rollover should be performed) is set after each rollover and right after initializing the file handler instance.

    If the instance is initialized once (like in your script) and a rollover is performed without having the object destroyed, the logger works fine, the rollover is done and the next round is scheduled properly.

    The case is not so lucky if the script runs multiple time or the logger itself is initialized multiple times during one execution. In this case, since the MTIME is read each time when the TimedRotatingFileHandler's init is called, and the file is modified (by having a new line added to the end of the file). The next execution will read the new MTIME and will never perform any rollover.

    I've slightly modified your example script to demonstrate these use-cases.

    Example 1: Log a single line with the script, but execute it multiple times with delays between the execution! In bash: for i in {1..101}; do python log_40469_single.py $i; sleep 30; done

    Example 2: Log multiple lines, but reinitiate the logger object between the events! See log_40469_new_instance.py

    vsajip commented 3 years ago

    The assumed behaviour of TimedRotatingFileHandler is to rotate log files older than configured. Even when the script is executed multiple times.

    Ah, but is it? The purpose of TimedRotatingFileHandler is to rotate files based on time intervals, and is normally assumed to be for long-running processes (where the rotation allows one to focus on logs output during a period of interest). What's the practical, real-world use-case for the example scripts you added?

    6ab1e8c7-7c37-41d4-b37d-f2ad1328ad90 commented 3 years ago

    Well, there is nothing in the documentation that would say the logrotation can be only performed by continously running scripts, thats why. :)

    There are scripts or even daemons written in Python that need to be restarted from time to time. If the logrotate period is greater than restart period the log files will never be rotated, however they could be.

    We were looking for something that works like the logrotate do, but without involving any external logic, but the one written in python. And TimedLogrotateHandler appears to be one of those, unless it checks the MTIME of the files instead of the CTIME.

    What is the logic behind checking the MTIME instead of CTIME when one is calculating the age of a file when tries to calculate whether it has to be rotated or not? If the only purpose of this initial rollover calculation would be to start a predefined long period (as you suggest), now() (time of execution) should used instead and no file attribute should be involved. From the code I have to assume that the original author also tried to prepare for the cases where the execution is interrupted and/or restarted and (s)he wanted the TimedLogrotateHandler to be resumed where it was during the previous execution. Dnn't you agree?

    vsajip commented 3 years ago

    Well, there is nothing in the documentation that would say the logrotation can be only performed by continously running scripts, thats why. :)

    Well, what's the need to rotate based on time when you just run scripts sporadically or they're very short-lived? Not trying to argue, just trying to understand.

    There are scripts or even daemons written in Python that need to be restarted from time to time. If the logrotate period is greater than restart period the log files will never be rotated, however they could be.

    Yes, but generally daemons are long-lived relative to the rotation interval, aren't they? "From time to time" certainly implies that to me.

    What is the logic behind checking the MTIME instead of CTIME when one is calculating the age of a file when tries to calculate whether it has to be rotated or not?

    I can't remember, that code is over 12 years old and that specific decision wasn't documented :-(

    From the code I have to assume that the original author also tried to prepare for the cases where the execution is interrupted and/or restarted and (s)he wanted the TimedLogrotateHandler to be resumed where it was during the previous execution. Dnn't you agree?

    I take it you mean TimedRotatingFileHandler. Certainly it assumes that the script might be restarted, which is why it opens the file in append mode.

    If the computation using MTIME were changed to use CTIME, that would be a behaviour change, which could conceivably break someone's code that relied on current behaviour (unlikely, but you just never know). I've no other objection to using CTIME rather than MTIME - perhaps I'm just being hyper-cautious about breakage?

    vsajip commented 3 years ago

    I just checked - the relevant code (using MTIME) was added in 2010, as a fix for bpo-8117 - around the time that Python 2.6 was released.

    6ab1e8c7-7c37-41d4-b37d-f2ad1328ad90 commented 3 years ago

    Well, what's the need to rotate based on time when you just run scripts sporadically or they're very short-lived? Not trying to argue, just trying to understand. Yes, but generally daemons are long-lived relative to the rotation interval, aren't they? "From time to time" certainly implies that to me.

    In one of our use-cases, where I faced with this bug/behaviour of TimedRotatingFileHandler, we are using it in a daemon for daily logrotate. The aim is to have the logs separated day by day, just like the majority of our other components running on a linux. We've just realized that some days are missing from the series of logs and the investigation led to the service restart and the MTIME checking that we are talking about here. When a logger is configured to rotate daily one just simply assume that it will rotate daily regardless the daemon restart. (As en extreme(?) example, when a the system - and the daemon of course - is restarted each day - or just never running for a whole day - and btw it logs often, the log file won't be ever rotated and can inrease over every limits.)

    I've quickly googled around and found someone who execute some logic from cron and expects the loghandler to rotate his files, that wouldn't happened for the same reason: https://stackoverflow.com/questions/30569164/logging-handler-timedrotatingfilehandler-never-rotates-the-log

    I just checked - the relevant code (using MTIME) was added in 2010, as a fix for bpo-8117 - around the time that Python 2.6 was released. As I understand bpo-8117, the goal was to do the rollover even after script restarts. (To be frank, only an extreme case was mentioned in the issue opener entry, when the rollover period had been reached when the script wasn't running.) I believe for this purpose MTIME wasn't the best choice, but using CTIME instead would make the script do what is expected to.

    I take it you mean TimedRotatingFileHandler. Certainly it assumes that the script might be restarted, which is why it opens the file in append mode.

    If the computation using MTIME were changed to use CTIME, that would be a behaviour change, which could conceivably break someone's code that relied on current behaviour (unlikely, but you just never know). I've no other objection to using CTIME rather than MTIME - perhaps I'm just being hyper-cautious about breakage?

    You are far better and more experienced in maintaining widely used opensource codebases so I wouldn't argue with you, but... :)

    I _believe_ no one expects the current initial behaviour in case of resumed log files, and for sure, the documentation doesn't cover this part so I don't know how to proceed.

    I _believe_ (again) that changing the behaviour wouldn't break the script outside, but rather would help them working the way how the developers originally assumed they did.

    vsajip commented 3 years ago

    You are far better and more experienced in maintaining widely used opensource codebases so I wouldn't argue with you, but... :)

    Well, if a change introduces a problem, it will be my lookout, won't it? Just as the original patch for bpo-8117 is proving to be :-)

    OK, how about this? We update the PR to add a new usectime=False keyword argument to TimedRotatingHandler.\_init__(). If nothing else changes, the behaviour is as it is currently. If use_ctime=True is passed in the instantiation, then the constructor will use CTIME rather than MTIME. Will that work for you? I'd be OK with that approach.

    stevendaprano commented 2 years ago

    See this thread on Discuss:

    https://discuss.python.org/t/logging-timedrotatingfilehandler-never-rotates-in-certain-cases/14747/1

    JChristensen commented 7 months ago

    I found this issue as I use logging with short-running scripts that are scheduled to run regularly. I had a script that was rotating weekly and working fine. I was surprised when I tried another script where I expected it to rotate daily and it did not.

    While I can see that things may be working as designed and documented, consider the apparent inconsistency created by the following two examples.

    The first works as expected, rotating the log file weekly. The second (where creating the handler on line 11 is the only significant change) I expected to rotate daily, but never does.

    #!/usr/bin/python3
    # This script is to be scheduled to execute and write to the log daily.
    # It should rotate the log files weekly.
    
    import logging
    import logging.handlers
    
    logger = logging.getLogger('daily_logger')
    logger.setLevel(logging.DEBUG)
    handler = logging.handlers.TimedRotatingFileHandler(
            'daily.log', when='w5', backupCount=4)
    logger.addHandler(handler)
    formatter = logging.Formatter(
            fmt='%(asctime)s:%(levelname)s:%(name)s:%(message)s',
            datefmt='%Y-%m-%d %H:%M:%S')
    handler.setFormatter(formatter)
    logger.info('Hello, world!')
    #!/usr/bin/python3
    # This script is to be scheduled to execute and write to the log hourly.
    # It should rotate the log files daily.
    
    import logging
    import logging.handlers
    
    logger = logging.getLogger('hourly_logger')
    logger.setLevel(logging.DEBUG)
    handler = logging.handlers.TimedRotatingFileHandler(
            'hourly.log', when='d', backupCount=4)
    logger.addHandler(handler)
    formatter = logging.Formatter(
            fmt='%(asctime)s:%(levelname)s:%(name)s:%(message)s',
            datefmt='%Y-%m-%d %H:%M:%S')
    handler.setFormatter(formatter)
    logger.info('Hello, world!')
    serhiy-storchaka commented 6 months ago

    I was just about to create the same issue. It is not so uncommon use case -- run a script by schedule instead of as a daemon and want to have a timed logs rotation. Even if you have a daemon, if it crashes more often than the rotation interval, this will prevent the logs rotation.

    I do not think that the new use_ctime option is needed.

    DmitriSaburov commented 5 months ago

    Let me say a couple of words too) I faced rotation bug about 10 years ago. The correction based on ctime not seems good to me. Lets see my real case. I have a program (on windows) running every 5 minutes by scheduler. And I want log rotation once per month (30 days). Due to the fact that base log creation time will not changing(windows feature :) every time program run the rotation will occur. Not what I want. Ten years ago I made a workaround(inherited class) that analyze ROTATED logs list in order to determine when rotation happened. For first rotation to work I used creation time. Something like that: In the end of MyTimedRotatingFileHandler.init: self.rolloverAt = self.computeRollover(self.getlastrollover()) where:

        def getlastrollover(self):
            """
            Get last rollover moment based on rolled logs
            :return: timestamp of last rollover or CREATE time of current log or current time
            """
            dirName, baseName = os.path.split(self.baseFilename)
            fileNames = os.listdir(dirName)
            result = []
            prefix = baseName + "."
            plen = len(prefix)
            for fileName in fileNames:
                if fileName[:plen] == prefix:
                    suffix = fileName[plen:]
                    if self.extMatch.match(suffix):
                        f2add = os.path.join(dirName, fileName)
                        mdft = os.stat(f2add).st_mtime
                        result.append((f2add, mdft))
            if len(result):
                sortedresult = sorted(result, key=lambda x: x[1])
                t = sortedresult[len(sortedresult)-1][1]
                del sortedresult
            else:
                if os.path.exists(self.baseFilename):
                    t = os.stat(self.baseFilename)[ST_CTIME]  # before first rolling check creation date
                else:
                    t = int(time.time())
             return t

    May be this can be useful for somebody. Best regards.