gabime / spdlog

Fast C++ logging library.

Daily rotation does not remove oldest files if one day is missing #2553

Open daniele77 opened 1 year ago

daniele77 commented 1 year ago

If for any reason one of the old files is missing (e.g., because the application did not log on some day), the oldest files are no longer removed. For example, with these files:

myapp_2022-11-25.log
myapp_2022-11-24.log
myapp_2022-11-23.log
myapp_2022-11-21.log
myapp_2022-11-20.log

myapp_2022-11-21.log and myapp_2022-11-20.log will never be deleted.

This function causes the issue:

https://github.com/gabime/spdlog/blob/v1.x/include/spdlog/sinks/daily_file_sink.h#L183

void init_filenames_q_()
{
    using details::os::path_exists;

    filenames_q_ = details::circular_q<filename_t>(static_cast<size_t>(max_files_));
    std::vector<filename_t> filenames;
    auto now = log_clock::now();
    while (filenames.size() < max_files_)
    {
        auto filename = FileNameCalc::calc_filename(base_filename_, now_tm(now));
    if (!path_exists(filename))  // <---------------- stop at the first missing file
        {
            break;
        }
        filenames.emplace_back(filename);
        now -= std::chrono::hours(24);    // <-------- goes backward 24h
    }
    for (auto iter = filenames.rbegin(); iter != filenames.rend(); ++iter)
    {
        filenames_q_.push_back(std::move(*iter));
    }
}

I guess one fix could be to change the function so that it enumerates the files matching the right pattern and removes the oldest ones based on the file timestamps.
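
For illustration, a minimal sketch of that enumeration step, assuming C++17 <filesystem> and spdlog's default daily pattern base_YYYY-MM-DD.ext (as in the example files above); collect_daily_files() is a hypothetical helper, not part of spdlog:

#include <algorithm>
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// returns the daily files of one logger, sorted oldest-first
std::vector<fs::path> collect_daily_files(const fs::path &dir,
                                          const std::string &stem, // e.g. "myapp"
                                          const std::string &ext)  // e.g. ".log"
{
    const std::string prefix = stem + "_";  // "myapp_"
    const std::size_t date_len = 10;        // "YYYY-MM-DD"
    std::vector<fs::path> matches;
    for (const auto &entry : fs::directory_iterator(dir))
    {
        const std::string name = entry.path().filename().string();
        if (entry.is_regular_file() &&
            name.size() == prefix.size() + date_len + ext.size() &&
            name.compare(0, prefix.size(), prefix) == 0 &&
            name.compare(name.size() - ext.size(), ext.size(), ext) == 0)
        {
            matches.push_back(entry.path());
        }
    }
    std::sort(matches.begin(), matches.end()); // ISO dates sort lexicographically
    return matches;
}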

gabime commented 1 year ago

Or remove the break. The problem is that in both cases it might take a long time, depending on the max_files_ value.

daniele77 commented 1 year ago

You can't remove the break: when there are fewer than max_files_ files, the while loop would never exit. The solution I was proposing does not take long to execute.

gabime commented 1 year ago

I think the cleaning function could be fixed to stop relying on filenames_q_ and to search for old files to delete instead. If this is fixed, the entire filenames_q_ mechanism can be removed, which would simplify the code significantly.

PR is welcome.
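
A rough sketch of that scan-based cleanup, reusing the hypothetical collect_daily_files() from the previous comment (again, not actual spdlog code):

void remove_old_daily_files(const fs::path &dir, const std::string &stem,
                            const std::string &ext, std::size_t max_files)
{
    // collect_daily_files() returns matches sorted oldest-first
    auto files = collect_daily_files(dir, stem, ext);
    while (files.size() > max_files)
    {
        std::error_code ec;
        fs::remove(files.front(), ec); // ignore failures for brevity
        files.erase(files.begin());
    }
}

With something like this called on each rotation, neither filenames_q_ nor init_filenames_q_() would be needed anymore.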

daniele77 commented 1 year ago

I agree. I'm pretty busy right now, so I'm not sure I'll be able to submit a PR anytime soon.

andy-byers commented 1 year ago

I was thinking that we can't guarantee that there won't be a ton of files to look through, so what if we just allowed a certain number of missing files in the search, something like:

int allowed_missing = 32;
while (filenames.size() < max_files_)
{
    auto filename = FileNameCalc::calc_filename(base_filename_, now_tm(now));
    if (path_exists(filename))
    {
        filenames.emplace_back(filename);
    }
    else if (--allowed_missing <= 0)
    {
        break;
    }
    now -= std::chrono::hours(24);
}

I know it's not very robust, but is that a sensible compromise? Otherwise, I'd be willing to implement searching for files to remove. It seems like we could repurpose code from count_files() to do this.

daniele77 commented 1 year ago

@andy-byers this could be a partial solution, but allowed_missing would have to be large enough to cover the case where the application is started a few months after its last run (e.g., an embedded system used only once in a while); covering a six-month gap already needs allowed_missing ≈ 180.

To handle the general case, you would end up with an allowed_missing so big that the algorithm must check hundreds of paths before stopping, every time, even when there is no gap in the log files at all.

So, what's worse:

- probing hundreds of nonexistent paths at every startup, even when there is no gap, or
- enumerating the files in the log directory once?

I've no doubt the best solution is to check all the files in the directory for log file names.

gabime commented 1 year ago

> I've no doubt the best solution is to check all the files in the directory for log file names.

I second that.

andy-byers commented 1 year ago

@daniele77 and @gabime: that makes sense to me. I'll start a fork and see what I can do. Thanks!

mtn241 commented 1 year ago

If checking the names of all the files in the directory is inevitable, maybe it's worth splitting max_files_ into two different limits:

- N1: the maximum number of files to keep, and
- N2: the maximum age, in days, of a file to keep.

This would extend the sink options and simplify the logic for the creator (delete by date or by count, but not both). A strategy for each N could be injected into the daily_file_sink constructor.