Open kimi20071025 opened 2 years ago
Could you please point out the exact location where the problem occurs?
Could you please point out the exact location where the problem occurs?
When one thread is logging to file, it will call the function LogFileObject::Write()
. The function LogFileObject::Write()
will call log_cleaner.enabled()
. The log_cleaner.enabled()
visits the enabled_
without lock guarded.
meanwhile other thread try to change the enabled_
by call function EnableLogCleaner
or DisableLogCleaner
which will write the enabled_
without lock guarded.
so may be it lead to undefined behaviour? I am just in doubt of this. May be I am wrong.
Thanks for response.
Is it possible that you mean a race condition and not undefined behavior? The former can occur because EnableLogCleaner
and DisableLogCleaner
are not atomic. However, I believe the race condition is tolerable because at most what can happen are missed cleanups which are made up for once EnableLogCleaner
is called again.
cc @aesophor
Is it possible that you mean a race condition and not undefined behavior? The former can occur because
EnableLogCleaner
andDisableLogCleaner
are not atomic. However, I believe the race condition is tolerable because at most what can happen are missed cleanups which are made up for onceEnableLogCleaner
is called again.cc @aesophor
Yes.
One thread is reading enabled_
while another is writing it. May the race condition leads to undefined behavior ?
It must not leads to undefined behavior because enabled_
is bool
?
I'm not sure what are you asking. Just because a boolean is not atomic it will not immediately cause undefined behavior. Also, undefined behavior is generally not prescribed by a library but by the language standard.
I'm not sure what are you asking. Just because a boolean is not atomic it will not immediately cause undefined behavior. Also, undefined behavior is generally not prescribed by a library but by the language standard.
OK. If the enabled_
is uint8_t
, the race condition probably cause undefined behavior?
Quoting from @sergiud:
I believe the race condition is tolerable because at most what can happen are missed cleanups which are made up for once EnableLogCleaner is called again.
@kimi20071025 We understand the possible race condition you've mentioned. Are you working on something that can trigger this race condition and is intolerable?
IMO, if we want to eliminate race condition completely, just making enabled_
an std::atomicLogCleaner::Enable(int days)
, the value of LogCleaner::overdue_days_
at the end can be indeterministic. However, if we use a mutex to protect the entire body of both LogCleaner::Enable()
and LogCleaner::enabled()
, then they will bring overhead for those who use this API with a single thread.
In this case, is there a better option other than a mutex?
Quoting from @sergiud:
I believe the race condition is tolerable because at most what can happen are missed cleanups which are made up for once EnableLogCleaner is called again.
@kimi20071025 We understand the possible race condition you've mentioned. Are you working on something that can trigger this race condition and is intolerable?
IMO, if we want to eliminate race condition completely, just making
enabled_
an std::atomic is probably not enough: Imaging two threads callingLogCleaner::Enable(int days)
, the value ofLogCleaner::overdue_days_
at the end can be indeterministic. However, if we use a mutex to protect the entire body of bothLogCleaner::Enable()
andLogCleaner::enabled()
, then they will bring overhead for those who use this API with a single thread.In this case, is there a better option other than a mutex?
What I am worried about is not race condition but data race.
Glog library try to clean overdue logs every time when logging. I think it may cost much performance If LogCleaner is opened all the time. So Once a day, I enable log cleaner (EnableLogCleaner
) at night , do an error logging and close log cleaner (DisableLogCleaner
).
When a thread is logging, it will finally call LogFileObject::Write
, the function LogFileObject::Write
calls log_cleaner.enabled()
.
which means the thread reads enabled_
.
Meanwhile, other thread calls EnableLogCleaner
or DisableLogCleaner
, the thread writes enabled_
.
So what I worried is the condition may cause data race .
I had to actually look this up. But it seems that concurrent reading/writing of/to a variable is indeed undefined behavior in C++.
However, at this point I would rather claim that your use case is not supported in the sense that you should not be calling EnableLogCleaner
and DisableLogCleaner
from different threads. As @aesophor already mentioned, a mutex is necessary to solve the data race which will however incur an overhead at little or no gain. We can certainly reconsider once we have the means for proper synchronization (i.e., C++11 <thread>
functionality). But at this point, the effort making these functions thread safe is not worth it.
I think it may cost much performance
How did you determine the performance penalty, or is this only a guess? If it's the latter, you should measure quantitatively first and let us know how strong the performance degradation is if the cleaner has been enabled. This is something worth improving.
About the log cleaner performance, I shared the same opnion with issue #702 . Much performance cost when enable log clean.
And I also read the benchmark #761 , could @aesophor help to make other benchmark comparing the logging performance between enable log clean with disable log clean?
If there is much performance penalty comparing by disable log clean. It means the LogCleaner can not be enabled all the time.
Platform:
- Arch Linux
- CPU: Intel i7-10700 CPU @ 2.90GH (16-core)
- Memory: 16 GB ...
When there are 100 files in log directories, the execution time of
LogCleaner::Run()
is 0.157 ms in avg.
Upon each call to LogFileObject::write()
, if the log cleaner is enabled and if there are 100 files to check, it only takes extra 0.157 ms to run.
About the log cleaner performance, I shared the same opnion with issue #702 . Much performance cost when enable log clean.
What's exactly "much performance" in absolute numbers? Issue #702 did not provide any details either. Therefore I don't see any factual arguments.
#include "glog/logging.h"
#include <unistd.h>
#include <chrono>
#include <iostream>
using namespace std;
using namespace chrono;
int main(int argc, char *argv[])
{
google::InitGoogleLogging("");
FLAGS_logbufsecs = 0;
FLAGS_max_log_size = 100;
google::SetLogDestination(google::GLOG_ERROR, "/tmp/glog/ERROR_");
google::SetLogDestination(google::GLOG_WARNING, "");
google::SetLogDestination(google::GLOG_INFO, "");
FLAGS_colorlogtostderr = false;
FLAGS_alsologtostderr = false;
google::SetStderrLogging(google::GLOG_FATAL);
google::EnableLogCleaner(1);
auto enableLogCleanStart = system_clock::now();
for (uint16_t i = 0; i < 50000; i++)
{
LOG(ERROR)<<"glog error";
}
auto enableLogCleanEnd = system_clock::now();
std::chrono::duration<double, std::milli> enableLogCleanDuration = enableLogCleanEnd - enableLogCleanStart;
cout<<"cost: "<<enableLogCleanDuration.count()<<" ms when enable log clean"<<endl;
google::DisableLogCleaner();
auto disableLogCleanStart = system_clock::now();
for (uint16_t i = 0; i < 50000; i++)
{
LOG(ERROR)<<"glog error";
}
auto disableLogCleanEnd = system_clock::now();
std::chrono::duration<double, std::milli> disableLogCleanDuration = disableLogCleanEnd - disableLogCleanStart;
cout<<"cost: "<<disableLogCleanDuration.count()<<" ms when disable log clean"<<endl;
return 0;
}
cost: 518.541 ms when enable log clean
cost: 225.845 ms when disable log clean
When there are 1 file in log directories and log cleaner is opened, the log performance will decrease 230% comparing with log cleaner is closed.
I believe that the overhead (518.541 ms) is because:
If your application needs to write so frequently but it's performance critical at the same time, then you can use another thread to occasionally perform cleanup. You are also welcome to submit a PR and improve it.
A possible optimization strategy: use CycleClock_Now()
to maintain an additional timer such as next_cleanup_time_
.
you can use another thread to occasionally perform cleanup
This is what I already do, But the date race of enabled_
should be avoid as I said at the begin of this issue.
I read the functions of
EnableLogCleaner
,DisableLogCleaner
andLogCleaner::enabled()
.I find the private member
enabled_
ofLogCleaner
is not lock guard.Once one thread call
EnableLogCleaner
,and other thread is logging, It may lead to undefined behaviour?