Closed TwP closed 4 years ago
Thanks for looking into this. I've modified the script to use /tmp, which is tmpfs on my Fedora, just to prevent SSD wear. However, I still see some entries being dropped :-(
```
$ cat /tmp/*.log | wc -l
59962
```
It's random:
```
$ cat /tmp/*.log | wc -l
59971
```
Okay, it looks like those lines are being dropped around the time the files are rolled. Without much understanding of the code, I just blindly made this change:
```diff
diff --git a/lib/logging/appenders/rolling_file.rb b/lib/logging/appenders/rolling_file.rb
index a187fbc..a8eced2 100644
--- a/lib/logging/appenders/rolling_file.rb
+++ b/lib/logging/appenders/rolling_file.rb
@@ -157,7 +157,9 @@ module Logging::Appenders
       return self if @io.nil?
       str = str.force_encoding(encoding) if encoding && str.encoding != encoding
-      @io.flock_sh { @io.write str }
+      @mutex.synchronize {
+        @io.flock_sh { @io.write str }
+      }
       if roll_required?
         @mutex.synchronize {
```
And it fixes the problem: I am able to scale up to 100,000 lines on my 16-core AMD CPU without a single line being dropped.
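The reason the fix works can be shown without the gem at all. Below is a minimal, self-contained sketch (none of these class names come from the logging gem): a toy rolling writer whose `write` and `roll` happen inside one critical section, so a roll can never interleave with a write and lose lines.

```ruby
# Toy sketch of the race being fixed: holding a single mutex across both
# the write and the roll means no thread can roll the "file" while
# another thread is mid-write. Names are illustrative only.
class TinyRollingWriter
  def initialize
    @mutex = Mutex.new
    @lines = []      # stands in for the current log file
    @archived = []   # stands in for rolled-over files
  end

  def write(str)
    @mutex.synchronize do
      @lines << str
      roll if @lines.size >= 100  # roll inside the same critical section
    end
  end

  def roll
    @archived.concat(@lines)
    @lines = []
  end

  def total
    @mutex.synchronize { @archived.size + @lines.size }
  end
end

w = TinyRollingWriter.new
threads = (1..16).map do
  Thread.new { 1_000.times { |n| w.write("line #{n}") } }
end
threads.each(&:join)
puts w.total  # => 16000, no lines lost across rolls
```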
Which actually raises an important question: it looks like this is a common bug across all appenders. The mutex attribute is created in the `Appender` base class, but it is then used across appenders, for example in `reopen` in the `File`, `Syslog`, and `String` appenders. The problem is that most of these classes use the `write` method from the `IO` base class, which does not have a synchronized block on the mutex from the `Appender` class. If I am not mistaken, all of these classes are subject to a similar problem: when the output is being reopened, logs can be dropped.
I think it would be safer to declare the mutex in each individual appender. This would encourage taking more care to wrap all methods in synchronized blocks per class. Also, the shared mutex for all appenders represents a bottleneck: if one appender is holding the mutex because of some expensive operation, all of the other appenders can't write. Now, I am on thin ice here and maybe that's not the case.
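The bottleneck concern can be sketched in plain Ruby. This is a hypothetical demo, not the gem's code: one "slow" appender and one "fast" appender are first made to share a mutex, then given their own, and we time how long the fast one waits.

```ruby
# Sketch of the shared-mutex bottleneck: a slow appender holding the one
# shared mutex stalls an unrelated fast appender. With per-instance
# mutexes the fast appender never waits. All names are illustrative.
class SlowAppender
  def initialize(mutex) @mutex = mutex end
  def write(str) @mutex.synchronize { sleep 0.05 } end  # expensive op
end

class FastAppender
  def initialize(mutex) @mutex = mutex end
  def write(str) @mutex.synchronize { } end             # cheap op
end

def elapsed
  t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0
end

# Case 1: shared mutex -- the fast appender queues behind the slow one.
shared = Mutex.new
slow, fast = SlowAppender.new(shared), FastAppender.new(shared)
t = Thread.new { 10.times { slow.write("x") } }
sleep 0.01  # let the slow appender grab the mutex first
shared_time = elapsed { 10.times { fast.write("x") } }
t.join

# Case 2: per-instance mutexes -- the fast appender is never blocked.
slow, fast = SlowAppender.new(Mutex.new), FastAppender.new(Mutex.new)
t = Thread.new { 10.times { slow.write("x") } }
sleep 0.01
own_time = elapsed { 10.times { fast.write("x") } }
t.join

puts shared_time > own_time  # per-instance mutexes avoid the contention
```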
The mutex could also be defined in another abstract class via the decorator design pattern, which would make sure all of the important methods (e.g. `write`) are synchronized. It would declare the mutex too, so it's only available to its children.
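The decorator idea could look roughly like this. To be clear, `SynchronizedWriter` is a made-up name for illustration; it simply wraps any object responding to `write` and serializes calls through its own mutex, so subclasses (or wrapped appenders) never touch an unsynchronized `write`.

```ruby
require "stringio"

# Hypothetical decorator sketch: wrap any writable object and serialize
# every #write through a per-instance mutex. Not the gem's actual API.
class SynchronizedWriter
  def initialize(io)
    @io = io
    @mutex = Mutex.new
  end

  def write(str)
    @mutex.synchronize { @io.write(str) }
  end
end

sink = StringIO.new
writer = SynchronizedWriter.new(sink)
threads = (1..8).map do |i|
  Thread.new { 100.times { writer.write("t#{i}\n") } }
end
threads.each(&:join)
puts sink.string.lines.size  # => 800
```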
Yup, I am able to reproduce it:
```ruby
require_relative 'lib/logging'

filename = "/tmp/file.log"
Logging.raise_errors = true
appender = Logging.appenders.file filename, layout: Logging.layouts.pattern(pattern: "[%d] %T %5l -- %c: %m\n")
log = Logging.logger.root
log.add_appenders(Logging.appenders[filename])
log.level = :debug

threads = []
(1..16).each { |thread_count|
  threads << Thread.new do
    Thread.current[:name] = "T-%02d" % thread_count
    (1..1_000).each { |n|
      log.debug "#{n} a very nice little debug message"
      log.warn "#{n} this is your last warning"
    }
  end
}
appender.reopen
appender.reopen
threads.each { |t| t.join }
```
When I try this, I actually get an exception because other threads try to write while the stream is closed:
```
/home/lzap/work/logging/lib/logging/appenders/io.rb:77:in `write': closed stream (IOError)
Traceback (most recent call last):
        11: from /home/lzap/work/logging/lib/logging/diagnostic_context.rb:474:in `block in create_with_logging_context'
        10: from test-reopen.rb:17:in `block (2 levels) in <main>'
         9: from test-reopen.rb:17:in `each'
         8: from test-reopen.rb:18:in `block (3 levels) in <main>'
         7: from /home/lzap/work/logging/lib/logging/logger.rb:93:in `debug'
         6: from /home/lzap/work/logging/lib/logging/logger.rb:406:in `log_event'
         5: from /home/lzap/work/logging/lib/logging/logger.rb:406:in `each'
         4: from /home/lzap/work/logging/lib/logging/logger.rb:406:in `block in log_event'
         3: from /home/lzap/work/logging/lib/logging/appender.rb:81:in `append'
         2: from /home/lzap/work/logging/lib/logging/appenders/buffering.rb:285:in `write'
         1: from /home/lzap/work/logging/lib/logging/appenders/io.rb:77:in `canonical_write'
/home/lzap/work/logging/lib/logging/appenders/io.rb:77:in `write': closed stream (IOError)
```
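The trace above boils down to a simple race: `reopen` closes the old handle while another thread is still writing to it, and in Ruby any write to a closed IO raises `IOError`. The race, reduced to its deterministic core (`File::NULL` is just a throwaway sink for the demo):

```ruby
io = File.open(File::NULL, "w")
io.close                 # what reopen does to the old handle
begin
  io.write("x")          # what the logging thread is doing concurrently
rescue IOError => e
  puts e.message         # => closed stream
end
```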
Something like this (this is really ugly) fixes it:
```diff
diff --git a/lib/logging/appenders/file.rb b/lib/logging/appenders/file.rb
index 9ed328a..f48c97e 100644
--- a/lib/logging/appenders/file.rb
+++ b/lib/logging/appenders/file.rb
@@ -94,5 +94,12 @@ module Logging::Appenders
     rescue Errno::EEXIST
       open_file
     end
+
+    def canonical_write( str )
+      @mutex.synchronize {
+        super(str)
+      }
+      self
+    end
   end
 end
```
All good now:
```
$ cat /tmp/*.log | wc -l
64000
```
Quickly looking at the code, I think it is reasonable to implement a `SynchronizedIO` appender that defines the mutex. Only the `Console` appender does not need synchronization; I mean, does it make sense to reopen stdout? During process daemonization, programs reopen the standard streams themselves and not the appender, so I think I am right. Therefore all the appenders except `Console` would use the new `SynchronizedIO` abstract class instead.
I saw the missing log line as well. I'm going to merge this PR since it addresses your original issue, and I will tackle the synchronized writing (and a few other bugs) in separate PRs. This is largely to help me keep the release notes straight.
More fixes forthcoming :smile_cat:
There is a bug in the log rolling code where multiple threads in the same process can contend on rolling the files. The result is that we will delete log files and drop log messages. I grabbed the example from @lzap and modified the code a bit to get a better handle on the error.
This little script will produce a total of 60,000 log lines. We can verify that all log lines are present by running the script and counting the lines in the generated log files ...
And now we are getting the expected 60,000 log lines.
refs #206 /cc @lzap