tinylog-org / tinylog

tinylog is a lightweight logging framework for Java, Kotlin, Scala, and Android
https://tinylog.org
Apache License 2.0
684 stars 83 forks source link

Are there API exposed to forcefully flush the log buffer in File Writer mode ? #749

Open inovatrik-althaf opened 1 month ago

inovatrik-althaf commented 1 month ago

First of all thank you for this awesome project, we have been using this in production application with great success.

We are facing one challenge with file writer with buffer and thread enabled. From the documentation, as quoted below, It take some time for the buffer to flush the content to the file., that is ok. However, we have a situation were we allow the user to push the logs to our server , at that point the file logs will still be not generated as it is buffer mode. We would like to know if there are an apis exposed from the library to help forcefully flush the log buffer so that it will be written to the file ASAP. Or is there a feature coming soon, Or let us know area of code in the project , to be modified so that we our self can fixit locally for our project alone Or may be we can raise PR for the fix as well.

The file writer writes log entries to a defined file. The only required property is writer.file for the absolute or relative path to a log file. Optionally, the output can be buffered. In this case, new log entries are written to a buffer and output to the log file, together with a delay. This will be significantly faster, especially in connection with activation of the [writing thread](https://tinylog.org/v2/configuration/#writing-thread).

agentgt commented 4 days ago

What I recommend is that not only is there a flush but a reopen call like in my logging implementation: https://jstach.io/rainbowgum/#rolling and https://jstach.io/rainbowgum/io.jstach.rainbowgum/io/jstach/rainbowgum/LogOutputRegistry.html#flush()

@pmwmedia

I was actually quite surprised while testing Rainbow Gum using your benchmarks to find that tinylog does not actually have an async mode and that is a buffer which will periodically be purged by whatever thread goes over the limit.

I think that is disingenuous on your benchmark page and confuses users particularly in a hosted service environment where latency variance is a big deal. All the other engines have to do to match that is to turn on buffering at very high value, disable flushing and disable async... e.g. using a BufferedOutputStream.

The async modes in other frameworks have a dedicated writer thread pulling from something like a queue as fast as possible.

pmwmedia commented 2 days ago

@inovatrik-althaf What version of tinylog do you use? Since tinylog 2.7, log files should be flushed immediately as soon as all queued log entries a written. If the writing thread is enabled correctly, you shouldn't be faced with the described issue. Can you please share your tinylog configuration and version.

@agentgt Do you have implemented a benchmark to measure the different behaviors? If yes, I would be very interested in them. I have experimented a lot with logging benchmark. However, I wasn't able to measure anything significant.

agentgt commented 1 day ago

@agentgt Do you have implemented a benchmark to measure the different behaviors? If yes, I would be very interested in them. I have experimented a lot with logging benchmark. However, I wasn't able to measure anything significant.

Your benchmark is fine except that it does not test lock contention with multiple threads which is one way to test latency by looking at the error.

I have a fork that is not updated here: https://github.com/jstachio/tinylog/tree/rainbowgum I will push the updates later. I also switched tinylog to use its slf4j adapter as well as log4j2 since I think that makes it more fair.

I am not sure what the ideal benchmarking is for logging but I would say multiple parameters and multiple threads.

I plan on using Spring Petclinic and recording all the logging actions and then rebuilding them as code that is run spawning threads per distinct thread id found.

Tinylog's biggest weakness at the moment is that it generates a Location Info on every call. Also what I found out is while using the integer depths of the call stack is faster compared to locating via stack frame method/class name String.equals, the slow part is actually more just how deep the stack is. The deeper the depth to traverse the more expensive and it quickly out paces the cost of the string comparison.

Anyway I took your benchmark and added the JMH threads parameter to the Maven pom file and then ran it only doing output as discard mostly gets the same results (and rainbowgum is equal not surprising to tinylog as both have static loggers).

Here is the result of output with 4 threads but only 2 forks:

Benchmark                                      (async)          (locationInfo)  Mode  Cnt   Score   Error  Units
o.t.b.l.log4j2__.Log4j2__Benchmark.output        false                    NONE    ss    2  14.094           s/op
o.t.b.l.log4j2__.Log4j2__Benchmark.output        false  CLASS_OR_CATEGORY_ONLY    ss    2  13.712           s/op
o.t.b.l.log4j2__.Log4j2__Benchmark.output        false                    FULL    ss    2  25.350           s/op
o.t.b.l.log4j2__.Log4j2__Benchmark.output         true                    NONE    ss    2   1.748           s/op
o.t.b.l.log4j2__.Log4j2__Benchmark.output         true  CLASS_OR_CATEGORY_ONLY    ss    2   1.682           s/op
o.t.b.l.log4j2__.Log4j2__Benchmark.output         true                    FULL    ss    2   4.382           s/op
o.t.b.l.logback_.Logback_Benchmark.output        false                    NONE    ss    2  17.610           s/op
o.t.b.l.logback_.Logback_Benchmark.output        false  CLASS_OR_CATEGORY_ONLY    ss    2  16.320           s/op
o.t.b.l.logback_.Logback_Benchmark.output        false                    FULL    ss    2  36.016           s/op
o.t.b.l.logback_.Logback_Benchmark.output         true                    NONE    ss    2   1.400           s/op
o.t.b.l.logback_.Logback_Benchmark.output         true  CLASS_OR_CATEGORY_ONLY    ss    2   1.546           s/op
o.t.b.l.logback_.Logback_Benchmark.output         true                    FULL    ss    2  12.375           s/op
o.t.b.l.rainbowgum.RainbowGumBenchmark.output    false                    NONE    ss    2   1.658           s/op
o.t.b.l.rainbowgum.RainbowGumBenchmark.output    false  CLASS_OR_CATEGORY_ONLY    ss    2   1.754           s/op
o.t.b.l.rainbowgum.RainbowGumBenchmark.output    false                    FULL    ss    2   5.366           s/op
o.t.b.l.rainbowgum.RainbowGumBenchmark.output     true                    NONE    ss    2   1.172           s/op
o.t.b.l.rainbowgum.RainbowGumBenchmark.output     true  CLASS_OR_CATEGORY_ONLY    ss    2   1.368           s/op
o.t.b.l.rainbowgum.RainbowGumBenchmark.output     true                    FULL    ss    2   2.520           s/op
o.t.b.l.tinylog2.Tinylog2Benchmark.output        false                    NONE    ss    2  13.582           s/op
o.t.b.l.tinylog2.Tinylog2Benchmark.output        false  CLASS_OR_CATEGORY_ONLY    ss    2  18.226           s/op
o.t.b.l.tinylog2.Tinylog2Benchmark.output        false                    FULL    ss    2  21.302           s/op
o.t.b.l.tinylog2.Tinylog2Benchmark.output         true                    NONE    ss    2   1.662           s/op
o.t.b.l.tinylog2.Tinylog2Benchmark.output         true  CLASS_OR_CATEGORY_ONLY    ss    2   3.111           s/op
o.t.b.l.tinylog2.Tinylog2Benchmark.output         true                    FULL    ss    2   3.514           s/op

Lower is of course better here.

EDIT disregard the non async here for rainbowgum as I wasn't flush on every event. I didn't realize that was required for the non async benchmark.

EDIT I totally forgot why I had immediate flush disabled non async for rainbowgum but now I remember the reason was to show how any of the logging frameworks can just turn on massive buffering (e.g. no flush) and get largely the same results as the writer thread.

What needs to be separately tested with multiple threads is:

  1. True async where ideally even the encoding is done on a separate thread. A queue or ringbuffer is typically used.
  2. Buffering where flush does not happen on every item and perhaps a thread flushes periodically. While this can be seen largely the same as async it is not because encoding happens prior and a separate thread has to occasionally flush.
  3. No buffering synchronous.