log4mongo / log4mongo-java

log4j appender to MongoDB database
http://log4mongo.org
Apache License 2.0
80 stars 29 forks source link

Issue with delays because of bson writes #12

Closed jaypatel512 closed 12 years ago

jaypatel512 commented 12 years ago

Today, we did some profiling in our tools, and it came out as a surprise that the most CPU consuming module in our product was log4mongo. We didnt realized that it would be cause so much trouble. We tried adding AsyncAppender in front of it, so as to make the overall operation faster, but wasnt a really big help.

Things went worse, when Mongo was in another VM, and there was a prominent network latency to pretty much jam up our application. (we use journalling and caching in mongo).

We had to reduce the logging level to WARN from INFO.

We should actually make it async by default if possible.

Thoughts ?

RobertStewart commented 12 years ago

I used VisualVM to do some profiling tonight.

With the MongoDbPatternLayoutAppender performance test increased to 10,000 loops and CPU profiling lowered to the minimum of 50 ms, I got the following results:

main thread - 4229 ms
    org.log4mongo.MongoDbPatternLayoutAppender.append - 4062 ms
        com.mongodb.DBCollection.insert - 2490 ms
        org.log4mongo.MongoDbPatternLayout.format - 1036 ms (almost 1/2 spent in log4J code)
        com.mongodb.util.JSON.parse - 505 ms

So about 36% was spent preparing the doc and most of the rest was spent inserting it. The overall times are long due to the impact of the profiler.

With the MongoDbAppender performance test increased to 10,000 loops and CPU profiling lowered to the minimum of 50 ms, I got the following results:

main thread - 8755 ms
    org.log4mongo.BSONAppender.append - 8301 ms
        org.log4mongo.MongoDbAppender.append - 5287 ms
            com.mongodb.DBCollection.insert - 5268 ms
        org.log4mongo.LoggingEventBsonifierImpl.bsonify - 3003 ms

2489 ms of the bsonify() method was spent getting location info, i.e., file, method, class and line, from Log4J and then adding it to the document. The next main contributor was bsonifyClassName.

Inserting the documents is about 2/3 of the execution time for this test for both appenders. Insert time is probably slower for MongoDbAppender because the documents are much larger.

Ideas for investigation:

jaypatel512 commented 12 years ago

Regarding Unsafe writes, as far as I know, the current source code, picks up the WriteConcern from Collection object, which by default has a Normal Write Concern. I made changes in my repo to allow WriteConcerns to be specified in properties file. I am sure if we make it to unsafe, it would be a performance improvement.

I will try MongoDBPAtternLayoutAppender today, and see if I can get any performance improvement in Log4Mongo.

jaypatel512 commented 12 years ago

According to http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/PatternLayout.html, there are many paramters which takes more than usual to fetch the information. We used Custom Log4j PatternLayout, and we saw a substantial performance improvement.

Thanks Robert for looking into it.

RobertStewart commented 12 years ago

Sure. The caller location info is definitely very slow. Usually you don't need the file if you are logging the class name. I skip the method name, too, since you can get it from the line number. The line number is the only location info that I find really valuable.

Also, I log the category instead of the class, but by convention I set the category to be the same as the class, with a few exceptions. Of course, this can lead to confusion if you copy and paste your logger declaration and forget to change the category.

Looking at log4j.properties.sample, I see that I have even labeled the key as "class" instead of "category". Technically, that's not correct. It is dependent on the user setting the category to be the same as the class. I think that is a common convention, though.