Have a logging mechanism with instance-level granularity and separate log files.

GoogleCodeExporter commented 9 years ago

Current behavior
-   The connector, connector manager logs are all dumped into a single 
log file and only identified by their class names
-   It becomes difficult to debug if there are more than one instances 
running under same connector manager. All the log statements are 
intermingled.
Suggested behavior
-   Each connector instance would get their own log file and thus 
enable easier debugging.
-   The logs could be like <instance_name>_x.log and could be placed 
under logs folder or the working directory of the connector.
-   These logs could have CM logs related to these connector instances
-   The remote console API could be used to retrieve individual 
instance logs or ALL instance logs.
-   The CM could provide the customized logging mechanism as part of 
the SPI so that all connectors use a common approach. This implementation 
could be a thin wrapper over the existing logging (java, commons) 
mechanism.

Original issue reported on code.google.com by mwarti...@gmail.com on 4 Aug 2008 at 10:05

GoogleCodeExporter commented 9 years ago

Original comment by jl1615@gmail.com on 8 Dec 2008 at 11:34

Changed title: Have a logging mechanism with instance-level granularity and seperate log files.

GoogleCodeExporter commented 9 years ago

Original comment by jl1615@gmail.com on 8 Dec 2008 at 11:35

Changed title: Have a logging mechanism with instance-level granularity and separate log files.

GoogleCodeExporter commented 9 years ago

See also issue 97.

Original comment by jl1615@gmail.com on 9 Dec 2008 at 1:27

Changed state: Accepted
Added labels: Priority-High
Removed labels: Priority-Medium

GoogleCodeExporter commented 9 years ago

We have decided on a single log file with using MDC to provide information 
about the nature of the request, 
including the connector name and SPI interface (config, traversal, authN, 
authZ, by stylized name or possibly 
SPI interface name). We may want to include the username during authN and authZ 
requests, in cases where 
multiple requests are processed concurrently.

We are investigating slf4j, as a framework that supports MDC with both 
java.util.logging and log4j 
implementations. We could also add support for MDC directly to 
java.util.logging by implementing an MDC 
class along with a Formatter, or we could add logging to the SPI rather than 
requiring a specific logging 
package, but neither approach got much support.

Original comment by jl1615@gmail.com on 7 Mar 2009 at 7:27

GoogleCodeExporter commented 9 years ago

It would be good to have a Logging Wrapper class in the SPI for abstraction 
purpose. 
Any changes to the underlying mechanism can be consumed by the SPI & no 
connector 
code will be affected.
This will also ensure uniformity across all connectors.
What would be the performance overhead if we are planning to log Thread Names 
(Connector Instance names) Or are we planning to use different mechanism for 
associating log entries to instance names?

Original comment by mwarti...@gmail.com on 25 Mar 2009 at 3:55

GoogleCodeExporter commented 9 years ago

Having a Logger Wrapper will require connector SPI changes. Reasons:
* The connector code will have to pass the connector name to the Logger 
Wrapper.This
implies that the connector instance name be passed by Connector Manager to the
connector which means change in SPI OR
* The Logger Wrapper will have to infer this information (how I don't know). 
This
implies that the Logger Wrapper will have to keep a track of the info related 
to a
connector instance

Using MDC:
* The MDC approach does not require any of the above but rather just use APIs 
from SLF4j
* The connector instance name will just be an extra piece of information added 
by
Connector Manager for the connector instance thread to the underlying logging
library. This info is used while generating the log messages
* Performance overhead
  - The SLF4j parameterized logging is supposed to be far better than using if
(logger.isDebugEnabled ()) {logger.debug("Debug message" + variableValue);} 
approach
as with log4j
  - My reading till now suggests that aren't many performance issues reported
specific to MDC. If there are any performance overheads, it will be with the
underlying library that supports MDC. Will require testing with the CM-Connector
framework though to confirm  
* One thing to consider here is that SLF4j supports one and only one binding of 
a
logging library at a time. 

In future, we should look at Logger Wrapper that will seggregate logs for:
* Config
* Traversal
* Sending feeds
* Handling AuthN, AuthZ

Right now, we should concentrate on capturing connector instance name info in 
the logs

Original comment by rakeshs101981@gmail.com on 25 Mar 2009 at 9:06

GoogleCodeExporter commented 9 years ago

Original comment by mgron...@gmail.com on 6 May 2009 at 9:36

Added labels: Milestone-Goal_2.0.0

GoogleCodeExporter commented 9 years ago

Fixed in r1999.

The final design uses NDC to log the dynamic context (e.g., connector names, 
usernames) in a single log file. 
Connectors are unaffected, and continue to use java.util.logging as before.

Original comment by jl1615@gmail.com on 15 May 2009 at 10:03

Changed state: Fixed
Added labels: Milestone-Release_2.0.0
Removed labels: Milestone-Goal_2.0.0

GoogleCodeExporter commented 9 years ago

From:   Brett.Michael.Johnson@gmail.com
Subject:    Re: a medium-size code review (Brett.Michael.Johnson/logging)
Date:   May 11, 2009 5:17:03 PM PDT

Re: slf4j vs home-rolled:
-------------------------
I got some pushback against using slf4j.  I liked it, but it wasn't 
clear that we could get away with not having to sweep through
the connectors, changing them from java.util.logging to slf4j.
It also appears to be difficult to get slf4j running smoothly under
Tomcat, which really, really, really, wants to use commons-logging.
[Although I did see a project on google code-site that makes it easier:
http://code.google.com/p/tomcat-slf4j ]

I selected log4j-style MDC, NDC, and LayoutPattern, as most 
developers are probably already familiar with it.

Re: NDC.pushAppend():
---------------------
NDC.pushAppend() came after about the 15th time I typed:
NDC.push(NDC.peek() + " " + something);.  Maybe I was 
using NDC wrong, but this seemed like an obvious oversight.

Re: Running in Tomcat:
----------------------
As far as packaging the logging classes into a jar file, I chose a 
variation of option 2):   place logging in a separate jar file, but not a 
separate project.  It is built like connector-spi.jar and connector-util.jar.

Using it in our hacked up Tomcat installs requires adding the
connector-logging.jar to the system classpath in 
${catalina.base}/bin/setclasspath.sh
(and .bat):
  export CLASSPATH="$CLASSPATH":"$BASEDIR"/webapps/connector-manager/WEB-INF/lib/connector-
logging.jar
This allows java.util.logging.LogManager to see my custom formatter.

In the connector-manager/WEB-INF/classes/logging.properties file:
Change:
  java.util.logging.FileHandler.formatter = java.util.logging.SimpleFormatter
to:
  java.util.logging.FileHandler.formatter = com.google.enterprise.connector.logging.SimpleFormatter

Add:
  com.google.enterprise.connector.logging.SimpleFormatter.format="%d{MMM dd, yyyy h:mm:ss a} [%x] %C 
%M%n%p: %m%n"
[or not, as that is the default pattern].  Or, for the 1-line version:
  com.google.enterprise.connector.logging.SimpleFormatter.format=%d [%x] %p: %C{3}.%M(): %m%n

Remove:
  java.util.logging.FileHandler.layout.ConversionPattern=%d{MM-dd@HH:mm:ss} %5p [%t] (%F:%L) - %m%n
and 
  java.util.logging.ConsoleHandler.layout.ConversionPattern=%d{MM-dd@HH:mm:ss} %5p [%t] (%F:%L) - %m%n
as these appear to be log4j configuration properties and are unused,
even in our current configuration.

More configuration details to follow.  Putting the connector-logging.jar file in
${catalina.base}/bin, ${catalina.base}/shared/lib, ${catalina.base}/common/lib
had no effect, as Tomcat doesn't put those on the system classpath.  Since
I had to specify the path down to the jar file, I decided to keep it in the
connector manager lib directory.

Re: XMLFormatter and Chainsaw:
------------------------------
I haven't written an XML formatter for this yet, but I suspect the output will
look very much like java.util.logging.XMLFormatter, with the NDC/MDC
information either as attributes on the <record> tag or as a separate element.
Either way, I could wrap java.util.logging.XMLFormatter, and just add my little
bit of MDC/NDC info.  I want to play with Chainsaw a bit before I try.

Re: Testing with connector instances:
I've been testing this with three concurrent instances of the stock 1.3.2 
Livelink Connector and it works great.  Another thing I've been testing is 
a 1-line format, rather than the java.util.logging.SimpleFormatter 2-line style.
1-line formats make grepping the logs *much* easier to do.

Brett Johnson

Original comment by Brett.Mi...@gmail.com on 17 May 2009 at 7:25

GoogleCodeExporter commented 9 years ago

From:   brett.michael.johnson@gmail.com
Subject:    Re: a medium-size code review (Brett.Michael.Johnson/logging)
Date:   May 14, 2009 12:55:49 PM PDT

I would like to thank you all for your excellent feedback
on the MDC/NDC logging change list.  I've spent the last
couple of days incorporating most of that feedback, and
the new snapshot I have posted is significantly cleaner,
more svelte, yet still more comprehensive.  There is also
improved testing and documentation.

I designed this package so that it could be used by other
projects, if desirable.  So, although I removed some
superfluous functionality, I didn't strip out everything
we didn't use.

Looking through the comments, there were some common threads
of concern, confusion, and suggestions.  I can address them
now:

On using MDC vs NDC:
-------------------
I implemented the MDC class first, then started using it
in the code to add context to the messages.  At first, I
was adding more context than I do now, and in different
places in the code base. The evidence of that approach
could be seen in a couple of messages that were more
verb-subject-object.  Anyway, based upon my initial pattern
of use, in a few places, I really wanted the ability to save
and restore context.  Therein grew the push/pop extensions
to MDC.  At that point I decided to look at NDC.  I hadn't
read the javadoc, so I imagined that NDC would push and
pop a whole MDC-like context.  In fact, my first
implementation of NDC was a subclass of MDC that provided
push/pop of the context map.  I later read the javadoc
and implemented it as spec'd.

The other problem I was running into with MDC was formatting
the messages nicely.  I could have a formatter that injected
verbose MDC context into the logs - something like:
"[Service:%X{Service} Connector:%{Connector} User:%{User}]",
but that seemed excessively verbose, and not all contexts
would be populated, so empty "User:" entries filling the
logs didn't seem like a good idea.  I considered dropping
the key and just formatting the values:
"[%X{Service} %{Connector} %{User}]"
That was more compact, and empty values were only penalized
with the surrounding whitespace.

However the biggest issue I had with MDC was the fact that,
although I sometimes 1 piece of context, occasionally had 2
pieces of context, and often had 3 pieces of context, they
weren't always a Service, ConnectorName, and UserName.  It
was sometimes a ConnectorType, sometimes some portion of the
Connector Manager was the subject of configuration, not a
connector.  I could have MDC.put("ConnectorName", "Manager"),
or named them "Thing1", "Thing2", "Thing3".

Instead, I chose to use NDC.  In 80% of my cases of setting
and clearing the context, choosing MDC over NDC, would be
either/or.  No real advantage one way or the other.  But
in a couple of instances NDC seemed a little nicer, and
more in tune with my thought processes.  John seems quite
put off by the use of NDC.  But as I look at the code now,
replacing the use NDC with MDC would not significantly
clean up the code, but it would make the logs a bit less
tidy.

On the format of the messages:
------------------------------
The format of the messages evolved, but they have settled
down to being a service name, plus one or two other items,
usually connectorName being one.  The service names are
{Init, AuthN, AuthZ, Config, Traverse, Support, Shutdown}
John didn't like Init.  I do. When logging Finest, there
are 860 Init log messages generated when starting up my
multi-connector instance Tomcat system.  It is convenient
to locate the end of the startup phase.

On NDC.pushAppend() addition:
-----------------------------
It just seemed an obvious extension after about the
15th time I typed NDC.push(NDC.peek() + " " + something),
and an obvious reflection of my thought process while
adding context.

On clean room implementation:
-----------------------------
As I mentioned above, I implemented NDC, MDC, and LayoutPattern
based upon reading the log4j javadoc, not the source code.
Early on, I did look at the slf4j source for some inspiration,
but didn't get any.  That said, MDC and NDC are very thin
wrappers over a ThreadLocal HashMap and Stack.  At least
it is in my implementation.  But given the spec'd interface
and the use of ThreadLocal Map and Stack, I can't imagine
an implementation that looks too different than this one.

On using NDC.clear() and NDC.remove():
--------------------------------------
After considering how ThreadLocals worked, and reading a
couple of tutorials on using MDC logging for web services,
I gained a better understanding of when and where to use
M/NDC.clear() and M/NDC.remove().  Technically, clear()
removes all entries from the context store, calling
Map.clear() or Stack.clear(), but preserves the ThreadLocal
Map or Stack itself.  New context may then be created
with put() or push().  M/NDC.remove() not only clears
the context store destroys the context store, removing
the ThreadLocal (via ThreadLocal.remove()) and allowing
it get GC'd.

The doc indicates that a thread should call remove()
before exiting so that the ThreadLocals get GC'd. But
the ThreadLocals seem to be somewhat expensive to create
and destroy.  For cases where a thread will get reused,
and desire diagnostic context, it is preferred you use
clear(), rather than remove() and recreate.  This is
useful in thread pools, where we can clear the diagnostic
content just before returning the thread to the pool.

Given this knowledge, it became clear the my initial use
of clear() and remove(), was quite flawed in both the
locations and consistency of use.

I have subsequently changed my model and pushed these calls
out to the edges of the product.  Threads we create, like
those in TraversalScheduler and WorkQueue, now call
NDC.remove() just before they exit their run() method.
If anyone knows how to hook an onExit() type call on a
Java Thread, please let me know.  For the Servlets, the
threads belong to Tomcat, but they originate in a thread pool.
In those cases, I employ a try {} finally { NDC.clear(); }
strategy at the servlet interfaces [typically in doGet(), and
doPost()] to clear the logging context before leaving the
Connector Manager servlets.  The handlers no longer have
calls to NDC.clear() or NDC.remove(), instead employing
NDC.pop() if appropriate.  I am also less stringent with
using pop() inside a try/finally construct for shallow
logging contexts, as abnormal exits from the handlers
would have their logging context cleared up by the
NDC.clear() in the outermost finally{} clause.

On logging configuration in Tomcat:
-----------------------------------
The logging of diagnostic content is done using a
custom logging Formatter assigned to the logging
FileHandler.  In our shipping Tomcat configuration,
we bypass Tomcat's JULI logging configuration, and the
webapp's logging.properties specifies the use of
java.util.logging.FileHandler for the logs.  In order
for java.util.logging.FileHandler to locate my custom
Formatter, that Formatter *must* be on the system classpath.

I have packaged the com.google.enterprise.logging.*
classes in a separate jar file, connector-logging.jar,
that at this point would get installed in
${catalina.base}/webapps/connector-manager/WEB-INF/lib
along with the rest of the Connector Manager jar files.
That jar file must be placed on the system classpath.

For security reasons, Tomcat ignores the CLASSPATH
environment variable at startup, so classpath configuration
is done via the scripts ${catalina.base}/bin/setclasspath.sh
or ${catalina.base}/bin/setclasspath.bat.  You need to
modify these scripts, adding the connector-logging.jar file
to the CLASSPATH variable as it it being built.  Our
Installers already modify the classpath here when installing
the Documentum connecter.  The Installer will now need to
modify the classpath for all connector installations.
For instance, my Tomcat setclasspath.sh looks like:

# Google Enterpise Connector Logging
CONNECTOR_LOGGING="$BASEDIR"/webapps/connector-manager/WEB-INF/lib/connector-log
ging.jar
if [ -f "$CONNECTOR_LOGGING" ] ; then
  CLASSPATH="$CLASSPATH":"$CONNECTOR_LOGGING"
fi

At this point, I don't think it would be advantageous to
try to install the connector-logging.jar file in Tomcat's
shared/lib or common/lib directories, especially on Tomcat 6,
as they don't exist by default.

Installer writers, the BASEDIR variable is supplied to the
setclasspath script by the Tomcat's startup script.  I encourage
you to use it here, and not try to put the full pathname to
the jar file into the script.

Once the classpath has been set, you need to specify the
custom logging Formatter in the webapp's logging.properties file,
located at
${catalina.base}/webapps/connector-manager/WEB-INF/classes/logging.properties

In that file, change the Formatter as follows:
java.util.logging.FileHandler.formatter = 
com.google.enterprise.connector.logging.SimpleFormatter

You may optionally set the logging layout conversion pattern
as well (the one specified here is the default):
com.google.enterprise.connector.logging.SimpleFormatter.format = %d{MMM dd, 
yyyy h:mm:ss a} [%x] %C 
%M%n%p: %m%n

You should probably remove the vestigial log4j conversion
patterns.  They aren't even used in our shipping product:
java.util.logging.FileHandler.layout.ConversionPattern = ...
java.util.logging.ConsoleHandler.layout.ConversionPattern = ...

At this point, given my custom Formatter, I can't easily figure
out which logging Handler it is bound to, otherwise I too would
look for the java.util.logging.FileHandler.layout.ConversionPattern
property.

On XMLFormatter and Chainsaw:
-----------------------------
I haven't written an XML formatter for this yet, but I suspect the output will
look very much like java.util.logging.XMLFormatter, with the NDC/MDC
information either as attributes on the <record> tag or as a separate element.
Either way, I could wrap java.util.logging.XMLFormatter, and just add my little
bit of MDC/NDC info.  I want to play with Chainsaw a bit before I try.

Re: Testing with connector instances:
I've been testing this with three concurrent instances of the stock 1.3.2
Livelink Connector and it works great.  Another thing I've been testing is
a 1-line format, rather than the java.util.logging.SimpleFormatter 2-line style.
1-line formats make grepping the logs *much* easier to do.

Original comment by Brett.Mi...@gmail.com on 17 May 2009 at 7:30

GoogleCodeExporter commented 9 years ago

27 May 2009 Connector Manager revision r2091 adds XmlFormatter.

Add a custom XML logging formatter that supports NDC logging.
The custom XML logging formatter is configurable - it can
generate output that resembles java.util.logging.XMLFormatter
output or output that resembles log4j XMLLayout output.

com.google.enterprise.connector.logging.XmlFormatter extends
java.util.logging.Formatter, inserting an NDC element.

The XML syntax generated is mildly configuratble by a
configuration paramater in logging.properties.  By default,
XmlFormatter generates XML that resembles the output from
java.util.logging.XMLFormatter, with the addition of an
NDC element to the log record.

The java.util.logging XML output is easily human readable, however
the popular log analyzer tool, Chainsaw, while it can read
java.util.logging files, does not recognize the NDC addition.

This formatter supports a configuration property that will
change the generated output to resemble that of log4j,
rather than java.util.logging.  The log4j-compatible output
is parsable by Chainsaw, including the additional NDC data.

To use this custom XML log Formatter, configure in logging.properties:
classname.of.logging.FileHandler=com.google.enterprise.logging.XmlFormatter

To enable Log4j-compatible output, add the property:
com.google.enterprise.logging.XmlFormatter=log4j
  or
com.google.enterprise.logging.XmlFormatter=chainsaw

Original comment by Brett.Mi...@gmail.com on 29 May 2009 at 4:27

KalibriCuga / google-enterprise-connector-manager

Have a logging mechanism with instance-level granularity and separate log files. #107