Open huxi opened 8 years ago
As far as log level for error codes, I'm fine with WARN
but I do know that there are other use cases (hyperic) where WARNs will start cascading error alerts.
I'd be happy to see this implemented
@ratpack/committers thoughts?
Forcing the locale to English makes sense.
Wouldn't the status code markers then make us not be using NCSA format?
Slf4j message pattern… yes please.
Regarding the warning logging, I don't think that should be the default. I could be convinced of an opt in though.
More tests… yes please.
The markers are invisible if not explicitly enabled in the message pattern.
In log4j2, a config making use of markers would look like this
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="INFO">
<Appenders>
<List name="requestList" />
<List name="serverErrorRequestList">
<MarkerFilter marker="status-5xx" onMatch="ACCEPT" onMismatch="DENY"/>
</List>
<List name="errorRequestList">
<Filters>
<MarkerFilter marker="status-5xx" onMatch="ACCEPT" onMismatch="NEUTRAL"/>
<MarkerFilter marker="status-4xx" onMatch="ACCEPT" onMismatch="DENY"/>
</Filters>
</List>
</Appenders>
<Loggers>
<Logger name="ratpack.request" level="INFO" additivity="false">
<AppenderRef ref="requestList"/>
<AppenderRef ref="serverErrorRequestList"/>
<AppenderRef ref="errorRequestList"/>
</Logger>
<Root level="INFO"/>
</Loggers>
</Configuration>
(List
appenders are only used for testing...)
The above serverErrorRequestList
would only receive events with a 5xx
status code while errorRequestList
would receive both 4xx
and 5xx
.
My personal use-case is writing 4xx
and 5xx
events into two separate files while ignoring all other events so performance isn't degraded unnecessarily.
Making the WARN
optional sounds like a good plan. I was a bit uneasy about this part myself since it would change the current behavior while all the other changes are "under the hood".
@huxi sounds good.
Any preference how this option should be exposed in the public API?
Current API is RequestLogger.ncsa()
and RequestLogger.ncsa(Logger)
. This would still generate a RequestLogger
that is logging at INFO level regardless of the status code.
One way would be to add RequestLogger.ncsa(boolean loggingErrorStatusCodesAsWarn)
and RequestLogger.ncsa(Logger, boolean loggingErrorStatusCodesAsWarn)
but I think it would be nicer/cleaner to use a different method names instead of a boolean
toggle.
How should I name those methods?
@huxi I would suggest adding a configuration type object to NcsaRequestLogger
that would handle these types of configurations.
Then in the API, would we would retain RequestLogger.ncsa()
(must keep this for compatibility) and add RequestLogger.ncsa(Action<? extends NcsaRequestLogger.Config> configure)
@johnrengelman But that would leak the internal class ratpack.handling.internal.NcsaRequestLogger
into the public API, right?
You can make the config class to be part of RequestLogger
...just would need to be generic enough...
Or It could be a NcsaLoggerConfig
that's part of the public API.
Fundamentally, I think we need to allow the user to specify a function that decides how the request should be logged.
I think we should add this to the RequestLogger
class…
interface Router {
// Receives the formatted log message, and logs it somewhere (e.g. slf4j logger)
void log(RequestOutcome requestOutcome, String msg, Object[] args)
}
static RequestLogger ncsa(Router router)
The router msg
and args
params will be in the Slf4j message format. If the logging is being routed directly to slf4j, then they can just be passed through. If they are going elsewhere, users can use Slf4j's MessageFormatter.
@alkemist The problem regarding the Router
interface is that logging itself is just one part of the problem.
The other part is checking whether logging is actually necessary via isInfoEnabled(Marker)
or isWarnEnabled(Marker)
respectively.
The router can do whatever it wants, including these checks.
This is a more general solution that allows the user to log the message however they want.
I was wondering whether logger.info(...)
/logger.warn(...)
calls should be put into a Blocking
block. Logging can be asynchronous but doesn't have to be...
That gets messy very quickly. I think it's getter to just recommend that people use async logging.
@alkemist That's true re:"router can do whatever it wants" , but the pattern and parameters would have to be created before the Router.log
is even called while they won't if isXxxEnabled(Marker)
already declines. Only status code is evaluated at that point.
Let's add a method to RequestLogger
…
static String ncsaFormat(RequestOutcome requestOutcome) {
…
}
That way people can create custom implementations of RequestLogger
, and use this method to generate the log line when/if they need to and log however they wish. I don't think it's worth complicating the API by returning a format/args tuple here. Instead we should rely on the user to only call this when it is actually going to be logged.
What would be the advantage vs. current RequestLogger.log
and a custom implementation of the interface? I thought ncsa()
was just meant to hide the internal class from the public API.
(I have to admit that I really need to take a nap. I might miss something...)
The useful thing about ncsa()
is that it gives you the log line in the right format. What this change is proposing is separating that from how it is logged. So, if you require something more sophisticated than the current strategy you can impl a custom logger but still get the right formatting.
This would probably be a good use-case for logback-access...
I wanted to implement a lb-access RequestLogger
after this thingy. The problem is that I would first have to refactor some parts of Logback (see LOGBACK-1142).
@huxi I've rearranged things to detangle the formatting from the routing, see here: https://github.com/ratpack/ratpack/commit/454e191645481c632e4d54e525a1c8e39f2a1352
WDYT we should do now? We could either add sugar for the marker scheme that you proposed, or leave it as it is and recommend users use a combination of RequestLogger.of()
and RequestLogger.ncsaFormat()
if they need something more sophisticated.
@alkemist I honestly don't know what to do next.
I initially proposed:
- Use slf4j message pattern for formatting instead of the currently used String.format.
This provides better performance, formatting is performed lazily and it also opens the opportunity to filter log-events based on the message pattern. I'm aiming for two distinct patterns: one including the optional id and one without.
to which you replied:
Slf4j message pattern… yes please.
Then I implemented everything as previously discussed, including tests, including fixes to tests that are mocking the logging (Making them fragile to internal changes, naturally. Those test fixes wouldn't have been necessary if the tests had used e.g. the log4j2 test support classes instead of mocking the logging framework) and sent a pull request.
And then you proposed - and implemented - something that is essentially the opposite of my original suggestion, eagerly formatting the log messages in a Formatter
instance to always log using a String
instead of ever using a message pattern that can be used for filtering and lazy message evaluation.
I obviously failed to explain the rationale behind my proposed changes and don't understand what made you change your mind about it.
@huxi this isn't necessarily done. One thing that I want to ensure about the end solution is that the what to log is detangled from the where and when.
The performance argument for creating the string doesn't wash. There's no way that parsing out the pattern is faster than building the string. If it has function though, we should propagate it in that fashion. Let me change the formatter to pass the pattern and args.
@alkemist It's faster if it's not done at all because a filter declined creation of the logging output based on Marker
, message pattern or arbitrary other rule. In that case, only an Object[]
containing the already existing message arguments is created and later dropped.
With my proposal, the actual log message is only ever actually created if the logging framework decides it needs it. With your proposal, the message is always created and then dropped if the logging framework decides it's not interested after all.
In my actual use case, your proposal would format log messages for 1xx
, 2xx
, 3xx
events (i.e. the vast majority) even though I'm actually only interested in 4xx
and 5xx
messages. And I express my interest in 4xx
and 5xx
in the logging configuration, not the code of the app.
This "expression of interest" can be arbitrarily complex (Groovy code in the logging configuration vs. just using Level
or Marker
) and the only way to prevent unnecessary creation of strings is to use message patterns and creating the message itself only if needed, lazily.
It's not just about the immediate performance of formatting the message (slf4j is pretty efficient in that regard and way faster than String.format
) but also about additional GC pressure introduced by the creation of actually unnecessary strings.
To give another example: a SocketAppender
wouldn't actually format the message at all since events are transferred using message pattern and argument strings.
That's all fine. I'm happy to pass the pattern and args around.
That doesn't obviate the need to separate the formatting from the routing though. What I think we'll do is change the formatter to close over the router, so it effectively acts as an inline transform. This is the cheapest arrangement I can think of.
I don't see how putting pattern and args into the interface would help.
Logging with arguments ["foo", "bar", "foobar"]
will always result in a message like "*foo*bar*foobar*"
since the order of the arguments is fixed. One wouldn't be able to create a result like "foobar foo bar"
in that way. The only thing one could achieve would be to replace the *
above with some other strings.
Is that actually a use case? It definitely wouldn't be an ncsa
entry anymore.
I think I see what you would like to achieve and separation of concerns is usually a good idea but it directly prevents achieving maximum logging efficiency in this case. The concerns are already separated and formatting the log message is supposed to be the concern of the logging framework.
Let me tackle this from a different perspective since I don't really understand that point myself.
What would a String Formatter.format(String messagePattern, Object[] messageArguments)
interface be trying to achieve?
Implementations should perform formatting of slf4j-like message patterns. format("{} {} {}", ["foo", "bar", "foobar"]
should always result in the string "foo bar foobar"
.
Implementations should perform formatting according to different formatter styles. One implementation would understand slf4j-like message patterns while another implementation would format using printf
-style placeholders.
If Assumption A is the case then I could only imagine exactly one reasonable implementation of the interface: using org.slf4j.helpers.MessageFormatter
, since that implementation is already on the classpath.
I see no benefit of using the Assumption A formatter versus just calling the logger method with the exact same arguments as the formatter and let the logging framework perform the formatting lazily.
If Assumption B is the case then the formatter abstraction would be leaky since formatters wouldn't be exchangeable. A request logger would have to know exactly what kind of pattern the used formatter is expecting and would have to provide messagePattern
and messageArguments
in the "correct" way.
Neither Assumption A nor B would enable us to use some other more sophisticated message formatting techniques like "{requestId}" pattern with messageArguments
being a Map<String,Object>
.
What we want to end up with is something like this:
RequestLogger
has-a RequestLogger.Formatter
has-a RequestLogger.Router
interface RequestLogger {
interface Router {
void log(RequestOutcome requestOutcome, String pattern, Object[] values)
}
interface Format {
RequestLogger logger(Router router)
}
…
Format ncsaFormat()
…
}
We'd then bust up your change into a Format
implementation, and a Router
implementation or two. We'd at least ship a simple one that just routes to an slf4j logger at INFO level. We would need to then consider whether the extra functionality you have proposed should be merged into this, or should be a separate Router
implementation.
For completeness, a trivial routing implementation would look like this…
RequestLogger.ncsaFormat().logger((requestOutcome, pattern, args) -> {
if (requestOutcome.getStatus().is5xx()) {
RequestLogger.LOGGER.error(pattern, args);
} else {
RequestLogger.LOGGER.info(pattern, args);
}
})
But that still wouldn't work for the two different patterns MESSAGE_PATTERN_WITHOUT_ID
and MESSAGE_PATTERN_WITH_ID
that are used depending on the existence of a requestId
.
Ah, just took a look at your latest changes. It will create the args[]
regardless of any quick-exit checks like isInfoEnabled(Marker)
/isWarnEnabled(Marker)
.
Such a Router
interface would need two methods:
interface Router {
boolean isProbablyLogging(RequestOutcome requestOutcome);
void log(RequestOutcome requestOutcome);
}
but even then it would have to resolve the used Marker
twice to keep the interface implementation agnostic.
I think that might just be the cost to pay. If someone really wants to do that, they can put something in front of the formatter and “router”. We have to trade something off here.
The thing is that I don't see the big advantage of the router interface that would justify a performance impact. My implementation of the RequestLogger
interface is ideal (to my knowledge*) regarding performance and reduced garbage collection pressure. One could as well argue that someone would just have to implement the RequestLogger
interface if they want something else. It's not a complicated interface at all to begin with.
While I see one use case beside WARN
/INFO
logging (Blocking
logging call vs. just logging) I have a feeling that we are currently overengineering a simple, compact piece of code.
(* benchmarking might find out that a lookup on a Map<Integer,Marker>
with 599 entries is even faster than my int comparisons to resolve the correct Marker
- but I doubt it.)
Performance is only a part of this. We have to trade that off against convenience, but leave enough space in the pieces that someone can take complete control if they require that level of control.
The key thing that my changes are trying to address is decoupling the formatting from the opinions about how it is to be logged. It's reasonable to expect that someone would want to log in NCSA format (and whatever other formats there end up being) but using a different scheme. That should be possible. It wasn't before.
But then you'd need something like this:
interface Router {
// not functional.
// could be split into two functional interfaces, but that would allow invalid combinations.
boolean isProbablyLogging(RequestOutcome requestOutcome)
void log(Message message)
}
interface MessageProducer {
Message produceMessage(RequestOutcome requestOutcome)
}
class Message {
String pattern
Object[] arguments
}
Because you wouldn't want to make routing interchangeable while leaving the generation of the message hardcoded. And you'd like to mix and match different message formats with different routing strategies. Why shouldn't they be decoupled? Nothing would prevent this.
I can only guess what a "convenient" builder method in RequestLogger
would have to look like.
RequestLogger.with(LoggerFactory.getLogger("my.request.logger")).
andFormatter(new NcsaFormatter()).
andRouter(new InfoLevelRouting()).
build()
I don't think that this is a good idea. I'm just walking down this path to the end.
As I said previously:
This would probably be a good use-case for logback-access...
In that case, the whole routing and message formatting would be handled in the logback-access configuration (i.e. a logback-access.xml
file) since it's working directly on an IAccessEvent abstraction.
Such a logback-access
RequestLogger
implementation wouldn't use the above abstractions at all. Neither would some imaginary SNMP implementation use it. They are strictly limited to implementations using an slf4j (or log4j2) logger as their backend and thus shouldn't be part of any public RequestLogger
API. They are implementation details leaking to the surface.
My suggestion was:
Deliver one "perfect" implementation that people would receive by calling the ncsa()
method as before, an implementation that would suit 99.99% of the users "as is", while not changing the public API or the perceived behavior at all.
They would simply gain better performance in Ratpack 1.3 vs. previous versions. Especially, but not exclusively, in the case where logging isn't actually performed. I need exactly this (minimal performance impact in case of no-op) in production and wanted to give those benefits back to the community.
This is what my pull request was about and this is what my pull request achieved, including exhaustive tests. It's not necessarily the end of the story but it was a worthy incremental improvement that could have been merged with no damage done whatsoever.
When I asked
Any preference how this option should be exposed in the public API?
25 days ago, I was already done with all changes except the exposure of the new functionality (logging 4xx
and 5xx
as WARN
instead of INFO
, optionally). Seeing that there was no immediate consensus about how to expose the new functionality, I decided not to expose it in the public API at all, for the time being, and just created the pull request.
And now we are discussing API additions for the 0.01% users that aren't happy with the default behavior, under the assumption that implementing a single method interface (RequestLogger
) wouldn't be convenient enough for them. An API that would directly interfere with my specific need (me being one of those 0.01%, i.e. the target audience of that API) for maximum no-op performance, therefore forcing me to implement the single method interface in my own code base all over again.
Isn't that kind of ironic?
I’d like to enhance
NcsaRequestLogger
a bit:change the
DateTimeFormatter
used to always useLocale.ENGLISH
(not Zone). This is a trivial change, simply adding.withLocale(Locale.ENGLISH);
to the private staticFORMATTER
.A timestamp like
[14/Mär/2016:15:01:10 +0100]
(the output of the current implementation in Germany) is problematic to parse and also a highly uncommon date format. I consider this one a bug.“status-2xx”
,“status-4xx”
etc.) to enable better filtering options in the logger configuration. I mean general markers for all status codes of a given class (Informational 1xx, Successful 2xx, etc.), not markers for specific status codes. "status-4xx" but not "status-404".Use
slf4j
message pattern for formatting instead of the currently usedString.format
.This provides better performance, formatting is performed lazily and it also opens the opportunity to filter log-events based on the message pattern. I'm aiming for two distinct patterns: one including the optional id and one without.
this one might be a bit controversial...
I’d suggest to log at
WARN
level in case of 4xx and 5xx or if status code is not in the range 100-599...write related tests.
Current spec only checks if info is called twice for two requests to Ratpack.
I'd start implementing those changes and preparing a PR after the core devs all had a chance to comment.