Closed yurishkuro closed 8 years ago
Meant to reply earlier.. here it goes!
Main thing is that this moves more explicitly into runtime checks (as opposed to method signatures) for use cases such as dispatch. This moves the extensibility away from the api and into the implementation. This works better for languages that have pattern matching
Pardon the crap stuff below, but it would look something like below to dispatch based on keys in scala and clojure. Note in some cases, ex clojure, the input isn't required to be an association (ex it could be a struct):
Ex.
log match { case Map("level" -> "info") => doInfo
(let {:keys [msg level] :or {level "info"}} log)
From an interop with java language perspective, there's a few follow-ups that we'd need to consider if we went with a pattern approach as opposed to method dispatch:
My biggest conceptual concern is that this "kicks the can down the road" (or "kicks the can up the stack"??) as far as the semantic distinctions between these various logging use cases. I.e., it weakens the "instrument once" Tracer interchangeability story.
FWIW / side-note: what you are suggesting here is essentially equivalent to what go-kit does: https://github.com/go-kit/kit/blob/master/log/log.go#L18
@adriancole
What is the type signature? Would the input be Map<String, Object>?
Yes, but Map is an alternative signature, the primary one could be a variadic method, or a set of overloaded methods like of(...)
in https://github.com/google/guava/blob/master/guava/src/com/google/common/collect/ImmutableMap.java. A builder pattern can also be used. Any way we turn it, Java simply doesn't have an efficient way of passing variadic arguments without memory allocation, except for overloading the log
method with 2, 4, 6, ... arguments.
Maybe we should show how the common dispatch cases might be implemented?
First, the dispatch does not need to happen in Java. If we are talking about tracers, then all they need to do is pack the set of key/values from the log method into a span and let the backend deal with interpretation. The example of dispatch (with some pseudo-code):
interface LogTagHandler {
handle(Object value, LogData log);
}
Map,String, LogTagHandler> handlers = ImmutableMap.of(
"level", new LevelLogTagHandler(),
"payload", new PayloadLogTagHandler(), // etc
);
Span log(Object... tagsAndValues) {
LogData logData = new LogData();
for( tag, value in tagsAndValue) {
handler = handlers.get(tag.toString());
if (handler != null) {
handler.handle(value, logData);
} else {
logData.appendCustom(tag, value);
}
}
}
But again, this only needs to happen way down the line in the collection pipeline. The actual tracer implementation, where performance matters the most, does not need any dispatching.
@bensigelman
My biggest conceptual concern is that this "kicks the can down the road" (or "kicks the can up the stack"??) as far as the semantic distinctions between these various logging use cases. I.e., it weakens the "instrument once" Tracer interchangeability story.
Right now we have a pretty odd looking set of logging methods, covering only an odd subset of use cases. And we've been arguing for months before we even got those in. My concern is that we'll be stuck for a long time if we try to come up with clean semantic distinction, via method signatures, to the use cases outlined above. I think we'll have a much easier time converging on a common set of logging tags, just like we did for span tags, while having a simple and generic logging API. What I also like about it is the symmetry with the span tags. The API is straightforward and can be fixed for a long time, without worrying that v 1.0 tracer is incompatible with v 1.1 user code because we happen to change some argument in the log method. Lastly, strictly speaking, a structured logging API does not require any additional semantic distinctions, e.g. if you have a backend (like ELK) that properly indexes structured logs, then the end users can always find what they are looking for without defining semantic meaning of the tags. That's not to say that we shouldn't define some standard tags like exception
, event
, level
.
What is the type signature? Would the input be Map?
Yes, but Map is an alternative signature, the primary one could be a variadic method, or a set of overloaded methods like of(...) in https://github.com/google/guava/blob/master/guava/src/com/google/common/collect/ImmutableMap.java. A builder pattern can also be used. Any way we turn it, Java simply doesn't have an efficient way of passing variadic arguments without memory allocation, except for overloading the log method with 2, 4, 6, ... arguments.
Ah ok.. so this could well end up similar to how finagle works (depending on impl). For example, they have an "Record" type which encloses a few things needed in general, like traceId, timestamp, and annotation. The latter is what's extensible. For example Annotation could be ClientRecvError (error: String) In this case, the case class (no pun intended), is what's allowing the extensibility and the case class is also what allows for dispatch based on potentially multiple parameters.
Maybe we should show how the common dispatch cases might be implemented?
First, the dispatch does not need to happen in Java. If we are talking about tracers, then all they need to do is pack the set of key/values from the log method into a span and let the backend deal with interpretation. The example of dispatch (with some pseudo-code):
interface LogTagHandler { handle(Object value, LogData log); } Map,String, LogTagHandler> handlers = ImmutableMap.of( "level", new LevelLogTagHandler(), "payload", new PayloadLogTagHandler(), // etc ); Span log(Object... tagsAndValues) { LogData logData = new LogData(); for( tag, value in tagsAndValue) { handler = handlers.get(tag.toString()); if (handler != null) { handler.handle(value, logData); } else { logData.appendCustom(tag, value); } } }
I'd note that there's collaboration between keys in your earlier example (ex where log level is also present). This is what makes this sort of dispatch more tedious in pure java vs something like a scala case class.
But again, this only needs to happen way down the line in the collection pipeline. The actual tracer implementation, where performance matters the most, does not need any dispatching.
Very good point. What you are saying is that these events are placed into a queue or something and the reporting thread is the one that bears any overhead involved in parsing them etc (ex the calling thread only needs to dump it into a queue, which is practically overheadless)
What you are saying is that these events are placed into a queue or something and the reporting thread is the one that bears any overhead involved in parsing them etc (ex the calling thread only needs to dump it into a queue, which is practically overheadless)
Actually, what I meant is that even reporting thread does not need to do any processing of the logging tags. Let's say we use Thrift for out of process span reporting, so I would define something like this:
struct Span {
...
9: list<Log> logs
}
struct Log {
1: int64 timestamp
2: list<LogTag> tags
}
struct LogTag {
1: string key
2: string value
}
So the only processing that happens in-process and affects the end user performance is:
span.log(...)
method needs to be called, usually requires 1 memory allocation for varargsLog
thrift struct without interpreting themOnce the span data is shipped out of process to the Collectors, only there do they need to start interpreting the tag keys.
I'd note that there's collaboration between keys in your earlier example (ex where log level is also present).
I would assume that what the Handlers do is simply populate a wider LogData struct where special tags are pulled out as top level fields, i.e. LevelTagHandler would set a level
field on that wider struct. So in practice there's little interaction needed between the handlers.
I'm coming around to this proposal. I do think it "kicks the can up the API" per my previous note, but – even if that's true – there is value in making the lowest level of the logging API completely general.
I want to think through the implications for OpenTracing users who just want to do the simple thing (e.g., logging informational messages). Using Go as an example, I can imagine something like
import "github.com/opentracing/opentracing-go"
import "github.com/opentracing/opentracing-go/logging"
func foo(sp opentracing.Span, bar int64) {
// All of the following are equivalent.
//
// Option 1: Platform-idiomatic formatted messages:
logging.Infof(sp, "bar=%v", bar)
// Option 2: Canonical logging keys:
sp.LogKV(
logging.MessageKey, fmt.Sprintf("bar=%v", bar),
logging.LevelKey, 0,
)
// Option 3: Full control at the lowest level of the API:
sp.LogKV(
"message", fmt.Sprintf("bar=%v", bar),
"level", 0,
)
}
I can get enthusiastic about "option 1" above and the idea of logging.Infof
(and probably similar things for events, exceptions, etc). IMO options 2 and 3 don't pass the "OpenTracing should be easy and lightweight" sniff test, though I see the value of those layers for the sake of generality.
Anyway, if we can agree that OpenTracing should try to offer things like option 1 above, I see the merits of making the Span
API per se just pure key-value (and doing higher-level logging stuff in some other package / class / location).
Side note... @yurishkuro said
So the only processing that happens in-process and affects the end user performance is:
- span.log(...) method needs to be called, usually requires 1 memory allocation for varargs
- background thread packs the tags into the Log thrift struct without interpreting them
I like the idea of log payloads that can be arbitrarily large with the caveat that the tracing impl will not necessarily serialize all of them... i.e., the amount of payload that makes it downstream to the tracing system is best effort, perhaps beyond some sort of guaranteed minimum. In that case there is still key:value computation that needs to happen before the data gets out of the process, but I readily admit this is not something that is necessary, it's just something that can be done in order to make some attractive feature (like large structured logging userdata payloads) feasible.
Thanks for clearing this up, basically this is moving most logic out-of-process ie to the collectors. In fact anything structural is only enforced after it leaves the process.
This is a variant of @mattheath's phosphor, except taking less responsibility in the api or transport form. In phosphor's case (Annotation), you can timestamp key/values in an annotation. However, there's certainly fields in the struct. This helps avoid the cost of serializing well-known keys and making everything subject to interpretation.
https://github.com/mondough/phosphor/blob/master/proto/trace.proto#L5
There are likely timestamped map shippers out there. Ex some general obs tools do things like this. However, I wonder if it isn't too generic, as you could indeed write a spec like this in any project. It is not more or less helpful here since it has basically no opinion except time, which is not that opinionated.
@bensigelman I am not that much in favor of infof
style of logging, primarily for these two reasons:
{}
or {n}
or even %s
style of formatting. I don't see how we can possibly agree on one of those.infof
style logging provides many benefits over structured logging. Practically any infof
-style log message can be rewritten in the form (msg, k1=v1, k2=v2...), e.g. log("cache hit", obj_id=123, ttl=123)
. Structured logs can still be quite readable (cache hit: obj_id=123, ttl=123
), but they are much better suited for machine searching and aggregations. By exposing only structured logging API we nudge the users to write better logs imho. Ironically, your 3 examples illustrate the issue perfectly - they would've been much better off written with bar
being a tag instead of a formatted message.@yurishkuro re infof
-style logging: I don't have religious feelings about it, I was just using infof
as an example of something that would make for awkward user code (without a wrapper function like logging.Infof(span, fmt, args...)
).
On that note, for this particular (github) issue, I don't need to know whether we're going to support infof logging specifically... but I still want to advocate for convenience functions that help with the (standardized) structured logging key:values. E.g.,
import "github.com/opentracing/opentracing-go"
import "github.com/opentracing/opentracing-go/logging"
func foo(sp opentracing.Span, bar int64) {
if err := something(...); err != nil {
// Option A:
logging.LogException(sp, err)
// Option B:
stackPCs := make([]uintptr, kMaxDepth)
frameCount := runtime.Callers(1, stackPCs)
stringFrames := make([]string, frameCount)
for i := 0; i < frameCount; i++ {
func := runtime.FuncForPC(stackPCs[i])
stringFrames[i] = ... // format func into a string.
}
sp.LogKV(
"error", err.String(),
"stack_trace", stringFrames,
)
}
}
The "Option B" story gets even more unmanageable if we try to provide structure within each stack frame (e.g., separating the symbol name and func:line info).
Anyway, my point is not that we should necessarily support infof
, but more that we should try to create platform-idiomatic wrappers for common logging use cases.
@bensigelman I have no objections to util functions that eventually delegate to a single LogKV
-style method. The only feature that is hard to support with such delegation is if we wanted infof
-style util methods with deferred evaluation. If you're ok with util methods using fmt.Sprintf
to resolve to a single message string, then we're good.
It's time to move this forward... the basic idea is straightforward, and really most of the interesting part is just deciding on the right idiomatic interface for each language.
I have three additional requirements to propose, though... discussion/dissent about these is welcome:
log
method that accepts a timestamp (and not as a key:value pair, but as an explicit parameter)Part of me wants to wait on the functions/closures, but at least thinking about them might help us pick better typedefs and so forth.
I should also point out that 1. and 2. above are already handled by OT.
@bensigelman
[]LogData
to avoid locking on the mutex on each LogData
?There's also one migration concern. We probably want the most generic k/v log method to be called Log(...interface{})
, but we need to make sure it does not clash with the possible existing usage of Log(LogData)
(e.g. by dropping the type LogData
)
@yurishkuro
why 1? that is, why not via k/v?
My argument: the timestamp is a required argument and thus can be split out... it's just a required argument with a default of now()
:)
is 2 means for something like []LogData to avoid locking on the mutex on each LogData?
Yeah, just a fast-path for translations. I guess "requirement" is the wrong word... we could always add it later since it's an optimization. But something to think about as we decide on types and so forth.
Is 3(ii) meant to capture JSON-like structures? If so, map isn't enough, JSON may contain arrays.
I could see it either way. If we support arrays, are they allowed at the "top level"? If so, how to distinguish between interleaved KV and an array?
There's also one migration concern. We probably want the most generic k/v log method to be called
Log(...interface{})
, but we need to make sure it does not clash with the possible existing usage ofLog(LogData)
(e.g. by dropping the typeLogData
)
I agree that there's a question here, but I think we can/should address Go stuff in the Go PR... for this issue I was hoping to focus on pan-language semantics.
Some notes from an in-real-life conversation with @yurishkuro:
People who care about this issue should check out https://github.com/opentracing/opentracing-go/pull/108
I think that having only key:value pairs over-generalizes this. By only kind-of-formalizing common key names, trace implementations will not easily be able to build upon a minimum set of required information. If OpenTracing is used in multiple 3rd party libraries (e.g. web frameworks, ORMs, ...) they might all end up using different values - or even different data types - especially for level
.
My main question is how you intend this feature to be used:
If so, filtering will be a very important requirement. As soon as you have many logs, you want to be able to filter them before they are sent away (to reduce traffic). This is very similar to the general "sampling" but on a per log basis. Imagine having multiple libraries using OpenTracing to log messages. Some of them might be very verbose, some not. You might not always want to get all messages from all libraries.
Formalizing on these common fields would make such global filtering easier:
level
: I don't know of any logging system that doesn't have levels. Agreeing on a common set of values is hard but it's a must have if different components/systems use this functionality. If OpenTracing is built into 3rd party library (e.g. web frameworks, ...) they must all use the same values.component
: To allow filter combinations based on component+level (but maybe this could also be derived from the enclosing span!?I also think the following should be separate fields/parameters in the API:
timestamp
: Time is a very important concept in tracing and everything should be stamped. If not, how is ordering guaranteed? And if everything has a timestamp, why agree on a tag name instead of just having it a fixed field?exception
: In C# (and I guess Java as well), exceptions are big objects with a lot of useful information. If you limit the data types to string/number/bool again, users can no longer pass raw exceptions and have the trace implementation figure out a good way to extract useful information. Instead, everybody has to create a helper method which converts an exception into either a list of key value pairs (e.g. "exception.message", "exception.type", "exception.stacktrace", "exception.***" or use just one "exception" key and serialize the details into a single string - which probably can't be indexed well.eventId
/eventName
: something that identifies the entry - similar to operationName
. This would allow tracing systems to know which text to display in case of many key:value pairs. Also, from an API usability point of view, having only key:value pairs brings a weird difference between creating regular spans and logging - e.g:
StartSpan
/ Finish
, but a tag on the proposed Log
method? StartSpan
and in SetOperationName()
- why isn't it a tag also?fyi: this is how the new logging standard from ASP.NET Core (and many other .NET Core based libraries) defined its generic log method:
void Log<TState>(
LogLevel logLevel, // a fixed enumeration
EventId eventId, // a numeric identifier for the event
TState state, // a payload object for structured logging - can have many fields/values!
Exception exception, // loggers decides which fields to log etc.
Func<TState, Exception, string> formatter // a method which creates a "readable" version of state - eg for console output
);
They then offer convenience extension methods, which are added to the existing ILogger type without having loggers to implement them - e.g.:
void LogInformation(EventId eventId, Exception exception, string message, params object[] args);
void LogInformation(EventId eventId, string message, params object[] args);
void LogInformation(string message, params object[] args);
@cwe1ss sorry to be brief, happy to elaborate...
The idea here is to settle on the lowest level of the API. It may be that we eventually support higher-level logging concepts built on top of this foundation, but I would like to keep the Span API out of the business of coming up with a "CISC"-style (sorry for the 90s processor analogy) logging portfolio that will please all parties... it would quickly eclipse the rest of the Span API and would be – IMO – a distraction.
We would definitely want to standardize on certain keys, much like we have with certain tags a la http://opentracing.io/data/
Does that set your mind at ease at all?
yes, I definitely agree that logging is a completely different beast. That's why I think you have to be very careful with how you advertise this. It should be clear whether it should be used for "Order ABC processed for customer 123"-like log messages that would otherwise end up in ElasticSearch.
That's why I personally liked the "LogEvent"-naming. I understand this as a method that allows me to add tracing-relevant checkpoints to my span and nothing else.
Regarding the API, I think ease-of-use for users should be the first priority. Required fields (if there are such - like operationName for StartSpan) should be clearly visible as such.
fyi - regarding key:value pairs: C# has a nice feature called "anonymous types" which would allow users to do something like this (which looks a bit nicer than creating a dictionary):
var tags = new { Level = "info", KeyXY = "SomeValue", Message = "SELECT * FROM Customers" };
span.Log(tags);
However, this has quite a few drawbacks:
So I don't think that it would be a very good fit for this scenario.
If there are no complaints by mid-day on Monday I will start dealing with the repercussions of these various API sketches/RFCs... Again, just to have everything in one place:
Go: https://github.com/opentracing/opentracing-go/pull/108 Python: https://github.com/opentracing/opentracing-python/pull/36 Java: https://github.com/opentracing/opentracing-java/pull/47 Javascript: https://github.com/opentracing/opentracing-javascript/pull/53
(I don't think anyone but me cares about Objective-C :)
My largest concern with this change is that it is to generic and therefor, as @bensigelman said earlier
...weakens the "instrument once" Tracer interchangeability story.
To me the simplest form of logging contains a message string and a timestamp. There is a discussion why the timestamp should be a separate parameter, and I would prefer to have the message treated the same way. To this we could then add metadata for semantic logging, loglevel etc., in one generic form or another. The purpose of this would be that even if there are different ways of adding extra payload, at least a message string and a timestamp could be interpret by all tracer implementation.
With that said, I like the way the previous payload parameter now is formalized into a key-value array.
To me the simplest form of logging contains a message string and a timestamp.
I agree that it's the simplest, but I disagree that it's the most useful. In large scale distributed systems, logs are not written for humans to read, they are primary meant for machine processing. If the only thing that you have in the log is a scalar string message, the KV API allows you to log it just as easily, as log(event='your scalar string')
(Note the use of event
instead of msg
- a scalar string can only mean an event at a point in time). In most cases people want to log not just events, but some state, and in the traditional logging API the only way to log a state is with formatted strings like log("Found something=%s", xyz)
, which is easy to read but hard to machine-process. An equivalent KV format log({"event": "found", "something": xyz})
is not much harder to read, but significantly easier to machine-process, aggregate, alert on, etc.
The KV API we're proposing does not preclude a "simple message with a timestamp", and does not even make it significantly harder to log, but by being more general it allows a lot more expressibility for state logging.
This is done for Java, Go, Python, Javascript, and ObjC... C# and C++ are not stable yet and thus shouldn't block this issue. Thanks all, and esp @yurishkuro for making the suggestion in the first place.
Related to earlier discussions, including #30, #90.
I would like to propose to replace the current
log_event
andlog
methods with a single structured logging method:Why
This API addresses several different requirements that were previously raised:
log_event
can be achieved bylog(event="..", ...)
log(msg="...", ...)
timestamp
tagpayload
taglevel=info | error
exception=ex
tagImplementations
Specific API will depend on the concrete language features.
Python
Go
Can use either a variadic function with alternating keys and values, similar to go-kit's API:
or a zero-allocation approach used in https://github.com/uber-go/zap:
Java
Best format I can think of is a variadic method with alternating keys/values, similar to Guava's
ImmutableMap.of(k1, v1, k2, v2, ...)
. It can also be overloaded to accept aMap<String, Object>
.Javascript
Pass a dictionary
cc @bensigelman @dkuebric @michaelsembwever @adriancole