Closed CMCDragonkai closed 2 years ago
So the issue is that log data is actually allowed to be an arbitrary object, whereas here `info` etc. all take `ToString` as the type. The logger converts them to strings first.
Structured logging benefits from potentially arbitrary keys and from allowing nested structure, not just strings.
This means `Logger.info` and related functions should take a structured object as data instead of just `ToString`. They can still take a string of course if that's what is needed, but anything else will instead become a structured object.
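As a rough sketch of that idea (the names here are illustrative, not the library's actual API), the argument could be normalised up front:

```ts
type LogData = Record<string, unknown>;

// Hypothetical helper: a plain string becomes a one-key structured
// object under `msg`; any other object passes through unchanged.
function toLogData(data: string | LogData): LogData {
  return typeof data === 'string' ? { msg: data } : data;
}
```

A `Logger.info` overload could then accept `string | LogData` and call this before handing the record to its formatters.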
Ok let's not do `formatting.json`. It should be possible to just use `JSON.stringify` as the formatter. We can just special case it.
However as for the schema, GELF can work: https://docs.graylog.org/docs/gelf#gelf-payload-specification. Not sure about other ones.
Now the only thing left is the fact that data structures should be usable with `logger.info` as well. The `structlog` library takes a string as the message, plus additional arbitrary fields.
With this concept, it's also possible to build a context for further logging. And that just means a prefilled dictionary. Somewhat similar to creating child loggers as we do right now.
https://github.com/bitc/structured-logging-schema#message-and-payload
So it does seem all we need to do is to enable the usage of a payload structure.
```ts
logger.info('some message', { some: 'structure' }, JSON.stringify);
```
Then we have a default structure associated with the log... think of it as "global" context, like time and such. But then further context can be created for a subset of loggers...
And "canonical logging" means that the log gets built up and then output at the end. However that doesn't cover a span. You want to know when it started and stopped too.
It's possible that we combine the structure with the message too, but that might be a bit messy. It really depends on whether a log always requires a message or whether it can be optional. For example:
```ts
logger.info({ message: '...' }, JSON.stringify);
```
Then a default key can be mandated.
Another example: https://www.elastic.co/guide/en/ecs/current/ecs-reference.html. In this case, it does seem that special casing a message is the wrong way to go; if an object is used, you just need to specify your own schema like `msg: 'hello world'`.
Won't build in any special schemas. These can be dictated custom by users; the PK application can choose to use something custom, both for the formatter and the structured data.
Contexts for spans and traces can be done through having a prefixed object context. This can also be added to a child logger during construction.
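A minimal sketch of that prefilled-context idea (class and method names here are assumptions, not the actual implementation):

```ts
type LogData = Record<string, unknown>;

// Hypothetical context-carrying logger: child() extends the prefilled
// dictionary, and every record merges the accumulated context.
class ContextLogger {
  constructor(private context: LogData = {}) {}

  child(extra: LogData): ContextLogger {
    return new ContextLogger({ ...this.context, ...extra });
  }

  record(data: LogData): LogData {
    return { ...this.context, ...data };
  }
}
```

A child created per span would then stamp its context onto every record emitted within that span, much like child loggers do now.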
Well this was quite easy. All we needed to do is to add:

```ts
const jsonFormatter: LogFormatter = (record: LogRecord) => {
  return JSON.stringify({
    key: record.key,
    date: record.date,
    msg: record.msg,
    level: record.level,
  });
};
```
Then when constructing the logger:

```ts
const logger = new Logger('root', LogLevel.NOTSET, [
  new ConsoleOutHandler(formatting.jsonFormatter),
]);
logger.debug('DEBUG MESSAGE');
logger.info('INFO MESSAGE');
logger.warn('WARN MESSAGE');
logger.error('ERROR MESSAGE');
```
And that's it, you get JSON records being output now.
The main issue now is schema standardisation. There's no real single standard for logging; common ones include Graylog's GELF and the ECS fields above. Libraries like structlog don't really specify any standards for the payload. OpenTelemetry does have a standard for their properties.
There are some "base"/core properties that these libraries tend to all use:

- `level` - we use numbers: 1 DEBUG, 2 INFO, ... etc.
- `msg` or `message` - `msg` seems more popular, although I'm not sure if it is actually "necessary"
- `time` or `timestamp` - `time` seems more popular - always uses ISO8601 format

We have these pieces of information as well in the `LogRecord`.
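For illustration only (field names as discussed above, not a fixed schema), a record using these core properties might serialise like:

```ts
// Illustrative record using the common core fields.
const record = {
  level: 2, // INFO
  msg: 'server started',
  time: new Date(0).toISOString(), // ISO8601
};
const line = JSON.stringify(record);
```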
To set a base-schema, we can use follow ECS's core properties:
https://www.elastic.co/guide/en/ecs/current/ecs-guidelines.html
https://www.elastic.co/guide/en/ecs/current/ecs-base.html
https://github.com/elastic/ecs/blob/main/generated/csv/fields.csv
These are the base fields for ECS:

- `@timestamp`
- `message`
- `labels`
- `tags`

The log level is placed in a subobject: https://www.elastic.co/guide/en/ecs/current/ecs-log.html, and it is intended to be a keyword, not a number.
Hmm, to make this elegant, we need to make some changes. `makeRecord` should create a lazy dictionary, as in properties are functions so they can instead be called on-demand when they are required. This enables a formatter with special symbols applied to call the relevant function to acquire the value. So the formatter dictates the control flow, and this is set on the handlers.
I noticed that `callHandlers` propagates to the parent's `callHandlers`, but the level check isn't done here; it's done in the `Logger.log` method, and this means parent log levels don't filter out any child log records. Notice that `makeRecord` is called by `log`, which means we should not recreate a log record upon going to the parent. Instead the same record should be kept.
Let's suppose the level check occurs in `callHandlers`, while `log` precreates the `LogRecord`, and the `LogRecord` is just a bunch of functions. Then it should work well. The `format` function which takes symbols can then call functions against the `LogRecord`. Things like...
It appears open telemetry considers logging out to stderr as the telemetry/trace exporter to be used only for "debugging". Production exporters all use TCP, GRPC or HTTP 1.1 to send traces to a centralised trace collector like zipkin or whatever.
However this seems to be against how traditional orchestration is supposed to work, where this data goes to STDERR and is then collected by the orchestrator to be sent to centralised loggers. Why did the OpenTelemetry people decide to create their own exporters? Was it to get around STDERR, or because STDERR logging is considered orthogonal?
It's all done via HTTP or gRPC, as if networked sends are better than stderr. Maybe interleaving with stderr isn't right. However for decentralised systems, does it make sense to assume that there is a collector somewhere else? Obviously we would say that the CLI/GUI is a viable collector in order to debug the system, whereas our hosted testnet may have to be pointed somewhere else.
It does seem that stderr/stdout is being deprecated for more "network-centric" observability. I guess it provides more flexibility.
Otel is way too complex. Here's apparently the "basic" example of using Otel and exporting directly to the console: https://github.com/open-telemetry/opentelemetry-js/blob/main/examples/basic-tracer-node/index.js.
Why does it require so many different packages?

```js
const opentelemetry = require('@opentelemetry/api');
const { Resource } = require('@opentelemetry/resources');
const { SemanticResourceAttributes } = require('@opentelemetry/semantic-conventions');
const { BasicTracerProvider, ConsoleSpanExporter, SimpleSpanProcessor } = require('@opentelemetry/sdk-trace-base');
const { JaegerExporter } = require('@opentelemetry/exporter-jaeger');
```
It seems over-engineered. The spec however might be useful.
> For most language specific instrumentation libraries you have exporters for popular backends and OTLP. You might wonder, under what circumstances does one use a collector to send data, as opposed to having each service send directly to the backend?
>
> For trying out and getting started with OpenTelemetry, sending your data directly to a backend is a great way to get value quickly. Also, in a development or small-scale environment you can get decent results without a collector.
>
> However, in general we recommend using a collector alongside your service, since it allows your service to offload data quickly and the collector can take care of additional handling like retries, batching, encryption or even sensitive data filtering.
So in-process sending is only for adoption as I suspected.
However the community has obviously selected against relying on STDERR, because they view traces not as human readable information, but as information intended for machines to parse, and thus did not bother writing to STDERR, except as an afterthought. This makes sense, because for them once the span-data is written to stderr what happens to it afterwards? It's sort of useless as textual information (not really but one can understand). So there's nothing built to "forward" the stderr logs to a trace receiver or OTLP collector. So even with the OTLP collector it's all designed to send logs over a network connection rather than relying on simple STDERR.
I think what I'll do is examine the spec of opentracing with respect to the log data itself, and then just add it to our existing structured logging, which continues to be sent to STDERR. For future sending to other places, that could be done via custom handlers.
For viewing and debugging the traces, if we maintain compatibility with the spec, it should be possible to send the stderr logs directly into a trace collector via HTTP 1.1: https://opentelemetry.io/docs/concepts/sdk-configuration/otlp-exporter-configuration/. Also see https://github.com/opentracing-contrib/java-span-reporter
An example of manually sending a trace:

Get a Zipkin payload to test. For example create a file called trace.json that contains:

```json
[
  {
    "traceId": "5982fe77008310cc80f1da5e10147519",
    "parentId": "90394f6bcffb5d13",
    "id": "67fae42571535f60",
    "kind": "SERVER",
    "name": "/m/n/2.6.1",
    "timestamp": 1516781775726000,
    "duration": 26000,
    "localEndpoint": {
      "serviceName": "api"
    },
    "remoteEndpoint": {
      "serviceName": "apip"
    },
    "tags": {
      "data.http_response_code": "201"
    }
  }
]
```

With the Collector running, send this payload to the Collector. For example:

```sh
curl -X POST localhost:9411/api/v2/spans -H'Content-Type: application/json' -d @trace.json
```
Opentelemetry details are moved into new issue #15.
Because the telemetry schema doesn't actually require a mandatory message, it makes sense to change our logging methods to take a string or an object. If a string, that's assumed to be a regular `msg`; if an object, a message is not necessary and can be empty. It's up to users to provide the necessary message with `msg` and an appropriate logger formatter for their application usecase.
I imagine tracing would be built on top of js-logger, so this library is kept simple. Tracing would be PK-specific.
This is now done. Going to update the README.md with some example usages.
Description
This introduces structured JSON logging, as an intermediate step towards tracing and machine-parseable data that can be used to construct contexts and traces.
We already have a `LogFormatter` type. Anything that produces a string given a `LogRecord` is sufficient.

Usually we construct formatters with the `formatting.format` function, which takes template strings and produces a string. We can provide something like:

That produces a formatter function. The `json` symbol simply means the entire data structure of the `LogRecord` is captured.

The other issue is the schema of the keys of the JSON record; we should align it with some standard, possibly close to the opentracing standard or the structlog standard.
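A minimal sketch of how such a `json` symbol might plug into a template-driven `format` (the `format` helper and symbol here are illustrative, not the library's actual API):

```ts
const json = Symbol('json');

type LogRecord = { key: string; msg: string; level: number };
type LogFormatter = (record: LogRecord) => string;

// Illustrative: build a formatter from template parts; the `json`
// symbol captures the entire record as a JSON string.
function format(...parts: Array<string | symbol>): LogFormatter {
  return (record) =>
    parts
      .map((part) => (part === json ? JSON.stringify(record) : String(part)))
      .join('');
}
```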
Issues Fixed
Tasks
- [x] `jsonFormatter` provided as a default JSON `LogFormatter`
- [x] `trace` renamed to `stack` to avoid confusion with tracing in #15
- [x] `stack` and `keys` changed to be lazily created if the renderer calls them
- [x] `LogData` introduced for arbitrary data to fit into the log
- [x] `LogData` enables lazy values by wrapping values in a lambda
- [x] `jest.useRealTimers();` used to close off the timer mocking in tests
- [x] `msg` is no longer mandatory; if it is not passed, the `msg` will be empty

Final checklist