openfaas / of-watchdog

Reverse proxy for STDIO and HTTP microservices
MIT License

Watchdog should be passthrough for function process for logging #75

Closed cconger closed 3 years ago

cconger commented 4 years ago

Expected Behavior

In cases where the invocation is not using the stdout/stderr pipes for communication with the watchdog, the watchdog should not modify or impose structure on the logs passing through from these event streams. This allows me as an application author to control how my logs are presented so they can be processed downstream, for example with structured logging such as JSON or CSV. Ideally the watchdog should not mangle the output of my application beyond making it available to the platform log driver.

There are some logs that need to be emitted from the watchdog itself:

• Errors parsing config
• Forked process termination
• Issues reading from the pipe files
• Errors communicating with metrics services

These of-watchdog logs should be easy for log parsers/shippers to optionally skip or select, so that I can have an unobscured trail of the events emitted by my application.

Current Behavior

There are several undesirable behaviors that exist in the current implementation that highlight the mangling that is currently being done by the Watchdog.

Lines longer than 256 bytes are split

There is a hard limit on the number of bytes that can appear in a single line from the wrapped function process. For instance, a 300-byte message emitted from the underlying function is split into two lines: one of 256 bytes and another with the remaining 44 bytes. Even worse, the split parts can be interleaved with output from other goroutines (such as the watchdog itself).

Example:

2019/08/09 00:01:42 stderr: 2019/08/09 00:01:42 {"abexi":"yxgdndacmbhgwhadjnba","aclra":"dbgmdiytcwloabveajbe","aczai":"nlqlibaatlyhdaopovfo","aereu":"nuzjqxmzotarlutmygms","afgie":"hlgizmhgzptxtfrgkaqq","ahugb":"qxifgcyvgcazaefizhgw","ajxwk":"afikzruuywsuwkobbuor","akgza":"altlhtuzh
2019/08/09 00:01:42 GET / - 200 OK - ContentLength: 2
2019/08/09 00:01:42 stderr: oidz","zxbqo":"tukdfklasvyafstrdpoj","zynxg":"bfrwcxacobgabksdrjdi","zzawj":"lvncjvkyrwrlixxqcahq","zzhms":"ouyykvnikbudiryewvos","zzttv":"wvtuerakfsxlplgaftry"}

Stderr and Stdout are redirected to Stderr for HttpRunner

Both streams are piped into the same output file, Stderr, with a prefix. This is somewhat surprising: for the HttpRunner both files are written to, yet Stdout does not contain any output from my wrapped application. The prefix itself isn't that bad, since it lets me easily determine whether a log line came from the wrapped process or from of-watchdog itself. However, combined with the 256-byte hard limit splitting lines up, structured log lines become extremely difficult to parse consistently.

Example:

stdout: My function's original log message to stdout
stderr: My function's original log message to stderr

Null bytes and extra new lines in output.

The current implementation calls Println and passes it the full 256-byte buffer. This writes out all of the bytes, producing double newlines (one copied from the original buffer plus another appended to the end of the byte buffer), and if you're using a viewer that doesn't strip null bytes (such as a tool that writes to a file), those null bytes clutter up the output.

Example log file:

2019/08/26 23:11:21 stderr: Starting application server
������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������
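
For illustration, here is a minimal sketch of the kind of fixed-buffer read loop that produces both buffer-related symptoms above (the split lines and the null-byte padding). This is an approximation for discussion, not of-watchdog's actual code:

package main

import (
    "io"
    "log"
    "strings"
)

// naiveBindPipe mirrors a child-process pipe in the way described above: read
// into a fixed 256-byte buffer and hand the whole buffer to log.Println.
func naiveBindPipe(name string, pipe io.Reader) {
    buf := make([]byte, 256)
    for {
        if _, err := pipe.Read(buf); err != nil {
            return
        }
        // Two problems: a 300-byte line is split across two reads, and the
        // full 256-byte buffer is printed even when the read was shorter,
        // so stale/null bytes plus an extra newline end up in the output.
        log.Println(name + ": " + string(buf))
    }
}

func main() {
    r, w := io.Pipe()
    go func() {
        defer w.Close()
        w.Write([]byte("Starting application server\n"))  // short write: padded with null bytes
        w.Write([]byte(strings.Repeat("x", 300) + "\n"))  // long write: split across two lines
    }()
    naiveBindPipe("stderr", r)
}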

Possible Solution

I originally proposed a solution involving a line-based tokenizer from bufio as a way to align the mirrored log output with that of the wrapped function. I believe this addresses the problems best: log lines from the underlying application are preserved as originally written and re-emitted with the watchdog's common prefixing.

Another possible solution is to make the read buffer size a tunable parameter of of-watchdog so it can be set larger than the expected output. Applications doing verbose structured logging could then provide a much larger buffer, while a terse application could stay closer to the current default.
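
A rough sketch of how the two ideas could be combined: a bufio.Scanner that re-emits whole lines, with its maximum line size taken from an environment variable. The variable name and function here are hypothetical, for illustration only:

package main

import (
    "bufio"
    "io"
    "log"
    "os"
    "strconv"
)

// bindLinePipe re-emits each complete line from the wrapped process with a
// prefix, so downstream parsers always see whole lines. Illustrative sketch;
// the env variable name is made up and not an of-watchdog setting.
func bindLinePipe(name string, pipe io.Reader) {
    maxLine := 64 * 1024 // bufio's default MaxScanTokenSize
    if v, err := strconv.Atoi(os.Getenv("max_log_line_size")); err == nil && v > 0 {
        maxLine = v
    }

    scanner := bufio.NewScanner(pipe)
    scanner.Buffer(make([]byte, 4096), maxLine)
    for scanner.Scan() {
        log.Printf("%s: %s", name, scanner.Text())
    }
    if err := scanner.Err(); err != nil {
        log.Printf("error reading %s: %s", name, err)
    }
}

func main() {
    // Example: feed a multi-line stream (here just stdin) through it.
    bindLinePipe("stdout", os.Stdin)
}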

Steps to Reproduce (for bugs)

I tested this using a contrived http-mode Go function named chatty-fn that just logs the requests it receives plus a bunch of extra information. I built the chatty-fn Docker container with different watchdogs wrapping it, then booted the container directly and issued network requests to it.

# Within chatty-fn
docker build -t chatty-fn:latest .
docker run -p 8080:8080 chatty-fn:latest > stdout.log 2> stderr.log
curl -X POST localhost:8080 -H 'Content-Type: application/json' --data '{"hello": "world"}'

Context

I have been attempting to add structured logging to an application running in a container, using a tool like filebeat to parse, ship, and extract those structured logs into a searchable system. The small buffer size currently implemented makes it difficult to handle JSON log lines that extend beyond 256 bytes.

Your Environment

stefanprodan commented 4 years ago

I would make the buffer size configurable via an environment variable that can be set in the function definition.

alexellis commented 4 years ago

+1 @stefanprodan

I've found the PR I mentioned on #72: #37 attempted to add configurable buffer sizes.

I think that this is the mechanics of what we'd want:

https://github.com/openfaas-incubator/of-watchdog/pull/37/files#diff-194c7aaf3d69dbfd1dc1fb6c6837cd18L72

What is confusing about the old PR is that it looks like I wrote it, but it was actually proposed by @canerbasat.

In any case, I think I would be open to a single buffer value or potentially two different buffer size values for stdin/stdout.

Alex

stefanprodan commented 4 years ago

I propose a single env variable for both stdout/err named log_buffer_size.

LucasRoesler commented 4 years ago

The proposal mentions a few things:

1) buffering (or not) the message and the buffer size
2) the merging of the function stderr and stdout by watchdog
3) watchdog-specific logs are interwoven with function logs

for topic (1) I am also in favor of a new env variable log_buffer_size. Increasing the default buffer might also make sense. If a true streaming solution existed, I would prefer that. Also, it is worth noting that io.Copy allocates a 32kb buffer https://golang.org/src/io/io.go?s=12796:12856#L391 so using io.Copy is also a buffered solution.

for (2), I did some digging into the k8s and Swarm docs. Both collect stderr and stdout automatically, so I think it would be fine for watchdog to send the stderr to stderr and stdout to stdout without any prefix.

for (3), in light of (2), perhaps we can partially address this by having watchdog send all of its own logs to stderr only
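
To make (2) and (3) concrete, here is a minimal sketch of that arrangement, assuming the watchdog writes its own logs via the standard log package (which defaults to stderr) and simply copies the function's streams through. This is an illustration, not the actual of-watchdog code:

package main

import (
    "io"
    "log"
    "os"
    "os/exec"
    "sync"
)

func main() {
    // The watchdog keeps its own logs on stderr (the log package's default),
    // so a log shipper can select or skip them independently of function output.
    log.Println("Forking - node [app.js]")

    cmd := exec.Command("node", "app.js")
    stdout, err := cmd.StdoutPipe()
    if err != nil {
        log.Fatal(err)
    }
    stderr, err := cmd.StderrPipe()
    if err != nil {
        log.Fatal(err)
    }
    if err := cmd.Start(); err != nil {
        log.Fatal(err)
    }

    // Pure pass-through: function stdout -> stdout and stderr -> stderr, with
    // no prefix and no re-buffering beyond io.Copy's internal 32 KB buffer.
    var wg sync.WaitGroup
    wg.Add(2)
    go func() { defer wg.Done(); io.Copy(os.Stdout, stdout) }()
    go func() { defer wg.Done(); io.Copy(os.Stderr, stderr) }()
    wg.Wait()

    if err := cmd.Wait(); err != nil {
        log.Printf("function exited: %v", err)
    }
}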

cconger commented 4 years ago

I'm for adding the log_buffer_size param. I would also like stderr and stdout to be passed through to their respective outputs sans prefix. I think these two changes largely make it possible for the end user to control their logging experience.

I want to experiment a little bit, but I believe that on POSIX the reads will usually be aligned with pipe fd flushes, so a sufficiently large buffer gives the writer control over how the output is chunked. I therefore think increasing the default buffer size to something like what the stdlib uses, 32 KB (io.Copy) to 64 KB (the Scanner), makes sense in addition to user-level control.

cconger commented 4 years ago

I've submitted a new PR that adds this parameter; however, while playing with it more I remain a relatively strong proponent of the Scanner approach in #72. It offers a standard-library solution with a dynamically sizable buffer and system-agnostic handling of newlines. I would, however, want to update it to write to the appropriate output file (Stdout => Stdout / Stderr => Stderr).

The benefit of using the scanner is that a line generated by the wrapped process is guaranteed to be written as a line in the resulting log output, provided the line is less than 32Kb. You can potentially still get watchdog log lines interspersed with lines from the function process, however they will only ever appear between true lines as generated by my application.

The dichotomy seems to be that you can either be a true byte passthrough (what I've tried to capture in #77) or aggregate lines, which seems more appropriate if the watchdog will emit its own logging during operation. The latter guarantees the contents of any given line up to the Go buffer cap (32Kb) and gives log parsers a way to filter out content from the watchdog.

It is also more reasonable for me as a consumer to estimate how large a log line might be than to guess what a healthy buffer size would be in the face of a GC pause or latency while the watchdog reads the file descriptor. So perhaps a Scanner built on top of a tunable buffer size from the param makes more sense.

cconger commented 4 years ago

Hey there, I'd appreciate some insight into whether the initial assertion of this issue matters to the project. Should the wrapped application function be responsible for the shape and format of its logs? It seems like an obvious win for portability and flexibility to me, but if that view isn't shared I understand why I seem to be pushing in a different direction than you all.

alexellis commented 4 years ago

Hi @cconger

I am concerned about changing the behaviour in this way (scanning for \n), but I need more time to dedicate to understanding your problem and proposal. I have been having to prioritise other work and features.

An alternative to using the of-watchdog is to use a Dockerfile; you can still use the templating system if you want.

What are your thoughts about this blog post? If you're using the http mode, then it will be relatively straight-forward for you to have complete control over STDIO.

https://www.openfaas.com/blog/stateless-microservices/

Alex

alexellis commented 4 years ago

The last comment from Lucas/Stefan/myself was about adding log_buffer_size.

Would you like to send a PR for this? It will be easy to merge, or do you feel that you need the line-scanner for structured logging to work?

alexellis commented 4 years ago

for (2), I did some digging into the k8s and Swarm docs. Both collect stderr and stdout automatically, so I think it would be fine for watchdog to send the stderr to stderr and stdout to stdout without any prefix.

@LucasRoesler can you explain the context, or what you mean by this? What has log collection got to do with the use of log.Printf(), which prints a prefix? Pass-through of stdio sounds like it's essential for structured logging.

cconger commented 4 years ago

I have a PR, #77, that adds log_buffer_size. It did, however, raise interesting concerns about preserving lines from the wrapped process and keeping a consistent prefix on log lines, which @LucasRoesler noted in his comment when I removed log.Println. I do not see how you can have both a guaranteed prefix on each new line (ideal) and a 1:1 mapping between lines in of-watchdog's output and the source application's (also ideal) unless you do buffering and tokenizing.

alexellis commented 4 years ago

Is there a way that it makes sense to preserve the existing behaviour, and add your tokenised scanner, so that the two code paths work separately?

cconger commented 4 years ago

Certainly, it can be flagged behavior. However, I'm unsure why anyone would prefer the existing behavior. I say that not out of judgement; I just do not see the use case and would like to account for it.

alexellis commented 4 years ago

I guess for backwards compatibility, because this could be a breaking change. Some functions may not write a new-line, in which case, how would the scanner work?

alexellis commented 4 years ago

This approach looks fine to me https://github.com/openfaas-incubator/of-watchdog/pull/73

If we wanted to have some level of backwards compatibility then perhaps you could add something like the following?

bindLoggingPipe - without scanner
bindScanningLoggingPipe - with \n scanner

Given this rather bold statement, I am starting to wonder if I am not understanding something obvious?

I'm unsure why anyone would prefer the existing (code) behavior.

cconger commented 4 years ago

In the case where no newlines were sent the scanner would flush the entire buffer at process conclusion or once the buffer's limit had hit 64Kb.

While I appreciate that it's the existing behavior, it's odd to me that someone would write exactly 256 bytes to an output and count on that being mapped to line-based textual output. If it's some streaming protocol that isn't textual/line-delimited, it is already being mangled by the transformation into line-delimited 256-byte blocks, and is therefore difficult for anything consuming it to use without re-extracting it, which is what was happening in my structured logging example.

cconger commented 4 years ago

I'm happy to decouple the 2 proposals that are inside of this issue if it helps ease concerns about change.

  1. Remapping the output file descriptors (using both stderr and out)
  2. Changing output slurping/tokenization from the wrapped file descriptors.

I think 2 is the most important, largely because 1 gets undone by the Kubernetes log driver.

alexellis commented 4 years ago

#73 is merged now. I like the solution and the design. It was very refreshing to see the testing and output too; that makes it a lot easier to merge.

I've asked on the PR if you could do a smoke test with two of the more popular templates. Perhaps if that looks good, you could submit a PR to bump the version in their respective Dockerfiles?

alexellis commented 4 years ago

Could you summarise what remaining changes you feel need to be made?

cconger commented 4 years ago

I feel that the issues I was experiencing have been resolved. I feel comfortable writing to the log in an application-controlled way and easily parsing the resulting output with existing tools. I believe we've addressed all three of the original issues raised in the body. What I am working on:

• I'm following up on your comment on #73 to be able to turn off prefixes (and log flags) via an argument.
• I will follow up on bumping the of-watchdog version in the Dockerfiles of the tested templates.

Happy to proceed how you would like, either tie those changes back to this issue, or open new issues to cover those individually and close this one.

saginadir commented 4 years ago

@cconger I stumbled upon this issue, and I hadn't even realized that the log is being split, so thanks for raising that.

Following your conversation, I have stumbled on alex's comment, and this is exactly the issue we are facing at the moment.

Our functions are deployed on a k8s cluster which has its logs scraped and sent upstream to log parsers to be indexed and made queryable. For that to work, we must have the logs in their original JSON structure without the prefix.

Where does this issue stand at the moment, and can I assist?

alexellis commented 4 years ago

@saginadir please can you open a new issue for your use-case? @cconger is currently filtering out the prefix with his log aggregator tool.

Would also be helpful if you could join Slack and maybe tell us a bit about your use-case to help prioritise.

Alex

alexellis commented 4 years ago

/msg: slack

derek[bot] commented 4 years ago

-- Join Slack to connect with the community https://docs.openfaas.com/community

saginadir commented 4 years ago

@alexellis Will do.

Also, I requested an invitation to the Slack channel.

alexellis commented 4 years ago

How are we doing with this now @cconger @saginadir @sargun ?

saginadir commented 4 years ago

@alexellis For my solution, at the moment I forked the watchdog and made the specific changes I needed

alexellis commented 4 years ago

I think that the Netflix guys just used a prefix rule for their logging so that it worked with their ingestion pipeline. Maybe that'd work for others too?

Do you have your fork public for us to compare?

saginadir commented 4 years ago

Yes, that's my fork https://github.com/openfaas-incubator/of-watchdog/compare/master...saginadir:master

alexellis commented 4 years ago

Thanks.

On this line, how do you ensure that the code doesn't write plain-text? https://github.com/openfaas-incubator/of-watchdog/compare/master...saginadir:master#diff-42b34d1af16e4b2190c38ad0ec0e0f54R18

saginadir commented 4 years ago

The logger that logs the messages from the function simply pipes the message from the function, and I make sure the function logs JSON.

Can be seen by replacing this:

logger := log.New(output, log.Prefix(), log.Flags())
...
logger.Printf("%s: %s", name, scanner.Text())

With this:

logger := log.New(output, log.Prefix(), 0) // removing flags to remove timestamps
...
logger.Printf("%s", scanner.Text())

And I print the status strings in JSON instead of plain log, which is done by replacing this line:

log.Printf("%s %s - %s - ContentLength: %d", r.Method, r.RequestURI, res.Status, res.ContentLength)     

with this:

log.Printf(`{"method":"%s","request_uri":"%s","status_code":%d,"content_length":%d}`, r.Method, r.RequestURI, res.StatusCode, res.ContentLength)

I avoid using JSON encoding for a small performance gain (which is really minor at our current scale, but can matter at larger scale).

cconger commented 4 years ago

In our usage we were planning on just stripping the prepended flags, using a regular expression essentially targeting the colon separator.

I could see this as a configurable option. In the simplest case it could just be a passthrough to the log.Flags call, and the user could set LOG_FLAGS=19 as a config environment variable. The downside is that users would have to work out the bit-mask constants https://golang.org/pkg/log/#pkg-constants. Another alternative would be to simply enable/disable the default.
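
As a sketch of that pass-through idea (hypothetical; LOG_FLAGS is not an existing of-watchdog setting), the value would simply be parsed and handed to the stdlib logger:

package main

import (
    "log"
    "os"
    "strconv"
)

// logFlagsFromEnv passes a LOG_FLAGS value straight through to the stdlib
// bit mask, keeping the current default when the variable is unset or invalid.
func logFlagsFromEnv() int {
    if v, err := strconv.Atoi(os.Getenv("LOG_FLAGS")); err == nil {
        return v // e.g. LOG_FLAGS=19 is log.Ldate|log.Ltime|log.Lshortfile
    }
    return log.Flags() // default: log.LstdFlags (date and time)
}

func main() {
    log.SetFlags(logFlagsFromEnv())
    // LOG_FLAGS=0 removes the date/time prefix entirely, which is the simpler
    // enable/disable alternative mentioned above.
    log.Println("stdout: function output")
}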

LucasRoesler commented 4 years ago

I think that enabling/disabling the default is probably the simplest and most common case. Any other customization of the log output could be implemented in the function by the template/handler.

Voziv commented 3 years ago

Having the ability to pass through logs from the function and letting the template/handler decide the log format would be super helpful

alexellis commented 3 years ago

@Voziv please can you improve the quality of your +1 here by stating what problem you're facing and which of the three outlined solutions would help you?

alexellis commented 3 years ago

1) Original of-watchdog as it is today:

cp /home/alex/go/src/github.com/openfaas/of-watchdog/of-watchdog . && mode=http upstream_url=http://127.0.0.1:3000 fprocess="node app.js" ./of-watchdog

2021/02/08 12:49:44 Forking - node [app.js]
2021/02/08 12:49:44 Started logging stderr from function.
2021/02/08 12:49:44 Started logging stdout from function.
2021/02/08 12:49:44 OperationalMode: http
2021/02/08 12:49:44 Timeouts: read: 10s, write: 10s hard: 10s.
2021/02/08 12:49:44 Listening on port: 8080
2021/02/08 12:49:44 Writing lock-file to: /tmp/.lock
2021/02/08 12:49:44 Metrics listening on port: 8081
2021/02/08 12:49:44 stdout: alexellis.io started on port: http://0.0.0.0:3000
2021/02/08 12:49:44 stderr: Disabled etag and x-powered-by
2021/02/08 12:49:47 GET / - 200 OK - ContentLength: 19686
2021/02/08 12:49:47 GET / - 200 OK - ContentLength: 19686
2021/02/08 12:49:48 GET / - 200 OK - ContentLength: 19686

2) Date prefix removed from logger

2021/02/08 12:17:39 Forking - node [app.js]
2021/02/08 12:17:39 Attached to: stderr.
2021/02/08 12:17:39 Attached to: stdout.
2021/02/08 12:17:39 OperationalMode: http
2021/02/08 12:17:39 Timeouts: read: 10s, write: 10s hard: 10s.
2021/02/08 12:17:39 Listening on port: 8080
2021/02/08 12:17:39 Writing lock-file to: /tmp/.lock
2021/02/08 12:17:39 Metrics listening on port: 8081
stdout: alexellis.io started on port: http://0.0.0.0:3000
stderr: Disabled etag and x-powered-by

3) No prefix at all - pass-through for all lines

http://127.0.0.1:3000 fprocess="node app.js" ./of-watchdog
2021/02/08 12:19:22 Forking - node [app.js]
2021/02/08 12:19:22 Attached to: stderr.
2021/02/08 12:19:22 Attached to: stdout.
2021/02/08 12:19:22 OperationalMode: http
2021/02/08 12:19:22 Timeouts: read: 10s, write: 10s hard: 10s.
2021/02/08 12:19:22 Listening on port: 8080
2021/02/08 12:19:22 Writing lock-file to: /tmp/.lock
2021/02/08 12:19:22 Metrics listening on port: 8081

alexellis.io started on port: http://0.0.0.0:3000
Disabled etag and x-powered-by

2021/02/08 12:19:27 GET / - 200 OK - ContentLength: 19686
2021/02/08 12:19:27 GET / - 200 OK - ContentLength: 19686

Note that the watchdog still outputs its own native logs which will have a prefix.

There are mixed requests here that we need to understand more clearly.

4) Do users want the watchdog to print all output in structured JSON, or just get a pass-through?

Note that if we use JSON format, your message will be double encoded as JSON.

{"level":"info","ts":1612788414.757614,"caller":"of-watchdog/main.go:364","msg":"Forking","command":"node","args":["app.js"]}
{"level":"info","ts":1612788414.757807,"caller":"executor/logging.go:13","msg":"Attached to","name":"stderr"}
{"level":"info","ts":1612788414.757837,"caller":"executor/logging.go:13","msg":"Attached to","name":"stdout"}
{"level":"info","ts":1612788414.7582288,"caller":"of-watchdog/main.go:74","msg":"OperationalMode\n","OperationalMode":"http"}
{"level":"info","ts":1612788414.758498,"caller":"of-watchdog/main.go:95","msg":"Timeouts","read":10,"write":10,"exec":10}
{"level":"info","ts":1612788414.75853,"caller":"of-watchdog/main.go:103","msg":"Listening on\n","port":8080}
{"level":"info","ts":1612788414.8450463,"caller":"executor/logging.go:25","msg":"Function output","name":"stdout","msg":"alexellis.io started on port: http://0.0.0.0:3000"}
{"level":"info","ts":1612788414.8452017,"caller":"executor/logging.go:25","msg":"Function output","name":"stderr","msg":"Disabled etag and x-powered-by"}
{"level":"info","ts":1612788417.779004,"caller":"executor/http_runner.go:173","msg":"Invoked","method":"GET","requestURI":"/","status":"200 OK","ContentLength":19686}
{"level":"info","ts":1612788418.3271103,"caller":"executor/http_runner.go:173","msg":"Invoked","method":"GET","requestURI":"/","status":"200 OK","ContentLength":19686}
{"level":"info","ts":1612788418.713564,"caller":"executor/http_runner.go:173","msg":"Invoked","method":"GET","requestURI":"/","status":"200 OK","ContentLength":19686}

Moving completely to JSON feels like a bad user experience for local testing and development of functions, and for the many hobbyist users who don't seem to care about log aggregation beyond the OpenFaaS API.

5) This is what Zap looks like with its "console" / non-JSON mode, pretty hard to read IMHO compared to the original.

1.612788359664378e+09   info    Forking {"command": "node", "args": ["app.js"]}
1.6127883596645575e+09  info    Attached to {"name": "stderr"}
1.6127883596645744e+09  info    Attached to {"name": "stdout"}
1.6127883596648636e+09  info    OperationalMode
    {"OperationalMode": "http"}
1.6127883596650236e+09  info    Timeouts    {"read": 10, "write": 10, "exec": 10}
1.6127883596650379e+09  info    Listening on
    {"port": 8080}
1.6127883597464366e+09  info    Function output {"name": "stdout", "msg": "alexellis.io started on port: http://0.0.0.0:3000"}
1.6127883597468245e+09  info    Function output {"name": "stderr", "msg": "Disabled etag and x-powered-by"}
1.6127883614455812e+09  info    Invoked {"method": "GET", "requestURI": "/", "status": "200 OK", "ContentLength": 19686}

@LucasRoesler and @Waterdrips use Logrus and it appears to have a more friendly console formatter built-in.

6) Finally, as Lucas pointed out, the OpenFaaS API already has a log API which returns text in JSON:

faas-cli logs nodeinfo --since 8h --format json
WARNING! Communication is not secure, please consider using HTTPS. Letsencrypt.org offers free SSL/TLS certificates.
{"name":"nodeinfo","namespace":"openfaas-fn","instance":"30337","timestamp":"2021-02-08T14:08:32.127985Z","text":"2021/02/08 14:08:32 GET / - 200 OK - ContentLength: 88"}
{"name":"nodeinfo","namespace":"openfaas-fn","instance":"30337","timestamp":"2021-02-08T14:08:33.064871Z","text":"2021/02/08 14:08:33 GET / - 200 OK - ContentLength: 88"}

And another example from Go:

{"name":"trove","namespace":"openfaas-fn","instance":"9174","timestamp":"2021-02-08T07:38:59.9926Z","text":"2021/02/08 07:38:59 stderr: 2021/02/08 07:38:59 Path: /login/"}
{"name":"trove","namespace":"openfaas-fn","instance":"9174","timestamp":"2021-02-08T07:38:59.999511Z","text":"2021/02/08 07:38:59 GET /login/?r= - 200 OK - ContentLength: -1"}
{"name":"trove","namespace":"openfaas-fn","instance":"9174","timestamp":"2021-02-08T07:44:10.144041Z","text":"2021/02/08 07:44:10 stderr: 2021/02/08 07:44:10 Path: /login/"}
{"name":"trove","namespace":"openfaas-fn","instance":"9174","timestamp":"2021-02-08T07:44:10.150358Z","text":"2021/02/08 07:44:10 GET /login/?r=// - 200 OK - ContentLength: -1"}
{"name":"trove","namespace":"openfaas-fn","instance":"9174","timestamp":"2021-02-08T07:51:35.67197Z","text":"2021/02/08 07:51:35 stderr: 2021/02/08 07:51:35 Path: /static/card.jpg"}
{"name":"trove","namespace":"openfaas-fn","instance":"9174","timestamp":"2021-02-08T07:51:35.889935Z","text":"2021/02/08 07:51:35 GET /static/card.jpg - 200 OK - ContentLength: 76695"}
{"name":"trove","namespace":"openfaas-fn","instance":"9174","timestamp":"2021-02-08T08:19:06.902559Z","text":"2021/02/08 08:19:06 stderr: 2021/02/08 08:19:06 Path: /static/card.jpg"}
{"name":"trove","namespace":"openfaas-fn","instance":"9174","timestamp":"2021-02-08T08:19:07.120483Z","text":"2021/02/08 08:19:07 GET /static/card.jpg - 200 OK - ContentLength: 76695"}

Voziv commented 3 years ago

If we move to a structured logger to output JSON and then some users want to see plaintext for debugging (frankly, I find the JSON hard to read).

Totally agree that JSON destroys readability if you're looking at the logs directly. In our case we only want JSON where logs are never looked at directly but only via an aggregator (newrelic-logging for us).

@Voziv please can you improve the quality of your +1 here by stating what problem you're facing and which of the three outlined solutions would help you?

Absolutely. The main issue I came looking for is the ability to just have template/function logs passed through.

Current State:

2021/02/08 12:30:00 POST / - 200 OK - ContentLength: 16
2021/02/08 13:00:00 stdout: {"message":{} ,"level":"info"}
2021/02/08 13:00:00 stdout: {"message":"Requested function: <snip>", "level":"info"}

So what I can see is that we have three different pieces logging:

  1. 2021/02/08 12:19:22 Forking - node [app.js] - I think this is the of-watchdog image
  2. POST / - 200 OK - ContentLength: 16 - I think this is the node template/express output?
  3. {"message":{} ,"level":"info"} - The function ouput. Could be considered the same as the template output since the user in control of this.

In my current use case we really only care about the function output being pure JSON when deployed to the cluster. The other lines will get parsed just fine as-is. So we'd prefer the ability to strip the date/time prefix & stdout/stderr portion, leaving just the function's raw output to be logged.

While it would be nice for all 3 types of logs to be JSON to keep things consistent in the cluster, it's less important, as the main benefit of JSON is being able to include extra metadata in the log entries that can be indexed and searched.

alexellis commented 3 years ago

So would this line from the watchdog not break your log parser, if it's expecting a strongly typed JSON output?

2021/02/08 12:30:00 POST / - 200 OK - ContentLength: 16

cc @saginadir if you have time to give your 2c here

Voziv commented 3 years ago

Cluster logs are ingested from the container log files on the node. At least in the case of newrelic-logging they're parsed line by line, so lines that are pure JSON (and that likely need a date field + message field) will be parsed as JSON. Other plain-text logs should be parsed normally, so it should be fine.

I could be wrong about this though.

saginadir commented 3 years ago

@alexellis From my experience I needed a log that would look like this:

{"level":"info","ts":1612788414.757807,"caller":"whatever.go:13", "method": "POST", "path": "/", "statusCode": 200, "ContentLength": 15}

For hobbyists it's a terrible experience to read JSON logs, like you mentioned. But working in a big organization with a rigid DevOps team can sometimes make you jump through hoops in order to abide by their guidelines. For us, we were using Elasticsearch to aggregate the stdout output of all production nodes in the k8s cluster. So I couldn't invoke the faas-cli logs nodeinfo --since 8h --format json command and needed the process itself to print pure JSON.

I think it's quite common today to have the entire log be printed in JSON, example with zap:

logger, _ := zap.Config{
    Encoding:    "json",
    Level:       zap.NewAtomicLevelAt(zapcore.DebugLevel),
    OutputPaths: []string{"stdout"},
    EncoderConfig: zapcore.EncoderConfig{
        MessageKey: "message", // <--
    },
}.Build()

logger.Info("This is an INFO message with fields", zap.String("region", "us-west"), zap.Int("id", 2))

Output: {"message":"This is an INFO message with fields","region":"us-west","id":2}

So at the time I would have greatly appreciated some env variable to pass to the watchdog, like LOG_ENCODING=json, that would have the log formatter print JSON instead of the standard log format. But I ended up just hardcoding a fork of the watchdog to print JSON.
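
A small sketch of what such a toggle could look like with zap (the LOG_ENCODING variable here is hypothetical, not an existing watchdog option):

package main

import (
    "os"

    "go.uber.org/zap"
)

// buildLogger picks the encoder based on a hypothetical LOG_ENCODING variable:
// plain console output by default, structured JSON when LOG_ENCODING=json.
func buildLogger() (*zap.Logger, error) {
    cfg := zap.NewDevelopmentConfig() // console encoding, human readable
    if os.Getenv("LOG_ENCODING") == "json" {
        cfg = zap.NewProductionConfig() // json encoding
    }
    cfg.OutputPaths = []string{"stdout"}
    return cfg.Build()
}

func main() {
    logger, err := buildLogger()
    if err != nil {
        panic(err)
    }
    defer logger.Sync()

    logger.Info("Invoked",
        zap.String("method", "POST"),
        zap.String("path", "/"),
        zap.Int("statusCode", 200),
    )
}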

alexellis commented 3 years ago

Please see https://github.com/openfaas/of-watchdog/commit/51ea0d642955a011bd1f2f0be64319721de2f2e2 which adds a mechanism to disable prefixes to function logs.

This is option 3 (No prefix at all - pass-through for all lines)