zalando / skipper

An HTTP router and reverse proxy for service composition, including use cases like Kubernetes Ingress
https://opensource.zalando.com/skipper/
Other
3.1k stars 349 forks source link

flowId in application logs #423

Open DeTeam opened 7 years ago

DeTeam commented 7 years ago

At the moment flow id is only available in the access logs, would be handy to have it in some application logs (whenever available) as well.

szuecs commented 7 years ago

@DeTeam I think we should pass it as a header X-FlowID or so. What do you think @DeTeam @aryszka @lmineiro ?

DeTeam commented 7 years ago

@szuecs it's already taken/passed further and even used in access logs, since we have two types of logs: access and application I felt sometimes missing it in the latter.

szuecs commented 7 years ago

@DeTeam then I don't see, why this issue belongs to skipper. I would not log this in skipper, only if access logs are enabled then it should be in the access-logs. Maybe you can explain why you want to have flowids in skipper application logs?

DeTeam commented 7 years ago

@szuecs

Let's say I have my custom filter triggered as part of some route handling process. This filter does something and it logs certain info. Would be useful to be able to see that some of filter log records and requests belong together.

szuecs commented 7 years ago

I can only imagine two cases where it makes sense for me to log in a proxy:

  1. access logs
  2. audit logs

Everything else should be logged in backends or exposed as metrics.

Skipper is at first a library to build http routers, so it should not log. Instead it should return errors to be used from the application. Maybe here is something missing to get data from the filter to the application. I guess we could build something like a channel that an application subscribes to and pass data events from filters to this channel. We could build a DiscardChannel in case your application does not want to get notified with any data to be efficient in case we need to. What do you think?

aryszka commented 7 years ago

i think that logging in production environments should be only error logging, when an error happens, we typically log the message in the application log and return a 50x which is also logged in the access log.

When the error is not a normal operation but a bug in skipper or in a filter, it can be very useful to see which requests trigger the error and which route is executed. This is especially true when skipper is routing to many different, heterogenous backends. Using the flow id to couple the application log and access log can be simple way to have this info.

I get how having the flowids in the application logs is not necessarily the "right thing" in principle, but it can be very useful in practice.

Other opinions?

aryszka commented 7 years ago

+1 for this: "I guess we could build something like a channel that an application subscribes to and pass data events from filters to this channel."

grassator commented 6 years ago

Having debugged some problems in skipper filters, I definitely think it will be useful to have flow id per request and I would like to take a stab at the implementation

However I see 2 options of when / how we do this.

Option 1. Pre-generate

The idea is to generate a value for a flow id, right at the edge and stick it in the logger probably in https://github.com/zalando/skipper/blob/master/proxy/proxy.go#L780 (using withFields) and add that logger to the context. Then we can use that logger in the filters instead of a global one.

Then we change flowId filter to look for that pre-generated id and add it as a header if necessary.

The problem with this approach is the configuration of whether we want to reuse incoming header or always generate a new one.

The advantage though is we can correlate more logs (e.g. also from proxy itself) and will always have this correlation id for a request

Option 2. Always look at the header

We can create a custom logger implementation that always checks for X-Flow-Id header before logging. This is less invasive and resolves the issue with the reuse option, but probably is slower and will correlate less logs.


Regardless of the option, having logger on the context will allow for nice things like per-request log level that could really help debug things when necessary while having a less logs in general.

@aryszka What do you think?

aryszka commented 6 years ago

I prefer Option 1. For the incoming header problem, I would default to reusing, and there could be an option to force new flow-ids.

The interface in the context could be provided in a similar way to the custom metrics: https://godoc.org/github.com/zalando/skipper/filters#Metrics