envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0
25.01k stars 4.81k forks source link

The order of `access_log` handler settings may not be the best. #14767

Open wangfakang opened 3 years ago

wangfakang commented 3 years ago

Title: The order of access_log handler settings may not be the best

Description

Now access_log is always runs before each filter log phase of HTTP, which will cause some operations in the log phase of http filter to not take effect in the access_log handler. Forexample, if the metadata xxx :yy is set in the log phase of HTTP filter, then %DYNAMIC_METADATA(xxx :yy)% in access_log format does not work.

Analysis

access_log is registered at the ActiveStream.

ConnectionManagerImpl::ActiveStream::ActiveStream(ConnectionManagerImpl& connection_manager,
                                                  uint32_t buffer_limit) {
  ……
  for (const AccessLog::InstanceSharedPtr& access_log : connection_manager_.config_.accessLogs()) {
    filter_manager_.addAccessLogHandler(access_log); 
  }
  ……
}

each filter log handler of HTTP is registered at the decodeHeaders.

void ConnectionManagerImpl::ActiveStream::decodeHeaders(RequestHeaderMapPtr&& headers,
                                                        bool end_stream) {
……
 const bool upgrade_rejected = filter_manager_.createFilterChain() == false;   // will call filter_manager_.addAccessLogHandler
……
}

The log handler is registered using push_back, So it means FIFO.

void FilterManager::addAccessLogHandler(AccessLog::InstanceSharedPtr handler) {
  access_log_handlers_.push_back(handler);
}

So the access_log handler will be called first during the log phase.

void ConnectionManagerImpl::doDeferredStreamDestroy(ActiveStream& stream) {
  ……
  stream.completeRequest();
  stream.filter_manager_.onStreamComplete();

  stream.filter_manager_.log();  //  access_log and http_filter_log handler will be called here

  stream.filter_manager_.destroyFilters();
……
}

void log() {
……
    for (const auto& log_handler : access_log_handlers_) {   // access_log handler will be called first
      log_handler->log(request_headers, response_headers, response_trailers, stream_info_);
    }
……
  }

Expect

The access_log handler may be better to runs after all log handler of http filter.

wangfakang commented 3 years ago

@mattklein123 @htuch Plz help reviews it, I can pull PR if that's ok.

htuch commented 3 years ago

Checking my understanding. You are saying that the per-filter access log handlers run after the access log sink handlers registered with HCM?

It seems reasonable for them to run before, but it seems kind of hacky to be mutating stream state at this point when the stream is undergoing destruction.

wangfakang commented 3 years ago

Checking my understanding. You are saying that the per-filter access log handlers run after the access log sink handlers registered with HCM?

Yeah. It means per-filter-log and access-log handler should be runs in the same order as the p2 picture.

image

It seems reasonable for them to run before, but it seems kind of hacky to be mutating stream state at this point when the stream is undergoing destruction.

Usually we might need to do something in the filter log phase and then log it via the access_log handler, but this is invalid now. Forexample we create the metadata xxx :yy is set in the log phase of HTTP filter, then the %DYNAMIC_METADATA(xxx :yy)% in access_log format does not work.

So I propose that the runtime of the access_log handler should be placed at the end of the per-filter log, Just like the picture p2 above.

wangfakang commented 3 years ago

ping @htuch

htuch commented 3 years ago

This seems reasonable, although I still don't fully grok why you do dynamic metadata creation during destruction.

wangfakang commented 3 years ago

This seems reasonable, although I still don't fully grok why you do dynamic metadata creation during destruction.

@htuch Forexample, we need to unify the processing time of some http filter, and then record the statistical results to the access log through dynamic metadata.

TAOXUY commented 1 year ago

+1 on this request

spacewander commented 3 months ago

@htuch @mattklein123 What about adding a new method prependAccessLogHandler? Adding this method won't break any code. If a filter wants to record the statistical results to the access log through dynamic metadata, it can choose the new method. The addAccessLogHandler can still be used by the HCM to insert multiple access log handlers in the current order.

This also solves #30859 easily: Just change one line addAccessLogHandler to prependAccessLogHandler.

htuch commented 3 months ago

Sounds reasonable to me, @mattklein123?

Elliot-xq commented 3 months ago

Hey @spacewander, could you elaborate more on the prependAccessLogHandler? is prependAccessLogHandler still adding loggers to the access_loghandlers? Why could it ensure that access_logger logs after the filter log?

to provide some context: Seems we still require fix for https://github.com/envoyproxy/envoy/issues/30859 . We would love to make the prependAccessLogHandler changes if it could solve our problem.

mattklein123 commented 3 months ago

Sure sounds good to me.

spacewander commented 3 months ago

Hey @spacewander, could you elaborate more on the prependAccessLogHandler? is prependAccessLogHandler still adding loggers to the access_loghandlers? Why could it ensure that access_logger logs after the filter log?

to provide some context: Seems we still require fix for #30859 . We would love to make the prependAccessLogHandler changes if it could solve our problem.

@Elliot-xq

Yes, prependAccessLogHandler still adds loggers to the access_log_handlers_, just in a different direction.

The current status is:

  1. Envoy adds the access log handler by appending it to the end
  2. the access_loggers are added before the filter loggers

So if we provide a new method to add an access log handler by prepending it, we can use this method to add the filter log before the access_loggers, without changing the execution order (just change the method we call in the filter).

I am also going to change the golang filter's access log, which is a part of my recent job to improve its access log mechanism. Hope this would be helpful to you.

Elliot-xq commented 3 months ago

Thanks for the explanation @spacewander !!