envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0
24.99k stars 4.81k forks source link

Question about Connection header processing in HTTP/1.1 requests #36080

Closed khe0613 closed 1 week ago

khe0613 commented 1 month ago

Hello, I have some questions about Envoy's handling of the Connection header when communicating with HTTP/1.1 in a Client -> Envoy -> Server structure.


Note

I am using Envoy 1.31.0

 

Question

  1. When communicating with a typical REST API (not streaming) between Client and Server, Envoy appears to strip the Connection header from the request header and response header. The reason for removing the Connection header appears to be to comply with HTTP/1.1 (RFC 7230), which states that “hop-by-hop” headers should not be retransmitted by proxies, is that correct?
  2. And the source code to remove Connection headers from requests/responses seems to be ConnectionManagerUtility::mutateRequestHeaders and ConnectionManagerUtility::mutateResponseHeaders in source/common/http/conn_manager_impl.cc, is that correct?
  3. However, for streaming responses (Chunked Transfer Encoding), which are not typical REST API responses, the Connection header is not removed from the response header, but is forced to be set to “Connection: close”. This seems to be done by the HTTP Connection Manager, can anyone tell me why it is being handled this way and the relevant source code?
  4. (Side question unrelated to the title) For streaming responses (Chunked Transfer Encoding), the Envoy Prometheus metric called envoy_http_downstream_rq_time_bucket doesn't seem to be collected normally. The metric is only collected occasionally, any idea why?

 

Additional explanation for question 3

Server explicitly sets “Connection: keep-alive” in the response header, which is changed to “Connection: close” by Envoy and responds to the client.

[Request headers sent by the client]
GET /audio/download HTTP/1.1
Host: localhost:10000
User-Agent: curl/8.6.0
Accept: */*
Connection: keep-alive
Transfer-Encoding: chunked
Content-Type: application/octet-stream

[Response headers received by the client]
HTTP/1.1 200 OK
content-disposition: attachment; filename="received_audio.wav"
content-type: application/octet-stream
content-length: 3364524
date: Thu, 12 Sep 2024 02:39:01 GMT
server: envoy
connection: close  # Why does Envoy force “Connection: close”?

Please comment if you need any additional information. Thank you.

ravenblackx commented 1 month ago

cc @alyssawilk

alyssawilk commented 1 month ago

Sorry for the delay in answering. As you called out, Envoy does remove the connection header from both requests and responses, as that header is hop by hop. Envoy will add back a connection: close header if it is unable to serve more requests on the connection. There are a number of reasons this can happen. If there's an early response, where Envoy has not read a full POST body. If the upstream sends a chunked response with no content length, and the downstream doesn't handle chunked encoding. If the connection is being drained due to request limits and lifetime issues. If you turn on debug logging the cause of the close may be in the logs, and the response details may shed some light as well.

khe0613 commented 1 month ago

@alyssawilk

I changed Envoy's logging level to trace as you suggested and noticed that when I send a HTTP/1.1 request with “Transfer-Encoding: chunked” in the request header, “Connection: close” is added to the response headers by the void ConnectionManagerImpl::ActiveStream::encodeHeaders(ResponseHeaderMap& headers, bool end_stream) method in conn_manager_impl.cc

[2024-09-11 05:47:52.661][15][trace][router] [source/common/router/upstream_request.cc:271] [Tags: "ConnectionId":"9","StreamId":"4310233326485317794"] end_stream: false, upstream response headers:
':status', '200'
'connection', 'keep-alive'
'content-disposition', 'attachment; filename="received_audio.wav"'
'content-type', 'application/octet-stream'
'content-length', '1079162'
'date', 'Wed, 11 Sep 2024 05:47:52 GMT'

[2024-09-11 05:47:52.661][15][debug][router] [source/common/router/router.cc:1551] [Tags: "ConnectionId":"9","StreamId":"4310233326485317794"] upstream headers complete: end_stream=false
[2024-09-11 05:47:52.661][15][trace][misc] [source/common/event/scaled_range_timer_manager_impl.cc:60] enableTimer called on 0x6d47f703e80 for 300000ms, min is 300000ms
[2024-09-11 05:47:52.661][15][trace][http] [source/common/http/filter_manager.cc:905] [Tags: "ConnectionId":"9","StreamId":"4310233326485317794"] commonEncodePrefix end_stream: false, isHalfCloseEnabled: false
[2024-09-11 05:47:52.661][15][debug][http] [source/common/http/conn_manager_impl.cc:1825] [Tags: "ConnectionId":"9","StreamId":"4310233326485317794"] encoding headers via codec (end_stream=false):
':status', '200'
'content-disposition', 'attachment; filename="received_audio.wav"'
'content-type', 'application/octet-stream'
'content-length', '1079162'
'date', 'Wed, 11 Sep 2024 05:47:52 GMT'
'server', 'envoy'
'connection', 'close'  # Added by http connection manager

I was curious how it was behaving this way, so I checked the code of the method void ConnectionManagerImpl::ActiveStream::encodeHeaders(ResponseHeaderMap& headers, bool end_stream)  

  1. If we send a HTTP/1.1 request with “Transfer-Encoding: chunked” in the request header, the value of the connection_manager_.drain_state_ variable is set to DrainState::Closing because filter_manager_.remoteDecodeComplete() returns false.

    // If we are destroying a stream before remote is complete and the connection does not support
    // multiplexing, we should disconnect since we don't want to wait around for the request to
    // finish.
    if (!filter_manager_.remoteDecodeComplete()) {
    if (connection_manager_.codec_->protocol() < Protocol::Http2) {
      connection_manager_.drain_state_ = DrainState::Closing;
    }
    
    connection_manager_.stats_.named_.downstream_rq_response_before_rq_complete_.inc();
    }

 

  1. Since the value of the connection_manager_.drain_state_ variable is set to DrainState::Closing, and the communication protocol is HTTP/1.1, the response header is forced to set “Connection: close”.
    if (connection_manager_.drain_state_ != DrainState::NotDraining &&
      connection_manager_.codec_->protocol() < Protocol::Http2) {
    // If the connection manager is draining send "Connection: Close" on HTTP/1.1 connections.
    // Do not do this for H2 (which drains via GOAWAY) or Upgrade or CONNECT (as the
    // payload is no longer HTTP/1.1)
    if (!state_.is_tunneling_) {
      headers.setReferenceConnection(Headers::get().ConnectionValues.Close);
    }
    }

 

My question here is why the value of the connection_manager_.drain_state_ variable must be set to the value of DrainState::Closing when filter_manager_.remoteDecodeComplete() returns false.  

  // If we are destroying a stream before remote is complete and the connection does not support
  // multiplexing, we should disconnect since we don't want to wait around for the request to
  // finish.

Although the conditional statement is annotated as above, I don't see why we need to disconnect connection as we can't wait for the HTTP/1.1 chunked request to complete.

alyssawilk commented 1 month ago

if remote decoding is false, it means Envoy hasn't read the full client request before sending the response. Envoy is not configured to drain the rest of the unframed request, so the connection becomes unusable for follow-up request, hence the connection: close

khe0613 commented 1 month ago

@alyssawilk Thank you for your quick response.  

Envoy is not configured to drain the rest of the unframed request

Why Envoy is not configured to drain the rest of the unframed request?

 

so the connection becomes unusable for follow-up request, hence the connection: close

Is this to prevent resource leaks where the connection is not released to the connection pool forever because the remaining unframed requests are not drained?

alyssawilk commented 1 month ago

This is an issue with the Envoy-client connection right? The connection pool is Envoy-upstream Traditionally the value of draining an unread body for HTTP/1.1 early responses hasn't been deemed worth the feature work, especially given the bulk of internet traffic is now HTTP/2 or HTTP/3. If someone wanted to add this as a config option I think it would be a welcome addition but generally for 5xx errors you can configure buffering and retries, and draining the connection for 4xx just isn't a win.

khe0613 commented 1 month ago

@alyssawilk Yes, the question was about downstream (Envoy-client) connection. I misspoke about the connection pool.

The feature to drain unread(unframed) request body for the early response was not added because it was not needed by the specification or nature of HTTP/1.1, and it was my understanding that the http connection manager adds “Connection: close” to the response header to prevent connection resource leakage(The phenomenon of an unusable connection being alive) that can be caused by undrained request body. Is that correct?

github-actions[bot] commented 2 weeks ago

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

github-actions[bot] commented 1 week ago

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.

khe0613 commented 4 days ago

@alyssawilk Could you please answer the question I asked at https://github.com/envoyproxy/envoy/issues/36080#issuecomment-2375593266?

alyssawilk commented 4 days ago

correct.