envoyproxy / envoy

ext-authz cannot be used with DNS. #4637

Open taion809 opened 5 years ago

taion809 commented 5 years ago

Title: ext-authz cannot be used with DNS.

Description:

When a statically configured ext-authz cluster has a cluster type of STRICT_DNS or LOGICAL_DNS, the ext-authz filter uses the incoming host header for routing requests. This is problematic: if the wrong host is selected during routing, traffic flows to a cluster that may return 200 and allow traffic through where it shouldn't.

Repro steps:

Config:

filters:
  - name: envoy.ext_authz
    stat_prefix: ext_authz
    grpc_service:
      envoy_grpc:
        cluster_name: ext-authz

clusters:
  - name: ext-authz
    type: STRICT_DNS
    http2_protocol_options: {}
    hosts:
      - socket_address: { address: auth.local, port_value: 80 }

mattklein123 commented 5 years ago

This sounds problematic if true, though I would be surprised if this is how it works. @gsagula?

gsagula commented 5 years ago

@taion809 The ext-authz filter does not use the host header for routing requests. It just passes the attributes of the HTTP request to the authorization server. Please see the check request: https://www.envoyproxy.io/docs/envoy/latest/configuration/network_filters/ext_authz_filter.html?highlight=authz%20filter

taion809 commented 5 years ago

I'm going to close this for now; there are some shenanigans with curl that need further investigation. I wasn't able to reproduce this when deployed to the cluster, only locally.

taion809 commented 5 years ago

Hey, I was able to reproduce this reliably. The filter in question is the ext-authz HTTP filter, not the network filter of the same name.

Given the following sample Envoy config:

node:
  id: machine001
  cluster: envoy.local
  locality:
    region: eastus2

admin:
  access_log_path: /tmp/admin_access.log
  address:
    socket_address:
      protocol: TCP
      address: 127.0.0.1
      port_value: 9901
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        protocol: TCP
        address: 0.0.0.0
        port_value: 80
    filter_chains:
    - filters:
      - name: envoy.http_connection_manager
        config:
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: foo
              domains: ["foo-svc.envoy.local", "foo-svc.envoy.local:80"]
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: foo
          http_filters:
          - name: envoy.ext_authz
            config:
              http_service:
                server_uri:
                  uri: http://ext-auth.envoy.remote/
                  cluster: ext-authz
                  timeout: 0.25s
              failure_mode_allow: false
          - name: envoy.health_check
            config:
              pass_through_mode: false
              headers:
                - name: ":path"
                  prefix_match: "/_internal/health"
          - name: envoy.router
  clusters:
  - name: ext-authz
    connect_timeout: 3.25s
    type: LOGICAL_DNS
    lb_policy: LEAST_REQUEST
    hosts:
    - socket_address:
        address: ext-auth.envoy.remote
        port_value: 80
  - name: foo
    connect_timeout: 3.25s
    type: LOGICAL_DNS
    lb_policy: LEAST_REQUEST
    hosts:
    - socket_address:
        address: foo-svc.envoy.local
        port_value: 80

I issued the following HTTP request:

http get foo-svc.envoy.local

The following packet dump was generated by running:

> dig ext-authz.envoy.remote
;; ANSWER SECTION:
ext-authz.envoy.remote. 3600 IN A 10.0.1.1 (load balancer IP)

> tshark -f 'tcp port 80 and host 10.0.1.1' -V

The IP address is that of a load balancer. The issue can also be reproduced without an external load balancer in the mix, using the local Envoy cluster.

Packet dump:

Transmission Control Protocol, Src Port: 58762, Dst Port: 80, Seq: 1, Ack: 1, Len: 186
    Source Port: 58762
    Destination Port: 80
    [Stream index: 0]
    [TCP Segment Len: 186]
    Sequence number: 1    (relative sequence number)
    [Next sequence number: 187    (relative sequence number)]
    Acknowledgment number: 1    (relative ack number)
    1000 .... = Header Length: 32 bytes (8)
    Flags: 0x018 (PSH, ACK)
        000. .... .... = Reserved: Not set
        ...0 .... .... = Nonce: Not set
        .... 0... .... = Congestion Window Reduced (CWR): Not set
        .... .0.. .... = ECN-Echo: Not set
        .... ..0. .... = Urgent: Not set
        .... ...1 .... = Acknowledgment: Set
        .... .... 1... = Push: Set
        .... .... .0.. = Reset: Not set
        .... .... ..0. = Syn: Not set
        .... .... ...0 = Fin: Not set
        [TCP Flags: ·······AP···]
    Window size value: 262
    [Calculated window size: 262]
    [Window size scaling factor: -1 (unknown)]
    Checksum: 0x22cb [unverified]
    [Checksum Status: Unverified]
    Urgent pointer: 0
    Options: (12 bytes), No-Operation (NOP), No-Operation (NOP), Timestamps
        TCP Option - No-Operation (NOP)
            Kind: No-Operation (1)
        TCP Option - No-Operation (NOP)
            Kind: No-Operation (1)
        TCP Option - Timestamps: TSval 1691091502, TSecr 2895874851
            Kind: Time Stamp Option (8)
            Length: 10
            Timestamp value: 1691091502
            Timestamp echo reply: 2895874851
    [SEQ/ACK analysis]
        [Bytes in flight: 186]
        [Bytes sent since last PSH flag: 186]
    [Timestamps]
        [Time since first frame in this TCP stream: 0.000000000 seconds]
        [Time since previous frame in this TCP stream: 0.000000000 seconds]
    TCP payload (186 bytes)
Hypertext Transfer Protocol
    GET / HTTP/1.1\r\n
        [Expert Info (Chat/Sequence): GET / HTTP/1.1\r\n]
            [GET / HTTP/1.1\r\n]
            [Severity level: Chat]
            [Group: Sequence]
        Request Method: GET
        Request URI: /
        Request Version: HTTP/1.1
    host: foo-svc.envoy.local\r\n
    content-length: 0\r\n
        [Content length: 0]
    x-envoy-internal: true\r\n
    x-forwarded-for: 10.0.0.1\r\n
    x-envoy-expected-rq-timeout-ms: 3250\r\n
    \r\n
    [Full request URI: http://foo-svc.envoy.local/]
    [HTTP request 1/1]

Transmission Control Protocol, Src Port: 80, Dst Port: 58762, Seq: 1, Ack: 187, Len: 388
    Source Port: 80
    Destination Port: 58762
    [Stream index: 0]
    [TCP Segment Len: 388]
    Sequence number: 1    (relative sequence number)
    [Next sequence number: 389    (relative sequence number)]
    Acknowledgment number: 187    (relative ack number)
    1000 .... = Header Length: 32 bytes (8)
    Flags: 0x018 (PSH, ACK)
        000. .... .... = Reserved: Not set
        ...0 .... .... = Nonce: Not set
        .... 0... .... = Congestion Window Reduced (CWR): Not set
        .... .0.. .... = ECN-Echo: Not set
        .... ..0. .... = Urgent: Not set
        .... ...1 .... = Acknowledgment: Set
        .... .... 1... = Push: Set
        .... .... .0.. = Reset: Not set
        .... .... ..0. = Syn: Not set
        .... .... ...0 = Fin: Not set
        [TCP Flags: ·······AP···]
    Window size value: 269
    [Calculated window size: 269]
    [Window size scaling factor: -1 (unknown)]
    Checksum: 0x547d [unverified]
    [Checksum Status: Unverified]
    Urgent pointer: 0
    Options: (12 bytes), No-Operation (NOP), No-Operation (NOP), Timestamps
        TCP Option - No-Operation (NOP)
            Kind: No-Operation (1)
        TCP Option - No-Operation (NOP)
            Kind: No-Operation (1)
        TCP Option - Timestamps: TSval 2895990049, TSecr 1691091502
            Kind: Time Stamp Option (8)
            Length: 10
            Timestamp value: 2895990049
            Timestamp echo reply: 1691091502
    [SEQ/ACK analysis]
        [This is an ACK to the segment in frame: 1]
        [The RTT to ACK the segment was: 0.000577400 seconds]
        [Bytes in flight: 388]
        [Bytes sent since last PSH flag: 388]
    [Timestamps]
        [Time since first frame in this TCP stream: 0.000577400 seconds]
        [Time since previous frame in this TCP stream: 0.000577400 seconds]
    TCP payload (388 bytes)
Hypertext Transfer Protocol
    HTTP/1.1 503 Service Unavailable\r\n
        [Expert Info (Chat/Sequence): HTTP/1.1 503 Service Unavailable\r\n]
            [HTTP/1.1 503 Service Unavailable\r\n]
            [Severity level: Chat]
            [Group: Sequence]
        Response Version: HTTP/1.1
        Status Code: 503
        [Status Code Description: Service Unavailable]
        Response Phrase: Service Unavailable
    location: http://foo-svc.envoy.local/\r\n
    strict-transport-security: max-age=31536000; includeSubDomains;\r\n
    referrer-policy: strict-origin-when-cross-origin\r\n
    x-frame-options: SAMEORIGIN\r\n
    x-xss-protection: 1; mode=block;\r\n
    x-content-type-options: nosniff\r\n
    date: Mon, 29 Oct 2018 20:21:11 GMT\r\n
    server: envoy\r\n
    content-length: 0\r\n
        [Content length: 0]
    \r\n
    [HTTP response 1/1]
    [Time since request: 0.000577400 seconds]
    [Request in frame: 1]

In this case foo-svc is not a member of the envoy.remote cluster, so Envoy responds with a 503.

It seems that the HTTP filter is using the incoming host header as the ext-authz host header. Looking at the code here, it does seem that the incoming request headers are being used as the outgoing request headers.
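
To make the suspected behavior concrete, here is a minimal Python model of it. This is only a sketch and not Envoy's actual code: `build_check_request` and `route_by_host` are hypothetical names, and the virtual host table is an assumption standing in for whatever proxy sits in front of the authorization service.

```python
# Hypothetical model of the behavior reported above; not Envoy's actual code.

def build_check_request(incoming_headers):
    """Copy the incoming request headers verbatim into the authorization request."""
    return dict(incoming_headers)


def route_by_host(virtual_hosts, headers):
    """Pick a backend by Host header; answer 503 when no virtual host matches."""
    backend = virtual_hosts.get(headers.get("host"))
    return (200, backend) if backend else (503, None)


# Assumed virtual hosts known to the proxy in front of the authorization service.
remote_virtual_hosts = {"ext-auth.envoy.remote": "auth-backend"}

incoming = {":method": "GET", ":path": "/", "host": "foo-svc.envoy.local"}
check_request = build_check_request(incoming)

# The authorization request still carries the original host, so routing fails.
status, backend = route_by_host(remote_virtual_hosts, check_request)
print(status, backend)  # 503 None
```

The model matches the packet dump: the request sent to the authorization endpoint carries `host: foo-svc.envoy.local`, which the remote side cannot match.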

mattklein123 commented 5 years ago

Yes, we should be setting the host header to the host name of the upstream cluster, or maybe using the "auto host" option that the cluster already supports. I will switch this over to bug/help wanted. cc @gsagula
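
One way to picture the first option is the Python sketch below. It is an illustration under assumptions, not the eventual Envoy fix: `build_check_request` is a hypothetical helper, and deriving the host from the configured `server_uri` is just one possible source for the upstream host name.

```python
# Hypothetical sketch of overriding the host header; not Envoy's actual code.
from urllib.parse import urlsplit


def build_check_request(incoming_headers, server_uri):
    """Copy the incoming headers, then point the host at the authorization service."""
    headers = dict(incoming_headers)
    # Assumption: the host is taken from the server_uri in the filter config.
    headers["host"] = urlsplit(server_uri).netloc
    return headers


incoming = {":method": "GET", ":path": "/", "host": "foo-svc.envoy.local"}
check_request = build_check_request(incoming, "http://ext-auth.envoy.remote/")

print(check_request["host"])  # ext-auth.envoy.remote
```

The incoming headers are copied rather than mutated, so the original request still reaches the `foo` cluster with its own host header.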

gsagula commented 5 years ago

@taion809 Thanks for reporting it.

@mattklein123 I'm currently working on enhancements for ext-authz. I can either include this fix in https://github.com/envoyproxy/envoy/issues/4756 or, if preferable, tackle it separately once I have finished 4756.

mattklein123 commented 5 years ago

@gsagula Separate PR preferred, please!

gsagula commented 5 years ago

/assign gsagula

rr-sarvesh-padia commented 2 years ago

Hi @gsagula, @mattklein123, any update on this issue? Is there a workaround? Does this issue happen only for http_service, or does it apply to grpc_service as well? Thanks