envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0
25.02k stars 4.82k forks source link

HPE_INVALID_URL with non-urlencoded args #12073

Open m-messiah opened 4 years ago

m-messiah commented 4 years ago

I have an envoy as a http1.1 proxy with default config and when I try to access some non-unicode nonencoded params in uri - I have error 400 with HPE_INVALID_URL in debug log without any information. normalize_path is off by default and there is not a path - it is query string, so this behaviour looks strange.

Repro steps:

  listeners:
  - name: local_listener
    address:
      socket_address:
        protocol: TCP
        address: ::1
        port_value: 12701
    drain_type: MODIFY_ONLY
    traffic_direction: OUTBOUND
    filter_chains:
      - filters:
        - name: envoy.http_connection_manager
          typed_config:
            "@type": type.googleapis.com/envoy.config.filter.network.http_connection_manager.v2.HttpConnectionManager
            stat_prefix: local_balancer
            normalize_path: false
            route_config:
              name: local_balancer_route
              request_headers_to_add:
                - header:
                    key: "X-Local-Proxy"
                    value: "%HOSTNAME%"
              virtual_hosts:
                - name: localhost
                  domains: ["localhost:12701"]
                  routes:
                    - match: { prefix: "/" }
                       route:
                        cluster: example_cluster
                        timeout: 600s
                        retry_policy:
                          retry_on: gateway-error
                          num_retries: 5

cluster config omitted - default 80 port.

Request: curl 'http://localhost:12701/test?arg=Юникодные символы'

Expect: Correct proxy pass to backend Answer: 400 Bad Request

m-messiah commented 4 years ago

Yes, I have read https://tools.ietf.org/html/rfc3986#section-2 but other tools work with these characters correctly by passing them further (for example, curl, nginx, uwsgi)

yanavlasov commented 4 years ago

I think by default we compile http1 parser in HTTP_PARSER_STRICT mode, which limits URL characters to US_ASCII if I read the code correctly. This isn't something configuration controlled, you need to rebuild envoy to allow UTF characters in URLs. I have no sense for how important handling of such URLs is. I'll let others chime in.

yanavlasov commented 4 years ago

I've asked for opinions from the maintainer and nobody had any. I think the best way for to proceed is to actually implement the feature if you have dev experience. Otherwise I do not think it will gain any traction at this point.

sanya2022 commented 3 years ago

Hello. We have a problem with non-urlencoded args too. Did anyone try to resolve this issue?

sschepens commented 3 years ago

Hi, we're having issues with this too, doing some reasearch it would seem that a lot of utilities, languages or frameworks support utf-8 characters in URLs, including Go, Nginx and more.

Rejecting utf-8 characters in urls could be troublesome. Maybe envoy could compile http_parser in non-strict mode and do validations inside envoy so that they can be configurable?

I don't know how this stands in regards to #15814 which is changing the parser to llhttp

Edit: also maybe related to #13358 ?

jinnypark9393 commented 2 years ago

Hi, we have this issue as well. Did anyone resolve this issue or find bypass of this issue?

3-byte wide UTF-8 character in query params works fine with Envoy on Chrome browser and IE11 with https protocol but we have 400 error when we use IE11 with http protocol.