grpc / grpc-web

gRPC for Web Clients
https://grpc.io
Apache License 2.0

Getting 503 errors when trying to communicate with Envoy using TLS on localhost #1333

Closed: shanifdhanani closed this issue 1 year ago

shanifdhanani commented 1 year ago

I posted this on Stack Overflow, but wanted to post it here too in case any of the experts here might be able to help. Apologies if that's not the right way to do this.

Anyway, I'm building a gRPC-enabled service that handles API requests from a ReactJS frontend, using Envoy as a proxy. All services are currently running on my local machine for development.

Everything is running fine without TLS, and now I'm trying to configure a more secure connection between all services.

I added TLS to the backend gRPC service with self-signed certificates, using the following Java code:

Backend server code

InputStream serverCertInputStream = getClass().getClassLoader().getResourceAsStream("localhost/server.crt");
InputStream privateKeyInputStream = getClass().getClassLoader().getResourceAsStream("localhost/server.pem");
ServerCredentials creds = TlsServerCredentials.create(serverCertInputStream, privateKeyInputStream);

server = Grpc.newServerBuilderForPort(this.port, creds)
    .addService(
        ServerInterceptors.intercept(
            new ApiServiceImplementer(),
            new AuthorizationInterceptor(ApiServiceConfig.getJwtIssuer(), ApiServiceConfig.getJwtAudience(), PublicKeyContainer.DefaultAuth0PublicRsaKeyForDevelopment)
        ))
    .addService(ProtoReflectionService.newInstance())
    .build();

This is working fine. I verified it by using a few other backend Java services as test clients, and they could successfully reach this service without going through Envoy.

However, after updating my Envoy config for TLS (full config below), I keep running into issues.

I haven't been able to get any of my frontend requests past Envoy and through to the backend gRPC server.

Here is my Envoy config:

static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address: { address: 0.0.0.0, port_value: 8079 }
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                codec_type: auto
                stat_prefix: ingress_http
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: ["*"]
                      routes:
                        - match: { prefix: "/" }
                          route:
                            cluster: api_service
                            max_grpc_timeout: 60s
                      cors:
                        allow_origin_string_match:
                          - prefix: "*"
                        allow_methods: GET, PUT, DELETE, POST, OPTIONS
                        allow_headers: accept,access-control-allow-origin,origin,accept-encoding,accept-language,connection,authorization,keep-alive,user-agent,cache-control,content-type,content-transfer-encoding,custom-header-1,x-accept-content,x-accept-content-transfer-encoding,x-accept-response-streaming,x-user-agent,x-grpc-web,grpc-timeout
                        max_age: "1728000"
                        expose_headers: custom-header-1,grpc-status,grpc-message,grpc-status-details-bin,authorization
                http_filters:
                  - name: envoy.filters.http.grpc_web
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.grpc_web.v3.GrpcWeb
                  - name: envoy.filters.http.cors
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.cors.v3.Cors
                  - name: envoy.filters.http.router
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
          transport_socket:
            name: envoy.transport_sockets.tls
            typed_config:
              # https://www.envoyproxy.io/docs/envoy/v1.15.0/api-v3/extensions/transport_sockets/tls/v3/tls.proto#extensions-transport-sockets-tls-v3-downstreamtlscontext
              "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
              common_tls_context:
                tls_certificates:
                  - certificate_chain:
                      filename: /etc/server.crt
                    private_key:
                      filename: /etc/server.pem

  clusters:
    - name: api_service
      connect_timeout: 50.25s
      type: logical_dns
      lb_policy: round_robin
      load_assignment:
        cluster_name: api_service
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: host.docker.internal
                      port_value: 8083
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          common_tls_context:
#            tls_certificates:
#              - certificate_chain:
#                  filename: /etc/server.crt
#                private_key:
#                  filename: /etc/server.pem
#            validation_context:
#              match_subject_alt_names:
#                - exact: "localhost"
#              trusted_ca:
#                filename: /etc/ca.crt

Here's the command I'm using to launch Envoy on Docker:

docker build -t envoy -f api_service/envoy/Dockerfile . && docker run -d -p 8079:8079 --name envoy envoy

Here's my Dockerfile:

FROM envoyproxy/envoy:v1.26-latest
COPY api_service/envoy/envoy.yaml /etc/envoy/envoy.yaml
ADD infra/tls/localhost/server.crt /etc/server.crt
ADD infra/tls/localhost/server.pem /etc/server.pem
ADD infra/tls/localhost/ca.crt /etc/ca.crt
ADD infra/tls/localhost/ca.key /etc/ca.key
ADD infra/tls/localhost/client.crt /etc/client.crt
ADD infra/tls/localhost/client.pem /etc/client.pem
EXPOSE 8079
CMD /usr/local/bin/envoy -c /etc/envoy/envoy.yaml

And here is the exact error I'm getting from my local frontend:

https://localhost:8079/com.locusive.api_service.v1.ApiService/GetUser 503 (Service Unavailable)

code: 14
message: "Http response at 400 or 500 level, http status code: 503"
metadata: {}
stack: "Error: Http response at 400 or 500 level, http status code: 503\n    at new E (https://localhost:8080/static/js/bundle.js:45550:13)

Here is a list of things I've tried so far:

  1. Removing the http2_protocol_options
  2. Adding the self-signed certs to my local keychain (I'm using a Mac)
  3. Continuously tweaking parts of my Envoy config based on research online. The latest version is what you see above (you can see areas where I've commented things out)

Once this is working, I'd expect to see a successful response from the gRPC server to the frontend, just like in the scenario where there is no TLS involved.

Does anyone have any ideas on what might be the cause here? I'm definitely getting to a point where I'm low on ideas and things to try. Apologies if the answer is obvious; I haven't done much with SSL/Envoy before.

Thanks so much for the help!

shanifdhanani commented 1 year ago

It turns out that, in my case, I needed to add the line http2_protocol_options: {} to the cluster definition. Once I added it, everything worked.
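
For anyone hitting the same 503, here is a rough sketch of what the cluster section might look like with that line added. gRPC runs over HTTP/2, so without this option Envoy would talk HTTP/1.1 to the backend. The exact position of the line inside the cluster is an assumption (the comment only says "in the cluster definition"); everything else is copied from the config in the question, with the commented-out lines left out.

```yaml
static_resources:
  clusters:
    - name: api_service
      connect_timeout: 50.25s
      type: logical_dns
      lb_policy: round_robin
      # Tell Envoy to use HTTP/2 for upstream connections to the gRPC backend.
      # Placement at this level is assumed from "in the cluster definition".
      http2_protocol_options: {}
      load_assignment:
        cluster_name: api_service
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: host.docker.internal
                      port_value: 8083
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
```

Note that recent Envoy documentation marks the cluster-level http2_protocol_options field as deprecated in favor of typed_extension_protocol_options with envoy.extensions.upstreams.http.v3.HttpProtocolOptions, so the form above may produce a deprecation warning depending on the Envoy version.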