envoyproxy / envoy

gcp_authn: Deprecation of http_uri breaks use cases where ?format=full must be used #35651

Open mark-adams opened 1 month ago

mark-adams commented 1 month ago

Title: gcp_authn: Deprecation of http_uri breaks use cases where ?format=full must be used

Description: Prior to #35173, users could use the http_uri config to set the URL pattern used when obtaining an ID token from the GCP metadata server. After #35173, the field is marked as deprecated, with a note that it will be removed in a future version.

The normal GCP metadata URL also accepts a ?format=full query parameter which adds additional useful attributes to the ID token that are needed for some use cases. Specifically, the full token format adds the following attributes: email, google.compute_engine.instance_id, google.compute_engine.instance_name, google.compute_engine.project_id, google.compute_engine.project_number, and google.compute_engine.zone.
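To make the difference concrete, here is a minimal sketch of how a receiving service might check whether a token carries the full-format attributes listed above. It assumes the dotted names map to nested JSON objects in the token payload, and it decodes the payload without verifying the signature, so it is only suitable for inspection. The names FULL_FORMAT_CLAIMS, decode_payload, and missing_full_claims are made up for this example.

```python
import base64
import json

# Claims the issue says are added by the full token format. The dotted names
# are ASSUMED to correspond to nested JSON objects in the token payload.
FULL_FORMAT_CLAIMS = [
    "email",
    "google.compute_engine.instance_id",
    "google.compute_engine.instance_name",
    "google.compute_engine.project_id",
    "google.compute_engine.project_number",
    "google.compute_engine.zone",
]


def decode_payload(token: str) -> dict:
    """Decode the JWT payload segment WITHOUT verifying the signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments are unpadded base64url; restore padding before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def missing_full_claims(payload: dict) -> list:
    """Return the full-format claims that are absent from the payload."""
    missing = []
    for dotted in FULL_FORMAT_CLAIMS:
        node = payload
        for key in dotted.split("."):
            if not isinstance(node, dict) or key not in node:
                missing.append(dotted)
                break
            node = node[key]
    return missing
```

A remote host that needs these attributes would see every entry reported as missing when the token was minted in the standard format.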

This full token format is also accessible by running gcloud auth print-identity-token --audiences=<some-audience> --token-format=full.

With the deprecation of the http_uri config, there is no longer a non-deprecated way to set the token format that the gcp_authn filter should use, which breaks use cases where remote hosts require those attributes to be present in the ID token.

One potential fix would be to introduce a parameter that would allow a user to specify the token format (either standard or full).
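As a rough illustration of that idea (not the filter's actual implementation or API), the sketch below maps a two-value format setting onto the metadata URL used in the config in this issue. The TokenFormat enum and identity_url function are hypothetical names, and omitting the query parameter for the standard format is an assumption about how a default might be handled.

```python
from enum import Enum
from urllib.parse import urlencode

# Host and path taken from the http_uri value in the config in this issue.
METADATA_IDENTITY_URL = (
    "http://metadata/computeMetadata/v1/instance/service-accounts/default/identity"
)


class TokenFormat(Enum):
    """Hypothetical enum with the two formats named in this issue."""

    STANDARD = "standard"
    FULL = "full"


def identity_url(audience: str, token_format: TokenFormat = TokenFormat.STANDARD) -> str:
    """Build the metadata server URL for an ID token request."""
    params = {"audience": audience}
    # ASSUMPTION: only add format for the non-default value, so URLs for
    # existing users stay unchanged.
    if token_format is TokenFormat.FULL:
        params["format"] = token_format.value
    return METADATA_IDENTITY_URL + "?" + urlencode(params)
```

Restricting the setting to an enum keeps users from injecting arbitrary query parameters, which a free-form URL field allowed.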

Repro steps: I've included a configuration below demonstrating the usage of the http_uri config to set the format=full query parameter. Prior to #35173, you simply added the query parameter to the URL and the appropriate full token would be added to the request as expected.

After #35173, it is necessary to work around the deprecation by setting envoy.reloadable_features.gcp_authn_use_fixed_url to false.

Config:

static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          access_log:
          - name: envoy.access_loggers.stdout
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                route:
                  host_rewrite_literal: some-host-needing-gcp-auth
                  cluster: remote
          http_filters:
          - name: "envoy.filters.http.gcp_authn"
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.gcp_authn.v3.GcpAuthnFilterConfig
              http_uri:
                uri: "http://metadata/computeMetadata/v1/instance/service-accounts/default/identity?audience=[AUDIENCE]&format=full"
                cluster: "gcp_authn"
                timeout: 10s
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router

  clusters:
  - name: remote
    type: LOGICAL_DNS
    # Comment out the following line to test on v6 networks
    dns_lookup_family: V4_ONLY
    load_assignment:
      cluster_name: remote
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: some-host-needing-gcp-auth
                port_value: 443
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
        sni: some-host-needing-gcp-auth
    metadata:
      typed_filter_metadata:
        envoy.filters.http.gcp_authn:
          "@type": type.googleapis.com/envoy.extensions.filters.http.gcp_authn.v3.Audience
          url: https://some-host-needing-gcp-auth
  - name: gcp_authn
    type: STRICT_DNS
    connect_timeout: 5s
    dns_lookup_family: V4_ONLY
    load_assignment:
      cluster_name: "gcp_authn"
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: "metadata.google.internal"
                port_value: 80

layered_runtime:
  layers:
  - name: "static"
    static_layer:
      envoy.reloadable_features.gcp_authn_use_fixed_url: false

Logs:

[2024-08-09 19:31:53.072][8769][warning][misc] [source/common/protobuf/message_validator_impl.cc:21] Deprecated field: type envoy.extensions.filters.http.gcp_authn.v3.GcpAuthnFilterConfig Using deprecated option 'envoy.extensions.filters.http.gcp_authn.v3.GcpAuthnFilterConfig.http_uri' from file gcp_authn.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/version_history/version_history for details. If continued use of this field is absolutely necessary, see https://www.envoyproxy.io/docs/envoy/latest/configuration/operations/runtime#using-runtime-overrides-for-deprecated-features for how to apply a temporary and highly discouraged override.

mark-adams commented 1 month ago

cc: @markdroth @tyxia since you both were involved in the discussion on #35173

adisuissa commented 1 month ago

cc @tyxia as codeowner

tyxia commented 1 month ago

Thanks for reporting @mark-adams and sorry for breaking you.

I think one solution could be something like:

(1) Change the fixed string to include format and licenses fields: http://metadata/computeMetadata/v1/instance/service-accounts/default/identity?audience=[AUDIENCE]&format=[FORMAT]&licenses=[LICENSES]

(2) Use default values, but give users options to configure the format and licenses fields, like what we do for audience.
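A minimal sketch of that placeholder substitution, written in Python purely for illustration (the filter itself is C++). The render_url name, the "standard" and FALSE defaults, and the percent-encoding of the audience are assumptions, not the filter's confirmed behavior.

```python
from urllib.parse import quote

# Fixed template from the proposal above.
URL_TEMPLATE = (
    "http://metadata/computeMetadata/v1/instance/service-accounts/default/identity"
    "?audience=[AUDIENCE]&format=[FORMAT]&licenses=[LICENSES]"
)


def render_url(audience: str, token_format: str = "standard", licenses: bool = False) -> str:
    """Fill the fixed template; defaults here are ASSUMED, not confirmed."""
    if token_format not in ("standard", "full"):
        raise ValueError(f"unsupported token format: {token_format!r}")
    return (
        URL_TEMPLATE
        # Encode the audience first so it cannot smuggle in other placeholders
        # or extra query parameters.
        .replace("[AUDIENCE]", quote(audience, safe=""))
        .replace("[FORMAT]", token_format)
        .replace("[LICENSES]", "TRUE" if licenses else "FALSE")
    )
```

With this shape the URL itself stays fixed, and only three validated values vary per configuration.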

Will this work for you @mark-adams ?

Also cc @markdroth who initially proposed the deprecation API change.

mark-adams commented 1 month ago

@tyxia 👍 That seems like a super reasonable solution to me!

markdroth commented 1 month ago

I think the key question here is, how would you specify this additional query param if you were using the google auth libraries instead of using Envoy? I don't think we want to provide any API here that is different than the API provided by the auth libraries. The reason for #35173 was to ensure that we were not providing any API that could not be supported by a gRPC implementation that is delegating to the google auth libraries for the actual functionality.

@ejona86 @dfawley

mark-adams commented 1 month ago

I'm not sure about the libraries. With gcloud, you just pass the --token-format argument to gcloud auth print-identity-token.

mark-adams commented 1 month ago

Most of my day-to-day is in Go so that's the quickest place for me to find an example. It looks like google-cloud-go supports an opts.ComputeTokenFormat option that can be passed to idtoken.NewCredentials()

https://github.com/googleapis/google-cloud-go/blob/60ad7f32d69ed3986fb1326841bba2c5dc36e508/auth/credentials/idtoken/compute.go#L39

markdroth commented 1 month ago

Okay, if the auth libraries provide that option, then it makes sense for xDS to do so as well. So let's add a similar option in the GCP auth filter.

It looks like the auth library specifies the format as basically an enum (https://github.com/googleapis/google-cloud-go/blob/60ad7f32d69ed3986fb1326841bba2c5dc36e508/auth/credentials/idtoken/idtoken.go#L33), so we should probably do the same thing here.

Can you also check the java auth library to make sure the story is the same there? Thanks!

mark-adams commented 1 month ago

No problem! It took some digging but it looks like the Java library uses an option that can be passed in.

Interestingly, the Ruby library seems to always request ?format=full, as does the Node.js version.

markdroth commented 1 month ago

Okay, cool. As long as Java and Go have these options, it seems reasonable for us to support them.

For gRPC, Ruby is based on C++, so we wouldn't wind up using the Ruby libraries. And if we ever support this for Node, we can work with the auth libraries team to add the necessary options there.

I think we can proceed with using an enum here.

Thanks for doing this investigation!

tyxia commented 1 month ago

@mark-adams Do you want to go ahead and implement this? I have been pretty busy recently with some internal priorities, so I don't expect to have bandwidth anytime soon.

If you have any questions, please feel free to let me know. Thanks!

mark-adams commented 4 weeks ago

Hey @tyxia! I'd be glad to jump in but it looks like I'm running into some issues getting the build running successfully on my M3 Mac. I've been trying for a bit to get the dev containers environment to work but apparently there's some issue with the toolchain not liking aarch64. Even when I clone it fresh on my workstation directly instead of using dev containers, I'm getting compiler errors (error: call to undeclared function 'eaccess'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]).

Any tips?

tyxia commented 3 weeks ago

@mark-adams have you checked this doc https://www.envoyproxy.io/docs/envoy/latest/start/building ?

In my past experience, I have had luck with adding build --config=remote-clang to user.bazelrc. However, I am not sure if it is directly related to your issue.

I can also add you to the Envoy developer Slack channel to see if anyone else has run into a similar issue.