emissary-ingress / emissary

open source Kubernetes-native API gateway for microservices built on the Envoy Proxy
https://www.getambassador.io
Apache License 2.0
4.35k stars 681 forks source link

Dynamic CORS Origin matching using Envoy's `allow_origin_string_match` #3020

Open neomantra opened 3 years ago

neomantra commented 3 years ago

Please describe your use case / problem.

User has a backend service backend1 exposed by Ambassador, which handles CORS with a static list of allowed origins. The static list includes https://frontend.example.com, which is a frontend Web application which queries backend1. This static list works fine, but...

Upon commit, CI/CD creates an ephemeral "review app" for their frontend hosted at <branch-name>-frontend.example.com. However that URL is not on backend1's Mapping's cors.origins list, so CORS fails.

In essence, a dynamic list of CORS origins is needed.

Describe the solution you'd like

I want to be able to tell Ambassador that backend1 can have any Origin matching *.example.com or even *-frontend.example.com. If the Origin matches, then Access-Control-Allow-Origin is filled with the matching Origin.

Envoy now supports this using allow_origin_string_match which Ambassador already uses, but solely using the exact matcher.

One solution would be to present Envoy's entire StringMatcher API. But, perhaps this is overkill.

I prefer the idea of adding a field regex_origins, a YAML array of regex's that will be added to the CORS filter as safe_regex, just as the existing origins are added to the origins filter as exact matchers. This would give a lot of flexibility without much complexity.

Describe alternatives you've considered

Additional context

There was an issue similar to this before allow_origin_string_match was added to Envoy; it was closed as stale.

neomantra commented 3 years ago

Here's a snippet (untested) that shows the configuration handling I was thinking... I didn't go further than just typing this up as I recognize that this change (adding regex_origins) would require a Mapping schema version bump.

    def setup(self, ir: 'IR', aconf: Config) -> bool:
        # 'origins' cannot be treated like other keys, because if it's a
        # list, then it remains as is, but if it's a string, then it's
        # converted to a list.  It has already been validated by the
        # JSON schema to either be a string or a list of strings.
        origins = self.pop('origins', None)

        if origins is not None:
            if type(origins) is not list:
                origins = origins.split(',')

            self.allow_origin_string_match = [{'exact': origin} for origin in origins]

        # NEW CODE starts here
        # 'regex_origins' is handled 'origins' but without splitting a string into a list
        regex_origins = self.pop('regex_origins', None)

        if regex_origins is not None:
            if type(regex_origins) is not list:
                regex_origins = [regex_origins]
            regex_matchers = [{'safe_regex': regex_origin} for regex_origin in regex_origins]
            if self.allow_origin_string_match is None:
                self.allow_origin_string_match = []
            self.allow_origin_string_match.extend(regex_matchers)

    # ..... rest of setup()
stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

neomantra commented 3 years ago

I have taken an initial stab of this. I was having problems testing it ... I will ask some questions on Slack.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

iNoahNothing commented 3 years ago

@neomantra Exposing Envoy configuration is generally pretty straight forward so I imagine that someone in the community could raise a PR to add support for this.

In the meantime, I have taken a stab at creating a similar function using a lua script. The script below checks if the origin matches example.com or any subdomains of it and sets the access-control-allow-origin accordingly.

    lua_scripts: |
      function envoy_on_request(request_handle)
        local origin = request_handle:headers():get("origin")
        request_handle:streamInfo():dynamicMetadata():set("request_info", "request.origin", origin)
      end
      function envoy_on_response(response_handle)
        local origin = response_handle:streamInfo():dynamicMetadata():get("request_info")["request.origin"]
        if string.find(origin, ".*example.com")
        then
          response_handle:headers():replace("access-control-allow-origin", origin)
        end
      end

I think you should be able to adopt this to support your use cases until we have the Envoy cors configuration options exposed.

shadiramadan commented 1 year ago

I was honestly very surprised to see this was not supported. Istio has supported this for as long as I can remember. Can this be bumped? It looks like the mapping resource silently accepts the subdomain wildcard but then essentially returns an accept origin for a global wildcard. I feel like people have assumed this has worked but it actually isn’t, hence less requests for this feature.

This really should be bumped- it’s an easy win.

The Istio CORS policy supports exact, prefix, and regex. https://istio.io/latest/docs/reference/config/networking/virtual-service/#CorsPolicy

neomantra commented 1 year ago

I haven't used emissary-ingress in years (in part due to this issue), but what you are describing seems like it should be raised as it's own issue (and refer to this).

I'm describing a feature (and showed how it could be implemented), but what you are describing seems like a bug, perhaps with privacy implications? something like "wildcard CORS subdomain handled as an unscoped wildcard CORS" and include supporting example?

kflynn commented 2 months ago

Reopening – I think this shouldn't have been staled out.