emissary-ingress / emissary

open source Kubernetes-native API gateway for microservices built on the Envoy Proxy
https://www.getambassador.io
Apache License 2.0
4.32k stars 683 forks source link

Make `emissary-ingress-apiext` optional #4224

Open Sh1ftry opened 2 years ago

Sh1ftry commented 2 years ago

Please describe your use case / problem. I want to install Emissary-ingress without being forced to use emissary-ingress-apiext.

Describe the solution you'd like There is a way to install Emissary-ingress with getambassador.io/v3alpha1 CRDs only.

Additional context I am installing Emissary-ingress on a cluster that do not have any getambassador.io/v2 custom resources, so I don't need support for migrating between versions.

cindymullins-dw commented 2 years ago

Hi @Sh1ftry, could you let us know what kind of issues this creates for you when it is included by default?

Sh1ftry commented 2 years ago

It's an additional component in the cluster, which needs to be monitored and regularly updated. Furthermore, in case of any problems it can cause disruptions in the cluster as mentioned in your documentation:

If the emissary-ingress-apiext Deployment's Pods all stop running, you will not be able to use getambassador.io/v3alpha1 CRDs until restarting the emissary-ingress-apiext Deployment.

The above makes Emissary-ingress less appealing to companies with strict regulations on what can and cannot be deployed to their clusters. This could easily be solved just by allowing users to simply disable emissary-ingress-apiext in case it's not needed.

Alan01252 commented 2 years ago

Agreed I would prefer to have this opt-out if possible, it's an external dependency we don't need as we can manage the CRD install ourselves.

We've also seen weird errors where the CRDs fail with

dry-run failed, error: conversion webhook for getambassador.io/v3alpha1, Kind=Host failed: Post "https://emissary-apiext.emissary-system.svc:443/webhooks/crd-convert?timeout=30s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)...

aaronjpitty commented 2 years ago

Yeah +1 to this request. I experienced a similar issue to what Alan01252 is describing above. This did help (https://stackoverflow.com/a/70608787/18679153) but I shouldn't have to manually change my Firewall rules to permit apiext to run. I'd rather not have the main source for all production traffic to be reliant on a service that might not actually be needed in my particular use case.

I do think it's a fantastic addition, especially with assisting those moving from Version X to Version Y, but I think it'd be a better place to have it as an optional deployment that can be enabled / disabled in the values file (as opposed to installing it via the https://app.getambassador.io/yaml/emissary/2.2.2/emissary-crds.yaml file).

taharah commented 2 years ago

The webhook is configured in the CRD definition. I think if you remove the block shown below from each of the CRDs, then the emissary-ingress-apiext resources should no longer be required. I haven't tested this yet since I ran into this issue when I was vendoring the CRD definitions to our internal environment and saw it included more resources than I expected.

conversion:
  strategy: Webhook
  webhook:
    clientConfig:
      service:
        name: emissary-apiext
        namespace: emissary-system
    conversionReviewVersions:
    - v1beta1
kharf commented 2 years ago

I also came across this and it feels not right to have this included in a crds file. The helm chart didn't have any reference to it. Also it does not have any resource requests, security context and readiness probe defined.

Our problem is that we use a custom namespace called "ingress" for our ingress resource and emissary-ingress (they have to be in the same namespace), which is defined by a HelmRelease. We use flux2+ kustomize to apply those resources. The namespace located in the crds file will be overridden by the defined namespace in a kustomization and applied, but there shouldn't be a namespace resource in the crds file in the first place. According to your documentation, emissary-ingress does not have to be installed in the emissary-system namespace. The helm chart also uses the namespace of the helmrelease. So in my opinion there shouldn't be an apiext deployment and no namespace in the crds file, it should only contain crds and the apiext deployment should be a template in the helm chart. The namespace in the crds shouldn't exist at all as it already exists as a template in the chart.

Helm supports crds: https://helm.sh/docs/chart_best_practices/custom_resource_definitions/#method-1-let-helm-do-it-for-you and https://helm.sh/docs/chart_best_practices/custom_resource_definitions/#method-2-separate-charts. Paired with flux2: https://fluxcd.io/docs/components/helm/helmreleases/#crds See linkerd as an example: https://artifacthub.io/packages/helm/linkerd2/linkerd2?modal=template&template=serviceprofile-crd.yaml

Maybe include that in the HelmChart with resource requests, securityContext, a readiness probe and options to disable it?

caddymofo commented 2 years ago

We've also come across this. We fully migrated from 1.x to 2.x and also believe the api-ext deploy should be optional. Not sure if anyone else noticed this but our kube-api pods get spammed by api-ext for failing to list/watch mapping/module/tlscontext.

E0701 16:39:38.022190       1 cacher.go:419] cacher (*unstructured.Unstructured): unexpected ListAndWatch error: failed to list getambassador.io/v1, Kind=Mapping: conversion webhook for getambassador.io/v2, Kind=Mapping failed: no kind "Mapping" is registered for version "getambassador.io/v1" in scheme "/go/cmd/apiext/main.go:50"; reinitializing...
E0701 16:39:38.117232       1 cacher.go:419] cacher (*unstructured.Unstructured): unexpected ListAndWatch error: failed to list getambassador.io/v1, Kind=Module: conversion webhook for getambassador.io/v2, Kind=Module failed: no kind "Module" is registered for version "getambassador.io/v1" in scheme "/go/cmd/apiext/main.go:50"; reinitializing...
E0701 16:39:38.721206       1 cacher.go:419] cacher (*unstructured.Unstructured): unexpected ListAndWatch error: failed to list getambassador.io/v1, Kind=TLSContext: conversion webhook for getambassador.io/v2, Kind=TLSContext failed: no kind "TLSContext" is registered for version "getambassador.io/v1" in scheme "/go/cmd/apiext/main.go:50"; reinitializing...
E0701 16:39:39.025949       1 cacher.go:419] cacher (*unstructured.Unstructured): unexpected ListAndWatch error: failed to list getambassador.io/v1, Kind=Mapping: conversion webhook for getambassador.io/v2, Kind=Mapping failed: no kind "Mapping" is registered for version "getambassador.io/v1" in scheme "/go/cmd/apiext/main.go:50"; reinitializing...
E0701 16:39:39.121784       1 cacher.go:419] cacher (*unstructured.Unstructured): unexpected ListAndWatch error: failed to list getambassador.io/v1, Kind=Module: conversion webhook for getambassador.io/v2, Kind=Module failed: no kind "Module" is registered for version "getambassador.io/v1" in scheme "/go/cmd/apiext/main.go:50"; reinitializing...
E0701 16:39:39.726207       1 cacher.go:419] cacher (*unstructured.Unstructured): unexpected ListAndWatch error: failed to list getambassador.io/v1, Kind=TLSContext: conversion webhook for getambassador.io/v2, Kind=TLSContext failed: no kind "TLSContext" is registered for version "getambassador.io/v1" in scheme "/go/cmd/apiext/main.go:50"; reinitializing...
Sh1ftry commented 2 years ago

I also had this issue @caddymofo. I fixed it by removing v1 versions from the CRDs https://github.com/emissary-ingress/emissary/blob/327dca4f771c0ec80b263768a12a4098e6944382/manifests/emissary/emissary-crds.yaml.in#L295-L302

This is also something I don't get, what's the point of keeping the v1 versions there?

EdDanileyko commented 1 year ago

Hello, I am doing a fresh installation of emissary-ingress on EKS via helm and am running into issues with the pods being able to POST to the apiext service. I don't have any necessity for v2 CRDs either and I can't figure out what is causing the problems I'm facing. It would be great to make this service optional as sometimes this issue occurs and sometimes not and I cant seem to figure out what the pattern is, but it is becoming a hindrance to my getting started.

phlax commented 1 year ago

reiterating what others have said - installing a new cluster, this was by far the most problematic part of the install.

even now with everything ~working - my argocd installation continually reloads emissary due to crd mismatches - which may/not be related (edit: unrelated)

woz5999 commented 1 year ago

I was investigating why my log ingest bill had a huge, inexplicable jump. Turns out it's due to gigabytes of these logs after an emissary "upgrade" last week:

E0701 16:39:38.022190 1 cacher.go:419] cacher (*unstructured.Unstructured): unexpected ListAndWatch error: failed to list getambassador.io/v1, Kind=Mapping: conversion webhook for getambassador.io/v2, Kind=Mapping failed: no kind "Mapping" is registered for version "getambassador.io/v1" in scheme "/go/cmd/apiext/main.go:50"; reinitializing...

The entire CRD migration path has been an absolute nightmare, time suck, and now money pit. Jeez.

siennathesane commented 1 year ago

This did help (https://stackoverflow.com/a/70608787/18679153) but I shouldn't have to manually change my Firewall rules to permit apiext to run. I'd rather not have the main source for all production traffic to be reliant on a service that might not actually be needed in my particular use case.

I can confirm that adding the port to my cluster's master firewall resolved my issue, but I also agree that I shouldn't be forced to use the migration path if I have no v2 resources.

sharkymcdongles commented 1 year ago

So to me, it seems this api-ext setup is to allow backwards compatibility to not have to carry multiple versions of the CRD for people who use older versions of ambassador/emissary. It also seems like it may convert stuff in the other direction. We are on v2 CRDs atm, so I have patched them myself to remove all v3 and v1 things as well as the webhook. Perhaps these are useful to others: https://gist.github.com/sharkymcdongles/19d4d729ceeda694c3bc2e5737c99ea3

I only tested them with my setup, but it seems to work as far as I have tested so far albeit I just made these change 5 min ago. :D

bchahn commented 1 year ago

+1 for making apiext optional

rootrahulagr commented 5 months ago

+1