kubernetes / ingress-nginx

Ingress NGINX Controller for Kubernetes
https://kubernetes.github.io/ingress-nginx/
Apache License 2.0

feature allow change for only clusterrole name #8936

Closed ohauer closed 2 months ago

ohauer commented 2 years ago

What happened:

Installing multiple ingress-nginx releases with the Helm chart in different namespaces overwrites the ClusterRole and ClusterRoleBinding, as they are not unique per namespace.

What you expected to happen: The ability to install multiple ingress-nginx releases in dedicated namespaces with unique ClusterRoles / ClusterRoleBindings.

Environment:

How to reproduce this issue: Given a shared dev cluster with a dedicated ingress controller per team (installed with one umbrella Helm chart and no dedicated values per team).

Issue: The ClusterRole and ClusterRoleBinding names are not unique enough; the rendered objects differ only in subjects.namespace. Rolling out many identical ingress releases with ArgoCD is therefore not possible, but there is a simple fix: name them with {{ include "ingress-nginx.fullname" . }}-{{ .Release.Namespace }}
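
A minimal sketch of the metadata under that scheme (assuming the chart's existing ingress-nginx.fullname helper; the exact template in PR #8937 may differ):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  # suffixing the release namespace makes the name unique per release
  name: {{ include "ingress-nginx.fullname" . }}-{{ .Release.Namespace }}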

The cluster-scoped elements PodSecurityPolicy and IngressClass pose less of a problem because they are always the same.

k8s-ci-robot commented 2 years ago

@ohauer: This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

ohauer commented 2 years ago

For testing, use the following Helm values file:

fullnameOverride: ingress-nginx
controller:
  ingressClassResource:
    enabled: false
    default: false
  scope:
    enabled: true
  admissionWebhooks:
    enabled: false
rbac:
  create: true
  scope: false

Then render the chart for several namespaces and compare:

helm -n ns1 template ingress-nginx <ingress-nginx chart> -f <values from above> > ns1.yaml
helm -n ns2 template ingress-nginx <ingress-nginx chart> -f <values from above> > ns2.yaml
helm -n ns3 template ingress-nginx <fixed ingress-nginx chart with pull request #8937> -f <values from above> > ns3.yaml

vimdiff ns1.yaml ns2.yaml ns3.yaml

Compare the ClusterRole and ClusterRoleBinding sections in the vimdiff output.

longwuyuan commented 2 years ago

/remove-kind bug

Can you describe in precise detail what you are trying to achieve? To state the obvious first: a ClusterRole is, by definition, not scoped to a namespace.

On top of that, a cluster-scoped object like a ClusterRole can be shared by multiple instances of the controller, as documented here: https://kubernetes.github.io/ingress-nginx/#how-to-easily-install-multiple-instances-of-the-ingress-nginx-controller-in-the-same-cluster

ohauer commented 2 years ago

This will work; however, for automation it is a nightmare to maintain 40+ dedicated ingress installations per cluster with the solution from the documentation, plus maintaining the values files, since the parameters are not declarative (installation with Argo from a git repo). With the simple fix it is possible to write one umbrella chart (with the values from above) and apply it to several namespaces without any issues.

longwuyuan commented 2 years ago

I am not convinced that this is a problem of the controller that needs a fix. That example shows parameters that can go into a values file too; that way you can just maintain 40+ values files. On a different note, 40 instances of the controller on a single cluster implies 40 external IP addresses. It also usually means idling controller pods, though that may not be the case for you; that detail is missing in your posts. As far as distribution of load is concerned, there are reports of 2000+ Ingresses in one cluster running on a single controller with more than one replica. Namespace-based isolation for tenancy can be debated.

ohauer commented 2 years ago

Sorry for the late response, but I was busy with other things over the last few days.

You are right; this is not a change for the controller itself but for the Helm chart.

Let me describe our environment in a little more detail:

Scenario one: one central ingress controller per cluster as a DaemonSet or ReplicaSet, with DNS via wildcard records or the external-dns plugin. This scenario worked, but not well: it was inflexible and constantly required DNS adjustments whenever nodes were scaled or replaced, or the ingress was not bound to node labels. Using the external-dns plugin was not possible because we often experienced, on clients and "secondary DNS" resolvers, negative TTLs set for entries replaced during the DNS refresh.

Scenario two: over the last few evenings I tested the proposed setup on a smaller cluster with 3 out of 52 namespaces, installing the ingress controller as described in the documentation: https://kubernetes.github.io/ingress-nginx/#how-to-easily-install-multiple-instances-of-the-ingress-nginx-controller-in-the-same-cluster

  helm template ingress-<ns> charts/ingress-nginx/  \
  --namespace ns1 \
  --set controller.ingressClassResource.name=nginx-<ns> \
  --set controller.ingressClassResource.controllerValue="<ns>/ingress-nginx" \
  --set controller.ingressClassResource.enabled=true \
  --set controller.ingressClassByName=true 

With this we observed the following: every ingress controller has access to all Ingress resources and polls them at short intervals to find out which ones it should serve. It looks like the controller does not filter by ingressClass before querying the API server, so it constantly retrieves a list of all Ingresses regardless of class. With 3 of 52 namespaces used in the test, we already saw an increase in queries and higher load on etcd. As a second experiment I enabled the validating webhook for these 3 namespaces, with the result that the ingress controller died / stopped working within a short time (I have done no further inspection).
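
For reference, a minimal sketch of the namespace scoping via the chart's controller.scope settings (assuming controller.scope.namespace falls back to the release namespace when unset):

controller:
  scope:
    enabled: true      # watch a single namespace instead of the whole cluster
    # namespace: ns1   # optional; assumed to default to the release namespace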

Scenario three (what we have been using for a long time): install the IngressClass and PSP once per cluster, then install an ingress controller per namespace with scope enabled, the values from https://github.com/kubernetes/ingress-nginx/issues/8936#issuecomment-1219522366, and the proposed patch https://github.com/kubernetes/ingress-nginx/pull/8937. With this we have maximum silence on the ingress controllers and nearly zero issues with negative DNS TTLs (perhaps once a quarter).
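
With the patch applied, the cluster-scoped names rendered for each namespace differ only by their namespace suffix; for example, a render into namespace ns3 with the values above would contain something like (illustrative, given fullnameOverride: ingress-nginx):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: ingress-nginx-ns3   # fullname plus release namespace; unique per namespace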

longwuyuan commented 2 years ago

Hi @ohauer, thanks for your patience in helping simplify the problem you are facing. To begin fresh: you said ClusterRoles are overwritten, so here is a simple test showing that ClusterRoles are not overwritten and that a new ClusterRole is created:

% k get ns                                                                                                                                                     
NAME              STATUS   AGE                                                                                                                                 
default           Active   8d                                                                                                                                  
ingress-nginx     Active   8d                                                                                                                                  
kube-node-lease   Active   8d                                                                                                                                  
kube-public       Active   8d
kube-system       Active   8d
metallb-system    Active   8d
[~]                                                                                                                                                            
% helm ls -A                                                                                                                                                   
NAME            NAMESPACE       REVISION        UPDATED                                 STATUS          CHART                   APP VERSION                                                   
ingress-nginx   ingress-nginx   2               2022-08-23 13:07:12.484598036 +0530 IST deployed        ingress-nginx-4.2.3     1.3.0                                                         
[~]                                                                                                                                                            
% k get clusterrole | grep ingres
ingress-nginx                                                          2022-08-17T06:38:40Z                                                                                                   
[~]                              
% kubectl create namespace ingress-nginx-2                                                     
namespace/ingress-nginx-2 created                                              
[~]                          
% helm install ingress-nginx-2 ingress-nginx/ingress-nginx  \
--namespace ingress-nginx-2 \                                                                                                                                  
--set controller.ingressClassResource.name=nginx-two \
--set controller.ingressClassResource.controllerValue="example.com/ingress-nginx-2" \                                                                                                         
--set controller.ingressClassResource.enabled=true \                                           
--set controller.ingressClassByName=true                                                       

NAME: ingress-nginx-2     
LAST DEPLOYED: Thu Aug 25 13:05:20 2022                                                        
NAMESPACE: ingress-nginx-2
STATUS: deployed
REVISION: 1
TEST SUITE: None                                                               
NOTES:                                                                         
The ingress-nginx controller has been installed.                                                                                                               
It may take a few minutes for the LoadBalancer IP to be available.                             
You can watch the status by running 'kubectl --namespace ingress-nginx-2 get services -o wide -w ingress-nginx-2-controller'                                                                  

An example Ingress that makes use of the controller:                                           
  apiVersion: networking.k8s.io/v1
  kind: Ingress  
  metadata:       
    name: example
    namespace: foo             
  spec:   
    ingressClassName: nginx-two
    rules:   
      - host: www.example.com
        http:                 
          paths:      
            - pathType: Prefix
              backend:                
                service:
                  name: exampleService
                  port:
                    number: 80                                                 
              path: /
    # This section is only required if TLS is to be enabled for the Ingress                    
    tls:                 
      - hosts:                 
        - www.example.com                      
        secretName: example-tls                                                                                                                                

If TLS is enabled for the Ingress, a Secret containing the certificate and key must also be provided:                                                                                         

  apiVersion: v1
  kind: Secret       
  metadata:       
    name: example-tls
    namespace: foo                
  data:                          
    tls.crt: <base64 encoded cert>
    tls.key: <base64 encoded key>
  type: kubernetes.io/tls                                                                                                                                      
[~]                                                                                                                                                            
% k get clusterrole | grep ingres                                                                                                                                                             
ingress-nginx                                                          2022-08-17T06:38:40Z                                                                                                   
ingress-nginx-2                                                        2022-08-25T07:35:42Z           
Also, let me show you that there are now two Services of type LoadBalancer:

% k get svc -A | grep -i load
ingress-nginx-2   ingress-nginx-2-controller   LoadBalancer   10.109.246.93   192.168.39.26   80:31882/TCP,443:30752/TCP   25m
ingress-nginx     ingress-nginx-controller     LoadBalancer   10.99.171.252   192.168.39.25   80:32203/TCP,443:31450/TCP   8d


After some time, I will post the test of creating two Deployments and exposing each one with an Ingress object that uses a different ingress class.

But I suggest you do the same and post your findings.

ohauer commented 2 years ago

@longwuyuan please give me some time; I will do some additional tests during the next week. One quick test I did a few minutes ago with the proposed parameters ended in 2800+ log entries on that ingress controller within 5 seconds; setting the additional flags

--set controller.admissionWebhooks.enabled=false \
--set controller.scope.enabled=true \

helped here.

I will see how I can bring this into a few namespaces, but I already know this will end in endless discussions with the architecture teams and the developers ;(

longwuyuan commented 2 years ago

I want to help any way I can. So far I think you understand the problem details well; that information just needs to be converted into a clear and precise description of what the bug/problem in the controller is. Once that bug/problem is known, talks can happen about whether it needs to be fixed, and if yes, when.

But as far as new features are concerned, the project got out of hand enough for us to initiate a long-term stabilization effort. So adding a new feature for only your use case and then supporting it long-term is not feasible until after January 2023.

And if it is about namespace scoping, we already know it is work in progress but will likely get done only after January 2023.

philomory commented 2 years ago

For our part, simply the ability to override the name of the ClusterRole and ClusterRoleBinding objects independently of the chart "fullname" field would be nice.

longwuyuan commented 2 years ago

@philomory Then you have to comment on the info I have provided. As you can see above, I posted this:

% k get clusterrole | grep ingres                                                                                                                                                             
ingress-nginx                                                          2022-08-17T06:38:40Z                                                                                                   
ingress-nginx-2                                                        2022-08-25T07:35:42Z           

This shows that the ClusterRole names are not fixed to just ingress-nginx, and that installing the second instance of the ingress-nginx controller as per the link provided earlier already created a ClusterRole with a different name. So the ability you mentioned already exists, even at this moment.

philomory commented 2 years ago

@longwuyuan Right, but that requires changing the name of every single resource in the entire chart. The deployment is named ingress-nginx-2-controller, the service is named ingress-nginx-2-controller, etc. I'd like to override just the name of the ClusterRole and ClusterRoleBinding. The other resources don't need weird names with -2 in them, because only the ClusterRole and ClusterRoleBinding are cluster-wide resources, everything else is namespaced (well, the IngressClass is also a cluster-wide resource, but we can already customize the name of the IngressClass independently of the release name).
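
As an illustration, such an override might look roughly like this in a values file (hypothetical keys; the chart does not expose them today):

fullnameOverride: ingress-nginx              # namespaced resources keep the plain name
rbac:
  create: true
  # hypothetical keys, for illustration only:
  clusterRoleNameOverride: ingress-nginx-ns1
  clusterRoleBindingNameOverride: ingress-nginx-ns1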

longwuyuan commented 2 years ago

@philomory, thank you very much. I think this last update from you is the appropriate description of this issue; it amounts to a feature that will help you a lot in your use case.

Need to wait for developer time on this.

philomory commented 2 years ago

This is especially relevant with the ClusterRoleBinding, because it's a cluster-wide resource that is essentially namespace-specific (insofar as the ClusterRoleBinding created by the ingress-nginx chart binds the ClusterRole to a ServiceAccount in a specific namespace).
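
A sketch of the rendered object makes this concrete (assuming the default fullname ingress-nginx and a release in namespace ns1):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ingress-nginx          # cluster-scoped: a second release using the same
                               # fullname in another namespace collides here
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ingress-nginx
subjects:
  - kind: ServiceAccount
    name: ingress-nginx
    namespace: ns1             # the only per-release difference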

longwuyuan commented 2 years ago

/retitle feature allow change for only clusterrole name

ThumbGen commented 2 years ago

Would this be prioritized and implemented at some point? It would massively help us deploy different versions of our (umbrella) chart on the same cluster, in different namespaces.

ali-idrizi commented 2 years ago

Came across this today as well. I am using skaffold to release into different namespaces, and the name of the release is always ingress-nginx. The chart then tries to create two ClusterRoles with the same name, which fails. As a workaround I am simply setting nameOverride to the name of the namespace, but that is ugly, as @philomory explained above.
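
For reference, the workaround is roughly this per-namespace values file (ns1 is a placeholder):

# workaround: renames every resource in the release, not only the cluster-scoped ones
nameOverride: ns1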

longwuyuan commented 2 months ago

The link provided earlier explains how to install multiple instances of the controller in one cluster without ArgoCD. This works, as it has been tested.

So this issue is mostly related to the integration with ArgoCD and not a problem or bug in the controller per se. The project is moving towards removing less-used features and an increased focus on implementing, and complying with, the K8S KEP Ingress API specs.

Maybe ArgoCD can find a way to install multiple instances of the controller following the same flow as the manual helm install process. I will close this issue for now, as it is just adding to the tally of open issues without any traction and with hardly any scope for progressing to a change/fix in the controller. If data is discovered that shows a bug or a problem in the controller, please provide it here before re-opening this issue. Thanks.

/close

k8s-ci-robot commented 2 months ago

@longwuyuan: Closing this issue.

In response to [this](https://github.com/kubernetes/ingress-nginx/issues/8936#issuecomment-2327990447).

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.