algosup / 2022-2023-project-2-santa-time-Project-5-group

Ingress deployment issue #78

Closed Guillaume-Riviere closed 1 year ago

Guillaume-Riviere commented 1 year ago

Describe the bug

The Ingress deployment fails because of a cert-manager error.

To Reproduce

Steps to reproduce the behavior:

  1. Run the workflow

Expected behavior

Should deploy the app and activate SSL on the cluster

Area

Server / Cluster / SSL / Docker

Priority

Critical

Machine

/

Screenshots

/

Guillaume-Riviere commented 1 year ago

@tondrejk I saw you starred our repo, could you help us with this? :D

tomondre commented 1 year ago

What ingress controller do you use in the cluster? :smile:

tomondre commented 1 year ago

You can try this: https://kubernetes.io/docs/concepts/services-networking/ingress/#deprecated-annotation

PaulMarisOUMary commented 1 year ago

We are using https://github.com/algosup/2022-2023-project-2-santa-time-Project-5-group/blob/ced7d5b6e61e4c337f1eb3a6c7a563a5b4570cee/.k8s/xmas-ingress.yml#L1-L27

tomondre commented 1 year ago

Try to remove the kubernetes.io/ingress.class: "nginx" annotation and redeploy :smile:

PaulMarisOUMary commented 1 year ago

The ingress is meant to use cert-manager so that we can serve the site over HTTPS. We're a bit lost about it.

tomondre commented 1 year ago

I will try to look into it after work today :smile: . So the current error that is thrown in the pipeline is: Error from server (InternalError): error when creating "/tmp/xmas-ingress.yml": Internal error occurred: failed calling webhook. Could you shell into one of your containers and run nslookup quickstart-ingress-nginx-controller-admission.default.svc? For now it looks like that service called quickstart-ingress-nginx-controller-admission in default namespace either does not exist or cannot be reached.
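The check suggested above could be sketched roughly as follows. This is only an illustration: the helper names `service_dns_name` and `check_service_dns`, the pod name `dns-check`, and the `busybox` image are assumptions, not something taken from the repository.

```shell
# Build the short in-cluster DNS name of a Service: <service>.<namespace>.svc
service_dns_name() {
  printf '%s.%s.svc\n' "$1" "$2"
}

# Resolve that name from inside the cluster using a throwaway pod.
# The pod name "dns-check" and the busybox image are assumptions.
check_service_dns() {
  kubectl run dns-check --rm -i --restart=Never --image=busybox -- \
    nslookup "$(service_dns_name "$1" "$2")"
}

# Example (not run here):
#   kubectl get svc quickstart-ingress-nginx-controller-admission --namespace default
#   check_service_dns quickstart-ingress-nginx-controller-admission default
```

If `kubectl get svc` reports that the service is not found, the lookup will fail as well, which would match the webhook error in the pipeline.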

tomondre commented 1 year ago

There is a difference between an Ingress controller and cert-manager. The Ingress controller forwards traffic to your services and pods, while cert-manager is typically used to issue and renew the TLS certificates that the ingress (here nginx) uses to terminate SSL. You can check out this article to get an understanding of nginx with cert-manager: https://www.digitalocean.com/community/tutorials/how-to-set-up-an-nginx-ingress-with-cert-manager-on-digitalocean-kubernetes. But I would advise you to deploy the resources via a Helm chart so that you don't need to set up every single resource manually; the Helm chart creates them all in one deploy.

tomondre commented 1 year ago

I believe that the Nginx ingress controller is not installed in your cluster. This is a prerequisite for the ingress resource to work, so I would advise you to follow a tutorial on how to set it up. E.g.: https://spacelift.io/blog/kubernetes-ingress

PaulMarisOUMary commented 1 year ago

Thanks. Following the DigitalOcean tutorial on nginx-ingress you sent, I'm editing our manifests in .k8s/

The controller seems to be created during the deployment.

I have an additional constraint, however: we're using Azure's automated deployment, and we deploy using only our manifests inside .k8s/

image

tomondre commented 1 year ago

This looks good. Previously the error was service "quickstart-ingress-nginx-controller-admission" not found, and now it is: no endpoints available for service "ingress-nginx-controller-admission". This Stack Overflow question may help: https://stackoverflow.com/questions/61365202/nginx-ingress-service-ingress-nginx-controller-admission-not-found. By the way, I have a question: if the only thing you want to do is expose the service, wouldn't it be better to use a Service of type LoadBalancer? I see that you already have one. That means you should see a load balancer created in your Azure account that you can use to access the xmas service. Can you see the Load Balancer in your account?

PaulMarisOUMary commented 1 year ago

I'm thinking about starting over. My workflow on Azure had a different naming style, and when I changed many things in the (forked) xmas-ressource.yml, it broke a few things.

By starting over I could change the naming style of my workflow on Azure, so I won't need to change the forked xmas-ressource.yml. I'll do it Soon™ (asap, I hope this evening)

tomondre commented 1 year ago

Okay, let me know if there is something that I can help with. Kubernetes has a super steep learning curve but you won't regret the journey :smile:

PaulMarisOUMary commented 1 year ago

Thank you, indeed it seems really interesting, I can't wait to master the concept

To be honest, the "only reason" we're using the ingress is to finally be able to use https on our website.

I have "access", meaning I have a status report and can see a few details and information about: LoadBalancer

Deployment

ClusterIP

tomondre commented 1 year ago

If the only requirement is to expose the service via HTTPS, I would advise you to follow this guide. It does not involve creating an Ingress resource or an Ingress controller. It is the simplest way to expose a service. https://learn.microsoft.com/en-us/azure/aks/load-balancer-standard

tomondre commented 1 year ago

When you set Service type to LoadBalancer, a new Load Balancer will be provisioned in your Azure account that will be able to direct the traffic to your pods.

PaulMarisOUMary commented 1 year ago

Thank you for the hint. The major issue I'm encountering is that I'm not able to use the Azure CLI and connect to my kubectl environment to send commands.

I feel I'm just about to deploy everything the right way using the tutorial from DigitalOcean. UPDATE: I managed to fix the webhook error.

Webhook error (fixed)

It seems the webhook has the wrong URL:

- Expected: `https://ingress-nginx-controller-admission.k8s-xmas-workflow.svc/networking/v1/ingresses?timeout=10s`
- Current: `https://k8s-xmas-workflow-controller-admission.k8s-xmas-workflow.svc/networking/v1/ingresses?timeout=10s`

From the [workflow error](https://github.com/algosup/2022-2023-project-2-santa-time-Project-5-group/actions/runs/3693191130/jobs/6253051599#step:5:120):

```bash
Error from server (InternalError): error when creating "/tmp/xmas-ingress.yml": Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": failed to call webhook: Post "https://k8s-xmas-workflow-controller-admission.k8s-xmas-workflow.svc:443/networking/v1/ingresses?timeout=10s": service "k8s-xmas-workflow-controller-admission" not found
```

Do you know a way to change the webhook URL in [.k8s/xmas-ressource.yml](https://github.com/algosup/2022-2023-project-2-santa-time-Project-5-group/blob/main/.k8s/xmas-ressource.yml)?

Fixed using [this solution](https://stackoverflow.com/a/64872084/14327609):

`$ kubectl get validatingwebhookconfigurations`

```
NAME                                 WEBHOOKS   AGE
aks-node-validating-webhook          1          15d
ingress-nginx-admission              1          15h
k8s-xmas-workflow-admission          1          15h
nginx-ingress-nginx-admission        1          12h
quickstart-ingress-nginx-admission   1          42h
```

`$ kubectl delete validatingwebhookconfigurations [configuration-name]`

tomondre commented 1 year ago

Cool! So what is the current state of the issue? Is it fixed?

PaulMarisOUMary commented 1 year ago

Well, I followed Step 4 of the DigitalOcean tutorial.

The workflow now raises this error:

the namespace from the provided object "cert-manager" does not match the namespace "ingress-nginx". You must pass '--namespace=cert-manager' to perform this operation.

The error is understandable, but I don't know how to change the context when the workflow tries to apply the .k8s/xmas-cert.yml manifest. I need to change the namespace during the "apply" of this specific manifest.
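One possible way to handle this, sketched under assumptions: read the namespace declared in the manifest and pass it explicitly to `kubectl apply`, as the error message suggests. The helper names `manifest_namespace` and `apply_in_own_namespace` are hypothetical, and the `awk` lookup is deliberately naive (it takes the first unquoted `namespace:` value it finds, so a multi-document manifest mixing several namespaces would need more care).

```shell
# Print the first "namespace:" value declared in a manifest, if any.
# Naive on purpose: assumes an unquoted value on the same line.
manifest_namespace() {
  awk '$1 == "namespace:" { print $2; exit }' "$1"
}

# Apply a manifest, passing its declared namespace explicitly when it has one.
apply_in_own_namespace() {
  ns="$(manifest_namespace "$1")"
  if [ -n "$ns" ]; then
    kubectl apply --namespace="$ns" -f "$1"
  else
    kubectl apply -f "$1"
  fi
}

# Example (not run here):
#   apply_in_own_namespace .k8s/xmas-cert.yml
```

Whether the automated Azure workflow allows a per-manifest namespace is a separate question; the sketch only shows the equivalent manual command.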

Otherwise (and I think I'm going this way) I would need to access it through the CLI. UPDATE: I managed to fix my CLI issue.

CLI issue (fixed)

For some reason, using my account, it can't find the Azure ResourceGroup we're sharing among us.

Using:

```
$ az login
$ az aks get-credentials --name SantaKubernetes --resource-group prj5
(ResourceGroupNotFound) Resource group 'prj5' could not be found.
Code: ResourceGroupNotFound
Message: Resource group 'prj5' could not be found.
```

Fixed with:

```bash
# 1. Recreate a Cloud Shell from scratch on Azure
#    (if you have already created a shell, reset your User settings)
# 2. Go to advanced settings
# 3. Select the right Resource and Subscription
# 4. Type the following
$ az account set --subscription [subscription_id]
$ az aks get-credentials --resource-group [resource_name] --name [cluster_name]
Merged "cluster_name" as current context in /home/user/.kube/config
$ kubectl get svc --namespace [namespace]
NAME                                 TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)                      AGE
ingress-nginx-controller             LoadBalancer   10.0.217.209   20.xx.xx.xx   80:31321/TCP,443:30491/TCP   5h10m
ingress-nginx-controller-admission   ClusterIP      10.0.179.64    <none>        443/TCP                      5h10m
xmas-service                         LoadBalancer   10.0.42.155    20.xx.xx.xx   80:30486/TCP                 5h8m
```

I'm now finally able to use the CLI 🎉

PaulMarisOUMary commented 1 year ago

Everything is configured now.

Services: image

Ingresses: image

We're not linked yet to xmas.algosup.com so I can't make sure everything is working properly.

Anyway, do you think the 404 error on 20.81.0.63 is normal? Could it be because the DNS is not yet linked to our IP address? @tomondre

tomondre commented 1 year ago

404 is most likely returned from nginx, which is a good sign. There are two possibilities of why the 404 is returned:

  1. The pod that you have deployed does not have any site configured on the url that you have called (for example "/")
  2. The ingress is not configured correctly - either the host is wrong or the service that you proxy the traffic to does not exist.

tomondre commented 1 year ago

In the ingress you have configured `- host: xmas.algosup.com`. That means that you need to call the load balancer via this DNS name: you should create a xmas.algosup.com CNAME record whose value is the DNS record of your load balancer.
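Until the DNS record exists, the host rule can still be exercised by sending the expected `Host` header straight to the load balancer IP. A minimal sketch, assuming plain HTTP on port 80 and using hypothetical helper names (`ingress_url`, `probe_ingress`):

```shell
# Build the URL to request on the load balancer: http://<ip><path>
ingress_url() {
  printf 'http://%s%s\n' "$1" "${2:-/}"
}

# Print the HTTP status code returned when the request carries the Ingress host.
# $1 = host configured in the Ingress, $2 = external IP, $3 = path (default "/")
probe_ingress() {
  curl --silent --show-error --output /dev/null --write-out '%{http_code}\n' \
    --header "Host: $1" "$(ingress_url "$2" "$3")"
}

# Example (not run here):
#   probe_ingress xmas.algosup.com 20.81.0.63 /
```

If the status changes from 404 to 200 once the header is present, the host rule was most likely the cause; if it stays 404, the backend service or path is the more probable suspect.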

PaulMarisOUMary commented 1 year ago

Since I'm not the owner of xmas.algosup.com and I need to request the DNS change, I changed the host to xmas.warn.page (a domain name that I own) so I'll be able to test more often.

Right now I've changed the host: https://github.com/algosup/2022-2023-project-2-santa-time-Project-5-group/blob/a1f4f7fff52345ba999bd752efac020a6ad750e7/.k8s/xmas-ingress.yml#L10-L14

And here are the DNS records: image

I assumed that the External IP of my LoadBalancer was the right one to link with the CNAME record, but I'm not sure. image

tomondre commented 1 year ago

A CNAME record points to another DNS name; it cannot point to an IP address. To make the xmas.warn.page record correct, use an A record for xmas.warn.page pointing to 20.81.0.63
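The rule can be illustrated with a rough helper that picks the record type from the target value. This is only a sketch: `record_type_for` is a hypothetical name, and it classifies by character set rather than validating the address.

```shell
# Suggest a DNS record type for a target value:
#   contains ":"            -> AAAA (IPv6 address)
#   only digits and dots    -> A    (IPv4 address)
#   anything else           -> CNAME (another DNS name)
record_type_for() {
  case "$1" in
    '')        return 1 ;;
    *:*)       echo "AAAA" ;;
    *[!0-9.]*) echo "CNAME" ;;
    *)         echo "A" ;;
  esac
}

# Example:
#   record_type_for 20.81.0.63        # prints A
#   record_type_for lb.example.com    # prints CNAME (lb.example.com is a placeholder)
# Once the record is published it can be checked with (not run here):
#   dig +short A xmas.warn.page
```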

tomondre commented 1 year ago

After you do that, you should get the same result from both records: xmas.warn.page, xmas.algosup.com

PaulMarisOUMary commented 1 year ago

First big step forward! 🎉

xmas.algosup.com is now available through port 443 ! Thank you so much 😁

Now I need to deal with cert-manager

tomondre commented 1 year ago

Now you can call https://xmas.warn.page/ and also get a response :). You also need to set up a certificate for it if you want the correct certificate to be served :smile:

PaulMarisOUMary commented 1 year ago

Thanks for your help, on behalf of the entire team 😃

I also want to apologize for the way we have deployed our solution, and for our bad practices. I'm 200% aware of it, and I would especially have loved real continuous integration and, above all, preview deployments on pull requests to avoid errors on main. We were in a rush, we'll do better next time. 🚀