operate-first / apps

Operate-first application manifests
GNU General Public License v3.0
51 stars 137 forks source link

Adding route TLS secret to pulp #2662

Closed fao89 closed 1 year ago

fao89 commented 1 year ago

trying ingress to see if the cert works

fao89 commented 1 year ago

/assign @harshad16 /assign @tumido

sesheta commented 1 year ago

Hi @fao89. Thanks for your PR.

I'm waiting for a operate-first member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
harshad16 commented 1 year ago

This will create ingress sets for all the paths, right?

harshad16 commented 1 year ago

/ok-to-test

tumido commented 1 year ago

/hold

fao89 commented 1 year ago

This will create ingress sets for all the paths, right?

yes, one ingress with all paths

tumido commented 1 year ago

So.. let me explain this whole thing, I'll start with the maybe very obvious stuff, so feel free to ignore it if it's too boring and skip ahead. :slightly_smiling_face:

Exposing things via Routes as you did previously created a Route resource per each path. This works fine until you want to secure this via certs. We have a acme-operator deployed on our clusters which can help with cert provisioning. However there are few catches of this operator:

  1. It acts per each Route object with kubernetes.io/tls-acme: "true" annotation. That means the operator requests a new certificate per each Route object, no matter if it's targeting the same host or not. In this case, when Pulp generates many Routes, acme-operator tries to order a new certificate for each. This results in us hitting rate limits on Let's Encrypt side. The limit for the same domain is 5 per week, which is essentially blocking us from properly resolving this for few more days anyways. Since all the Routes generated by Pulp use the same hostname and differ in path, it can also result in interesting troubles when endpoint /a would use different cert than endpoint /b.
  2. acme-operator can act upon Route object only. There's no support for Ingress in the controller (despite outlining this in their arch docs). Their readme states it may support it eventually, but there's no active development of this controller. So, adding the kubernetes.io/tls-acme: "true" annotation would not result in any certificates being requested.

However, we also provide one additional and different method for requesting a certificate - cert-manager. This operator is more kubernetes native than the openshift specific acme-operator and takes a more sensible approach to certificates. It can help you generate a Secret with the certificate using the following pattern:

  1. Since you don't need wildcard certs, we can use the ACME http01 challenge, that simplifies things for us.

  2. We either create a cluster-scoped ClusterIssuer for everybody to consume or use just a namespace scoped Issuer. This resource would look like this:

    apiVersion: cert-manager.io/v1
    kind: Issuer
    metadata:
     name: letsencrypt
    spec:
     acme:
       email: ops-team@operate-first.cloud
       privateKeySecretRef:
         name: letsencrypt-key
       server: 'https://acme-v02.api.letsencrypt.org/directory'
     solvers:
       - http01:
           ingress:
             serviceType: ClusterIP
  3. Then we can simply request a Certificate from Let's Encrypt like this:

    apiVersion: cert-manager.io/v1
    kind: Certificate
    metadata:
     name: pulp
    spec:
     dnsNames:
       - pulp.operate-first.cloud
     issuerRef:
       name: letsencrypt
     secretName: pulp-letsencrypt   
  4. This should result in pulp-letsencrypt Secret being created (or whatever .spec.secretName is set in the Certificate resource.

  5. Then you should be able to mount this Secret into your Ingress:

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
     name: pulp
    spec:
     ...
     tls:
     - hosts:
       - pulp.operate-first.cloud
       secretName: pulp-letsencrypt

However as I'm trying this our right now, it seems the challenge pods are not spawning because of: https://cert-manager.io/docs/release-notes/release-notes-1.10/#on-openshift-the-cert-manager-pods-may-fail-until-you-modify-security-context-constraints

Therefore we may need to use dns01 challenge instead. I'll try setting that up tomorrow (requires a service account from DNS provider). In that scenario you'd still need to do the steps 3.-5. outlined above (with slight modifications).

There is some development on the OCP side going on in order to better integrate Cert Manamager. We should look into that later... It may enable us to use ingress shims (provision certs directly from the Ingress resource via annotations. This doesn't work for me currently).

harshad16 commented 1 year ago

Thank you @tumido , for taking the time and providing a thorough explanation.

/remove-approve /remove-lgtm

fao89 commented 1 year ago

@tumido thanks for the explanation! I'll modify this PR to mount the letsencrypt secret into the routes then https://github.com/pulp/pulp-operator/pull/801 https://docs.pulpproject.org/pulp_operator/configuring/routes/

sesheta commented 1 year ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tumido

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/operate-first/apps/blob/master/OWNERS)~~ [tumido] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment