ansible / awx-operator

An Ansible AWX operator for Kubernetes built with Operator SDK and Ansible. 🤖
https://www.github.com/ansible/awx
Apache License 2.0

LDAP Certificate not loading #649

Open rrobe53 opened 2 years ago

rrobe53 commented 2 years ago

I have a .crt file for an internal CA that I can use against an internal resource with curl --cafile ca.crt ldaps://xyz:636, and the certificate verifies correctly. However, after adding it as a secret and referencing it in the manifest, AWX continues to show errors. I'm running this in k3s.

Creating the secret:

kubectl -n awx create secret generic awx-custom-certs --from-file=ldap-ca.crt=./ca.crt --from-file=bundle-ca.crt=./ca.crt

secret/awx-custom-certs created

Basic deploy

---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
  namespace: awx
spec:
  service_type: nodeport
  ldap_cacert_secret: awx-custom-certs
  bundle_cacert_secret: awx-custom-certs
  hostname: xxx
  web_resource_requirements: {}
  ee_resource_requirements: {}
  task_resource_requirements: {}

After deploy I see this in the manager logs, no errors:

kubectl -n awx logs deployments/awx-operator-controller-manager -c manager

PLAY RECAP *********************************************************************
localhost                  : ok=58   changed=0    unreachable=0    failed=0    skipped=37   rescued=0    ignored=0

The awx-custom-certs secret is in the same namespace, which is again expected since I didn't get any errors from the operator.

kubectl -n awx get awx,all,ingress,secrets,persistentvolume

NAME                      AGE
awx.awx.ansible.com/awx   5m55s

NAME                                                   READY   STATUS    RESTARTS   AGE
pod/awx-operator-controller-manager-68d787cfbd-fnv9c   2/2     Running   0          7m5s
pod/awx-postgres-0                                     1/1     Running   0          5m36s
pod/awx-559fcd895-tfxl9                                4/4     Running   0          5m27s

NAME                                                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
service/awx-operator-controller-manager-metrics-service   ClusterIP   10.43.6.215     <none>        8443/TCP       7m5s
service/awx-postgres                                      ClusterIP   None            <none>        5432/TCP       5m36s
service/awx-service                                       NodePort    10.43.230.186   <none>        80:30098/TCP   5m29s

NAME                                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/awx-operator-controller-manager   1/1     1            1           7m5s
deployment.apps/awx                               1/1     1            1           5m27s

NAME                                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/awx-operator-controller-manager-68d787cfbd   1         1         1       7m5s
replicaset.apps/awx-559fcd895                                1         1         1       5m27s

NAME                            READY   AGE
statefulset.apps/awx-postgres   1/1     5m36s

NAME                                                 TYPE                                  DATA   AGE
secret/default-token-bdtcs                           kubernetes.io/service-account-token   3      7m5s
secret/awx-operator-controller-manager-token-t4bcn   kubernetes.io/service-account-token   3      7m5s
secret/awx-custom-certs                              Opaque                                2      6m26s
secret/awx-app-credentials                           Opaque                                3      5m32s
secret/awx-token-qtdrv                               kubernetes.io/service-account-token   3      5m31s
secret/awx-admin-password                            Opaque                                1      5m45s
secret/awx-secret-key                                Opaque                                1      5m49s
secret/awx-postgres-configuration                    Opaque                                6      5m38s
secret/awx-broadcast-websocket                       Opaque                                1      5m42s

NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                         STORAGECLASS   REASON   AGE
persistentvolume/pvc-a88f849b-93e3-4014-8a6b-1b4e63762bc0   8Gi        RWO            Delete           Bound    awx/postgres-awx-postgres-0   local-path              5m34s

Yet LDAPS still doesn't function

kubectl -n awx logs awx-559fcd895-tfxl9 -c awx-web

2021-11-11 13:34:12,888 WARNING  [e1a9950bf56e4dac94410d3c1a42a4aa] django_auth_ldap Caught LDAPError while authenticating xxxx: SERVER_DOWN({'result': -1, 'desc': "Can't contact LDAP server", 'ctrls': [], 'info': 'error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed (unable to get issuer certificate)'})

chrismeyersfsu commented 2 years ago

@rrobe53 can you try asking on the mailing list too and report back here if you find the answer?

rrobe53 commented 2 years ago

Added it to the mailing list.

rrobe53 commented 2 years ago

I also duplicated this test on my Mac using Minikube and hyperkit with the same results, to take K3S out of the equation.

hungtran84 commented 2 years ago

Could you share your LDAP configuration? And could you also specify the versions of the operator and AWX that you are using? I got the LDAP setup working via extra settings with AWX 19.4.0 and an operator built from my PR.

You can check out my branch in this PR for reference. It worked on my Azure AKS cluster with the LDAP CA cert synced from Azure Key Vault.

https://github.com/ansible/awx-operator/pull/659

rrobe53 commented 2 years ago

I used AWX 19.4.0 and operator 0.14.0 in the initial test, and just tried again in the k3s environment with 19.5.0 and 0.15.0, with the same result.

I'm using the same LDAP setup that's working on a much older version (8.0.0). I'm not applying it with extra settings, just the bare deploy above. I'm able to get the LDAP failure message by configuring essentially nothing but the LDAP server name in the LDAP settings. However I've copied everything else. I use the same ca.crt in curl and https://ldapserver:636 and it works (past the ssl handshake at least).

forzamehlano commented 2 years ago

I had a very similar issue with CA certs. I had to provide the entire CA chain as the input to the secrets. If I provided just the CA cert, I got the same error as you. When I provided both the CA and the root CA certificate, things magically started to work.
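A minimal sketch of assembling such a chain file in shell, for anyone hitting the same "unable to get issuer certificate" error. The helper names (build_chain, count_certs) and the input file names in the usage comment are hypothetical; only ldap-ca.crt and bundle-ca.crt as secret keys come from the original post.

```shell
#!/bin/sh
# Concatenate PEM files into a single bundle, in the order given.
# Usage: build_chain OUTPUT INPUT...
build_chain() {
    out=$1
    shift
    : > "$out"                      # start from an empty output file
    for pem in "$@"; do
        cat "$pem" >> "$out"
        # If the input lacks a trailing newline, add one so that the
        # END line of one cert does not fuse with the next BEGIN line.
        if [ -n "$(tail -c 1 "$pem")" ]; then
            echo >> "$out"
        fi
    done
}

# Print how many certificates a PEM file contains.
count_certs() {
    grep -c -- '-----BEGIN CERTIFICATE-----' "$1"
}

# Example (hypothetical file names):
#   build_chain chain.crt root-ca.crt intermediate-ca.crt
#   count_certs chain.crt
```

If count_certs reports 1 for a file that is supposed to hold a full chain, the intermediate or root certificate is probably missing.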

f22l2 commented 2 years ago

@rrobe53 you've provided a lot of information, but I cannot see the results from inside the container. Are the certs properly propagated? I've reproduced this on k3s: the certificate is not getting properly updated in the container, and that leads to the 'unable to verify the first certificate' issue and, when trying to use ldaps, to: SERVER_DOWN({'result': -1, 'desc': "Can't contact LDAP server", 'ctrls': [],
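One way to check propagation is to read the certificate file from inside the running container and count the certificates in it. This is only a sketch: certs_in_container is a hypothetical helper, and the mount path in the usage comment (/etc/openldap/certs/ldap-ca.crt) is an assumption about where the operator places the LDAP CA, so verify it for your operator version. The deployment name awx and container name awx-web are taken from the output earlier in this thread.

```shell
#!/bin/sh
# Count the PEM certificates in a file inside a running container.
# Usage: certs_in_container NAMESPACE TARGET CONTAINER PATH
#   TARGET is a pod name or a reference such as deployment/awx.
certs_in_container() {
    kubectl -n "$1" exec "$2" -c "$3" -- cat "$4" \
        | grep -c -- '-----BEGIN CERTIFICATE-----'
}

# Example (the path is an assumption, see above):
#   certs_in_container awx deployment/awx awx-web /etc/openldap/certs/ldap-ca.crt
```

Comparing this count with the number of certificates in the local file used to create the secret shows whether the container sees the same content.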

marcinsiembida commented 22 hours ago

Do you have a solution, @rrobe53? I know the issue is old, but it is not marked as done.

EDIT: I found a solution. Maybe it is simple, but it wasn't easy to find. To make LDAPS work, besides setting the right URL (ldaps://url:port), you need to insert the certificates in this order, starting from the top:

ldap-ca.crt: root, inter, ldap certs
bundle-ca.crt: root, inter, server cert
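For reference, a sketch of replacing the existing secret with the full-chain files in one step. The function name apply_cert_secret and the file names in the usage comment are hypothetical; the secret name and the keys ldap-ca.crt and bundle-ca.crt are the ones used at the top of this issue.

```shell
#!/bin/sh
# Create or update the custom-certs secret from two chain files.
# Usage: apply_cert_secret NAMESPACE SECRET LDAP_CHAIN BUNDLE_CHAIN
apply_cert_secret() {
    # --dry-run=client renders the manifest locally; piping it to
    # "kubectl apply" updates the secret if it already exists.
    kubectl -n "$1" create secret generic "$2" \
        --from-file=ldap-ca.crt="$3" \
        --from-file=bundle-ca.crt="$4" \
        --dry-run=client -o yaml \
        | kubectl apply -f -
}

# Example (hypothetical chain file names):
#   apply_cert_secret awx awx-custom-certs ./ldap-chain.crt ./bundle-chain.crt
```

Pods that were already running may keep the old files, so restarting the AWX deployment afterwards (for example with kubectl -n awx rollout restart deployment/awx) is probably needed; whether the operator does this for you is not confirmed in this thread.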