Reign1 opened 1 month ago
Have you used a customized values.yaml file to enable the AWX resource?
Are the postgres and awx-task pods being created?
@YaronL16, I only did what's provided in the Helm install instructions here: https://ansible.readthedocs.io/projects/awx-operator/en/latest/installation/helm-install-on-existing-cluster.html , and also ran "kubectl -n awx apply -f awx-demo.yaml". The content of awx-demo.yaml is provided above. I would expect the Helm install document to be complete (e.g. you end up with the front end exposed). If it's not, what's missing? Thanks!
Well, technically you did install the operator, you just haven't told it to set up the AWX resource.
But I agree the documentation is a bit lackluster. Anyway, as the documentation says, you should customize the installation with your own values file to override the defaults. Most importantly, set AWX.enabled to 'true'.
More info here: https://github.com/ansible/awx-operator/blob/devel/.helm/starter/README.md
@YaronL16 thanks for the input, really helpful, and everything makes more sense now. Indeed, I did the Helm install without passing my own values with -f. What is still not clear, though, is the content of myvalues.yaml. What is the bare minimum to have the frontend exposed and be able to log in as admin?
```yaml
AWX:
  enabled: true
```
Is this it?
I would have something like this at the minimum:
```yaml
---
AWX:
  enabled: true
  name: awx-demo
  spec:
    service_type: ClusterIP
```
@kurokobo created a nice base values file as seen here: https://github.com/kurokobo/awx-on-k3s/blob/main/base/awx.yaml
You could also define custom images and other configs
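As a sketch of that last point: custom images could go into the same values file under AWX.spec, assuming the operator's image and image_version spec fields (the registry path below is a placeholder, not from this thread):

```yaml
---
AWX:
  enabled: true
  name: awx-demo
  spec:
    service_type: ClusterIP
    ## Hypothetical custom image; replace with your own registry/repo and tag
    image: registry.example.com/my-awx
    image_version: 24.3.1
```

Any spec key the AWX custom resource accepts can be set this way; the chart passes AWX.spec through to the generated AWX resource.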
I have a similar problem on an existing EKS cluster. Kubernetes and the AWS nodes are up to date.
I am using the following kustomization:
```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

## Specify a custom namespace in which to install AWX
namespace: awx

generatorOptions:
  disableNameSuffixHash: true

secretGenerator:
  ### Postgresql secret was moved to awx-secrets.yaml which is included in resources
  - name: awx-admin-password
    type: Opaque
    literals:
      - password=BlaBlaBla
  - name: my-ca-bundle
    type: Opaque
    files:
      - bundle-ca.crt

resources:
  ## Find the latest tag here: https://github.com/ansible/awx-operator/releases
  - github.com/ansible/awx-operator/config/default?ref=2.16.1
  - awx-secrets.yaml
  - awx-custom-ee-docker-reg-secret.yaml
  - awx-coredns-cm.yaml
  - awx-gp3-sc-retain.yaml
  - awx-efs-sc.yaml
  # - awx-efs-pv.yaml
  - awx-efs-pv-pg15.yaml
  - awx-efs-pvc.yaml
  - awx-with-postgres.yaml

## Set the image tags to match the git version from above
images:
  - name: quay.io/ansible/awx-operator
    newTag: 2.16.1
```
Customizing the AWX resource with this manifest:
```yaml
---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx-dev
spec:
  ## These parameters are designed for use with:
  ## - AWX Operator: 2.10
  ##   https://github.com/ansible/awx-operator/blob/2.10.0/README.md
  ## - AWX: 23.6.0
  ##   https://github.com/ansible/awx/blob/23.6.0/INSTALL.md
  ##
  ## Upgraded to:
  ## - AWX Operator: 2.16.1
  ##   https://github.com/ansible/awx-operator/blob/2.16.1/README.md
  ## - AWX: 24.3.1
  ##   https://github.com/ansible/awx/blob/24.3.1/INSTALL.md

  ## This line controls the log output of the deployment
  no_log: false

  ## Disable ip_v6
  ipv6_disabled: true

  ##################################
  ##             awx              ##
  ##################################
  admin_user: admin
  admin_password_secret: awx-admin-password
  bundle_cacert_secret: my-ca-bundle

  ## hostname value is used in the ALB Listener rules:
  ## if host equals <hostname value> then traffic will be forwarded to the Target Group
  hostname: awx-dev.mydom.com

  ## Customized control-plane-ee
  control_plane_ee_image: myrepo/my-awx-ee:2.16.1_1

  ## Customized awx-ee
  ee_images:
    - name: custom-awx-ee
      image: myrepo/my-awx-ee:2.16.1_1

  ## Custom ee docker pull secret
  image_pull_secrets:
    - awx-custom-ee-docker-reg-secret

  ## console listens on a node port so an ALB ingress can be used
  service_type: NodePort
  nodeport_port: 30080

  ## make projects data persistent on EFS
  ## need storage class, filesystem & mount points on all subnets to be pre-configured
  projects_persistence: true
  # ## use either -
  # ## 'projects_storage_class' for dynamic allocation of a persistent volume
  # ## 'projects_existing_claim' for a pre-configured persistent volume claim
  # projects_storage_class: efs-projects-storageclass
  # projects_existing_claim: awx-projects-claim

  ##################################
  ##           ingress            ##
  ##################################
  ingress_type: ingress
  ingress_path: '/'
  ingress_path_type: Prefix
  ingress_annotations: |
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}, {"HTTP":80}]'
    alb.ingress.kubernetes.io/actions.ssl-redirect: '{"Type": "redirect", "RedirectConfig": { "Protocol": "HTTPS", "Port": "443", "StatusCode": "HTTP_301"}}'
    alb.ingress.kubernetes.io/certificate-arn: "arn:aws:acm:xxxxxxxxxxxxxxxxxx"
    alb.ingress.kubernetes.io/ssl-policy: 'ELBSecurityPolicy-TLS13-1-2-Res-2021-06'
    alb.ingress.kubernetes.io/scheme: 'internal'
    alb.ingress.kubernetes.io/target-type: 'instance'
    alb.ingress.kubernetes.io/ip-address-type: 'ipv4'
    alb.ingress.kubernetes.io/security-groups: 'sg-xxxxxxxxxxxxxxxxxx'
    alb.ingress.kubernetes.io/load-balancer-attributes: 'idle_timeout.timeout_seconds=360'
    alb.ingress.kubernetes.io/healthcheck-protocol: HTTP
    alb.ingress.kubernetes.io/healthcheck-port: traffic-port
    alb.ingress.kubernetes.io/healthcheck-interval-seconds: '15'
    alb.ingress.kubernetes.io/healthcheck-timeout-seconds: '5'
    alb.ingress.kubernetes.io/success-codes: '200'
    alb.ingress.kubernetes.io/healthy-threshold-count: '2'
    alb.ingress.kubernetes.io/unhealthy-threshold-count: '2'
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: 'true'

  ##################################
  ##          postgresql          ##
  ##################################
  postgres_configuration_secret: awx-postgres-configuration
  # ## Select the postgresql image and image version
  # #
  # # postgres_image: quay.io/sclorg/postgresql-15-c9s
  # # postgres_image: postgres
  # # postgres_image_version: 'latest'
  # image_pull_policy: Always

  ## make the postgres db persistent on EFS
  ## need storage class, filesystem & mount points on all subnets to be pre-configured
  postgres_storage_class: efs-postgres-storageclass
  postgres_storage_requirements:
    requests:
      storage: 15Gi
    limits:
      storage: 35Gi
## EOF
```
This works perfectly with version 2.10.0, but when trying to deploy from scratch with version 2.16.1, the logs show that awx-dev-web is missing, and when describing the pod I get:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 35m default-scheduler Successfully assigned awx/awx-dev-web-94cdf9d45-vkr54 to ip-10-167-0-76.ec2.internal
Normal Pulled 35m kubelet Container image "quay.io/ansible/awx-ee:24.3.1" already present on machine
Normal Created 35m kubelet Created container init
Normal Started 35m kubelet Started container init
Normal Pulled 34m (x5 over 35m) kubelet Container image "quay.io/centos/centos:stream9" already present on machine
Normal Created 34m (x5 over 35m) kubelet Created container init-projects
Normal Started 34m (x5 over 35m) kubelet Started container init-projects
Warning BackOff 43s (x160 over 35m) kubelet Back-off restarting failed container init-projects in pod awx-dev-web-94cdf9d45-vkr54_awx(6c47c5a1-d1b6-4f9b-8b85-8a803da2df2c)
@yyosha should probably look into the logs of the crashing init container
@YaronL16
Pod is in CrashLoopBackOff status
kc logs -f pod/awx-dev-web-567665cb76-hmc5q -c awx-dev-web -n awx
Error from server (BadRequest): container "awx-dev-web" in pod "awx-dev-web-567665cb76-hmc5q" is waiting to start: PodInitializing
@YaronL16
kc logs -f pod/awx-dev-web-567665cb76-hmc5q -c awx-dev-web -n awx Error from server (BadRequest): container "awx-dev-web" in pod "awx-dev-web-567665cb76-hmc5q" is waiting to start: PodInitializing
Get logs from the container after it has failed, or from the previous container (--previous)
@YaronL16
kc logs -f pod/awx-dev-web-567665cb76-hmc5q -c awx-dev-web -n awx --previous
Error from server (BadRequest): previous terminated container "awx-dev-web" in pod "awx-dev-web-567665cb76-hmc5q" not found
From the operator logs I get this:
```
TASK [installer : Get the new resource pod information after updating resource.] ***
task path: /opt/ansible/roles/installer/tasks/resources_configuration.yml:258
skipping: [localhost] => {"changed": false, "false_condition": "this_deployment_result.changed", "skip_reason": "Conditional result was False"}

TASK [installer : Update new resource pod as a variable.] **********************
task path: /opt/ansible/roles/installer/tasks/resources_configuration.yml:275
skipping: [localhost] => {"changed": false, "false_condition": "this_deployment_result.changed", "skip_reason": "Conditional result was False"}

TASK [installer : Update new resource pod name as a variable.] *****************
task path: /opt/ansible/roles/installer/tasks/resources_configuration.yml:283
skipping: [localhost] => {"changed": false, "false_condition": "this_deployment_result.changed", "skip_reason": "Conditional result was False"}

TASK [installer : Verify the resource pod name is populated.] ******************
task path: /opt/ansible/roles/installer/tasks/resources_configuration.yml:289
fatal: [localhost]: FAILED! => {
    "assertion": "awx_web_pod_name != ''",
    "changed": false,
    "evaluated_to": false,
    "msg": "Could not find the tower pod's name."
}

PLAY RECAP *********************************************************************
localhost : ok=69 changed=0 unreachable=0 failed=1 skipped=68 rescued=0 ignored=0
```

```
...,"job":"3522416367647485710","name":"awx-dev","namespace":"awx","error":"exit status 2","stacktrace":"github.com/operator-framework/ansible-operator-plugins/internal/ansible/runner.(*runner).Run.func1\n\tansible-operator-plugins/internal/ansible/runner/runner.go:269"}
```
Again, this works perfectly with version 2.10.0.
kc logs -f pod/awx-dev-web-567665cb76-hmc5q -c init-projects -n awx
does that return anything helpful?
@fosterseth I re-deployed ver. 2.16.1 (this is very much a test env.), hence the different pod name...
kc logs -f pod/awx-dev-web-6b4b544584-mqppn -c init-projects -n awx
Yielded nothing.
But since now I have this
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4m6s default-scheduler Successfully assigned awx/awx-dev-web-6b4b544584-mqppn to ip-10-167-0-76.ec2.internal
Normal Pulled 4m6s kubelet Container image "quay.io/ansible/awx-ee:24.3.1" already present on machine
Normal Created 4m6s kubelet Created container init
Normal Started 4m5s kubelet Started container init
Normal Pulled 4m5s kubelet Container image "quay.io/centos/centos:stream9" already present on machine
Normal Created 4m5s kubelet Created container init-projects
Normal Started 4m5s kubelet Started container init-projects
Normal Created 4m4s kubelet Created container redis
Normal Pulled 4m4s kubelet Container image "docker.io/redis:7" already present on machine
Normal Started 4m4s kubelet Started container redis
Normal Pulled 4m4s kubelet Container image "quay.io/ansible/awx:24.3.1" already present on machine
Normal Created 4m4s kubelet Created container awx-dev-rsyslog
Normal Started 4m3s kubelet Started container awx-dev-rsyslog
Normal Created 2m51s (x3 over 4m4s) kubelet Created container awx-dev-web
Normal Started 2m51s (x3 over 4m4s) kubelet Started container awx-dev-web
Warning BackOff 2m11s (x3 over 3m4s) kubelet Back-off restarting failed container awx-dev-web in pod awx-dev-web-6b4b544584-mqppn_awx(e5540567-38f8-4be9-86b3-8602ce7ff7d5)
Normal Pulled 2m (x4 over 4m4s) kubelet Container image "quay.io/ansible/awx:24.3.1" already present on machine
I ran this:
kc logs -f pod/awx-dev-web-6b4b544584-mqppn -c awx-dev-web -n awx
and got this very long log, which I attached here: awx-operator-2.16.1.txt
Managed to fix all issues. Current state is:
kubectl get all -n awx | grep awx
pod/awx-migration-24.4.0-rv74w 0/1 Completed 0 3m23s
pod/awx-operator-controller-manager-5b9cb84bd5-g54xx 2/2 Running 0 10m
pod/awx-postgres-15-0 1/1 Running 0 3m54s
pod/awx-task-6f65778bd-wwzld 4/4 Running 0 3m35s
pod/awx-web-988fccf6d-w5pz2 3/3 Running 0 3m36s
service/awx-operator-controller-manager-metrics-service ClusterIP 10.99.254.163
The end of the log is as expected:

```
PLAY RECAP *****
localhost : ok=90 changed=0 unreachable=0 failed=0 skipped=82 rescued=0 ignored=1
```
However, https://awx.tia.eu doesn't show the AWX interface. Any ideas?
@Reign1 Not sure how you set up access to the application on the specified URL, but I did not see an ingress resource in your output. So look into your service discovery.
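For comparison, a minimal AWX spec that makes the operator create an Ingress resource might look like this (the hostname and ingress class below are placeholders, not values from this thread):

```yaml
---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx-demo
spec:
  service_type: ClusterIP
  ingress_type: ingress
  ## Placeholder hostname; must resolve to your ingress controller
  hostname: awx.example.com
  ## Assumes an nginx ingress controller; adjust for your environment
  ingress_annotations: |
    kubernetes.io/ingress.class: nginx
```

After applying, `kubectl get ingress -n awx` should list the resource; if it doesn't, the operator never reconciled the ingress settings.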
Please confirm the following
Bug Summary
Following the https://ansible.readthedocs.io/projects/awx-operator/en/latest/installation/helm-install-on-existing-cluster.html documentation, I installed awx-operator with helm install and ended up with these resources:
```
NAME                                                   READY   STATUS    RESTARTS   AGE
pod/awx-operator-controller-manager-69d8f784d8-5llkl   2/2     Running   0          12h

NAME                                                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/awx-operator-controller-manager-metrics-service   ClusterIP   10.101.89.100   <none>        8443/TCP   12h
```
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/awx-operator-controller-manager 1/1 1 1 12h
NAME DESIRED CURRENT READY AGE
replicaset.apps/awx-operator-controller-manager-69d8f784d8 1 1 1 12h
On top of that created awx-demo.yaml:
```yaml
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx-demo
spec:
  service_type: nodeport
```
Applied it with "kubectl -n awx apply -f awx-demo.yaml", got output: "awx.awx.ansible.com/awx-demo created".
Still I see no awx-web. Checked the logs "kubectl logs -f awx-operator-controller-manager-69d8f784d8-5llkl -n awx" and see this:
AWX Operator version
2.15
AWX version
24.2.0
Kubernetes platform
kubernetes
Kubernetes/Platform version
1.29.3
Modifications
no
Steps to reproduce
On a fresh k8s cluster (created with kubeadm) I'm trying to set up AWX. As per the documentation https://ansible.readthedocs.io/projects/awx-operator/en/latest/installation/helm-install-on-existing-cluster.html I did helm install. That is it.
Expected results
Default AWX setup up and running with the frontend exposed, to be able to log in and try it out.
Actual results
awx-operator deployed but no awx-web pods running.
Additional information
No response
Operator Logs
kubectl logs -f awx-operator-controller-manager-69d8f784d8-5llkl -n awx:
```
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"cmd","msg":"Version","Go Version":"go1.20.12","GOOS":"linux","GOARCH":"amd64","ansible-operator":"v1.34.0","commit":"d26c43bf94960d292152862a6685696be33190fb"}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"cmd","msg":"Watching namespaces","namespaces":["awx"]}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"watches","msg":"Environment variable not set; using default value","envVar":"ANSIBLE_VERBOSITY_AWX_AWX_ANSIBLE_COM","default":2}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"watches","msg":"Environment variable not set; using default value","envVar":"ANSIBLE_VERBOSITY_AWXBACKUP_AWX_ANSIBLE_COM","default":2}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"watches","msg":"Environment variable not set; using default value","envVar":"ANSIBLE_VERBOSITY_AWXRESTORE_AWX_ANSIBLE_COM","default":2}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"watches","msg":"Environment variable not set; using default value","envVar":"ANSIBLE_VERBOSITY_AWXMESHINGRESS_AWX_ANSIBLE_COM","default":2}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"ansible-controller","msg":"Watching resource","Options.Group":"awx.ansible.com","Options.Version":"v1beta1","Options.Kind":"AWX"}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"ansible-controller","msg":"Watching resource","Options.Group":"awx.ansible.com","Options.Version":"v1beta1","Options.Kind":"AWXBackup"}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"ansible-controller","msg":"Watching resource","Options.Group":"awx.ansible.com","Options.Version":"v1beta1","Options.Kind":"AWXRestore"}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"ansible-controller","msg":"Watching resource","Options.Group":"awx.ansible.com","Options.Version":"v1alpha1","Options.Kind":"AWXMeshIngress"}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"proxy","msg":"Starting to serve","Address":"127.0.0.1:8888"}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"apiserver","msg":"Starting to serve metrics listener","Address":"localhost:5050"}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"controller-runtime.metrics","msg":"Starting metrics server"}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"controller-runtime.metrics","msg":"Serving metrics server","bindAddress":"127.0.0.1:8080","secure":false}
{"level":"info","ts":"2024-04-17T19:23:30Z","msg":"starting server","kind":"health probe","addr":"[::]:6789"}
I0417 19:23:30.391565       2 leaderelection.go:250] attempting to acquire leader lease awx/awx-operator...
E0417 19:24:00.393847       2 leaderelection.go:332] error retrieving resource lock awx/awx-operator: Get "https://10.96.0.1:443/apis/coordination.k8s.io/v1/namespaces/awx/leases/awx-operator": dial tcp 10.96.0.1:443: i/o timeout
...
```