Closed by davidmacdonald80 2 weeks ago
@davidmacdonald80 Hi, I'm a bit confused by your wording. Several similar terms are used interchangeably and inconsistently, e.g. server node, worker node, node, agent, host, etc., so I could not tell exactly what setup you were ultimately trying to build.
Please list the servers that appear in your environment and make it a little clearer what is installed on each and what role you are trying to give each one. Are you trying to cluster K3s with two servers, a server and an agent, and run AWX on one and Pod for Job on the other? And you tried to use the K3s agent host for running the Job Pod as an Instance Group as an Instance in AWX, but it doesn't work properly?
Apologies for my inconsistency. I'm still getting used to the wording in the k3s environment.
Are you trying to cluster K3s with two servers, a server and an agent, and run AWX on one and Pod for Job on the other? And you tried to use the K3s agent host for running the Job Pod as an Instance Group as an Instance in AWX, but it doesn't work properly?
I believe this is accurate in what I was trying to get across.
I have a server named AWX.localhost that is the k3s server hosting AWX.
I have a separate computer named Ansible.localhost where I would like to install the k3s agent so it can be used as a worker node for AWX.
I can get the k3s server and agent on separate computers working to the point that I can see both nodes with kubectl get nodes, but I just can't figure out what I'm missing to be able to run a health check in the AWX web interface to verify the new instance/worker node.
Thank you for confirming. You have two options. It seems that you might be confused between Instance Group and Container Group.
Option 1 is probably simpler. In this case, remove K3s from your Execution Node, stop receptor, and delete /etc/receptor, then delete and re-add the Instance in AWX, download the Install Bundle, and execute it on the Execution Node.
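As a rough sketch of the cleanup part of option 1 (not taken from the thread): the helper names `run` and `cleanup_execution_node` and the `DRY_RUN` switch are made up for illustration, and the uninstall script path assumes K3s was installed as an agent with the get.k3s.io script. By default it only prints what it would do.

```shell
# Hypothetical helper: print a command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper covering the cleanup steps on the Execution Node.
cleanup_execution_node() {
    # Uninstall script that the K3s install script creates on agent nodes (assumed install method)
    run /usr/local/bin/k3s-agent-uninstall.sh
    # Stop the receptor service before removing its configuration
    run systemctl stop receptor
    # Remove the old receptor configuration, certificates, and keys
    run rm -rf /etc/receptor
}
```

Calling `cleanup_execution_node` prints the three commands; running it as root with `DRY_RUN=0` would execute them. Deleting and re-adding the Instance and downloading the Install Bundle still happen in the AWX web interface.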
It sounds so simple when you put it like that. I will test tonight, thank you!
OK, it is probably my fault for overlooking something, but I am still not getting it to work. Ansible.localhost is a KVM guest. I restored it to a snapshot taken before k3s was installed. The only thing I did was run the bundle from the server on it. I created an instance with the hostname ansible and I created an instance group. I left everything at 0. On the instance, there is an option to run a health check, but it never succeeds for me. Ignoring that, I tried running a template and it just sits at pending.
The receptor.service on Ansible is complaining about no backends and won't stay running. I've added backends manually before and got the receptor service stable, but I was just taking a shot in the dark on what might be needed from what I could find in the docs.
# receptor --config receptor.conf
INFO 2024/06/20 18:44:16 Running control service control
WARNING 2024/06/20 18:44:16 Nothing to do - no backends are running.
Run receptor --help for command line instructions.
So I checked the receptor.conf on ansible:
---
- node:
    id: ansible
- work-verification:
    publickey: /etc/receptor/work_public_key.pem
- log-level: info
- control-service:
    service: control
    filename: /var/run/receptor/receptor.sock
    permissions: 0660
    tls: tls_server
- tls-server:
    name: tls_server
    cert: /etc/receptor/tls/ansible.crt
    key: /etc/receptor/tls/ansible.key
    clientcas: /etc/receptor/tls/ca/mesh-CA.crt
    requireclientcert: true
    mintls13: False
- tls-client:
    name: tls_client
    cert: /etc/receptor/tls/ansible.crt
    key: /etc/receptor/tls/ansible.key
    rootcas: /etc/receptor/tls/ca/mesh-CA.crt
    insecureskipverify: false
    mintls13: False
- work-command:
    worktype: ansible-runner
    command: ansible-runner
    params: worker
    allowruntimeparams: True
    verifysignature: True
This is where I'm not really sure where to check in AWX to find the receptor.conf or if I even need to worry about it. I took a shot in the dark and tried
kubectl -n awx exec -it pod/awx-task-66b58fbcc9-rdswb -- /bin/bash
bash-5.1$ cat /etc/receptor/receptor.conf
---
- log-level: info
- local-only: null
- node:
    firewallrules:
      - action: reject
        tonode: awx-task-66b58fbcc9-rdswb
        toservice: control
- control-service:
    service: control
    filename: /var/run/receptor/receptor.sock
    permissions: '0660'
- work-command:
    worktype: local
    command: ansible-runner
    params: worker
    allowruntimeparams: true
- work-kubernetes:
    worktype: kubernetes-runtime-auth
    authmethod: runtime
    allowruntimeauth: true
    allowruntimepod: true
    allowruntimeparams: true
- work-kubernetes:
    worktype: kubernetes-incluster-auth
    authmethod: incluster
    allowruntimeauth: true
    allowruntimepod: true
    allowruntimeparams: true
- tls-client:
    cert: /etc/receptor/tls/receptor.crt
    key: /etc/receptor/tls/receptor.key
    name: tlsclient
    rootcas: /etc/receptor/tls/ca/mesh-CA.crt
    mintls13: false
- work-signing:
    privatekey: /etc/receptor/work_private_key.pem
    tokenexpiration: 1m
We are definitely making progress. You are just a few steps short of finishing the instance setup. Right now, your Execution Node is not configured to peer with AWX.
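The "no backends are running" warning fits this picture: the receptor.conf on the Execution Node shown earlier has no listener or peer entry. A small sketch for checking a config file for such entries follows. The function name `receptor_backends` is made up for illustration, and the pattern assumes each top-level entry starts with "- " at the beginning of a line, as in the files shown in this thread.

```shell
# Hypothetical helper: print the backend entries (listeners and peers) in a receptor config.
# Exits non-zero when none are found, which matches the "no backends" warning.
receptor_backends() {
    grep -E '^- (tcp|udp|ws)-(listener|peer):' "$1"
}
```

For example, `receptor_backends /etc/receptor/receptor.conf || echo "no backends configured"` on the Execution Node. Once the Install Bundle has been regenerated with peering enabled, the expectation is that a listener or peer entry shows up here, though the exact entry depends on how the Instance is configured.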
See my comment on the forum and follow steps 2 through 8: https://forum.ansible.com/t/creation-of-awx-execution-node-failed/2585/9?u=kurokobo
Please be careful with step 5. If you enter a hostname in the Name field, AWX must be able to resolve it via DNS; if there is no DNS entry, try specifying the IP address instead. Also, don't forget to check Peers from control node. This makes your Execution Node peer with the control node.
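A quick way to sanity-check the Name value before entering it, as a sketch only: the function name `resolves` is made up, it relies on `getent` being available, and it has to run somewhere that uses the same resolver as AWX (for instance inside the awx-task pod via kubectl exec, as done earlier in this thread) to say anything about what AWX can resolve.

```shell
# Hypothetical helper: succeed if NAME looks like an IPv4 literal or resolves
# through the system resolver (DNS, /etc/hosts, ...).
resolves() {
    name=$1
    case $name in
        '') return 1 ;;                                     # empty name never resolves
        *[!0-9.]*) getent hosts "$name" >/dev/null 2>&1 ;;  # hostname: ask the resolver
        *) return 0 ;;                                      # digits and dots only: treat as an IP address
    esac
}
```

If `resolves ansible` fails where AWX runs, entering the IP address in the Name field instead is the fallback described above.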
After following steps 2 through 8, if you are still facing the issue, please follow the troubleshooting steps (1 through 5) in the second half of the screenshot above.
That did it! Thank you for pointing me in the right direction!
Environment
Description
I'm aware you may not respond, and you're welcome to just close this if you want. I've searched a great deal here and elsewhere without luck so far. I'm trying to figure out whether it is possible to run AWX on k3s on a server node and also have a worker node on a separate physical host running as an agent, to help limit resource usage. I can get the k3s server and agent communicating from separate hosts, but I still run into issues getting AWX to run a health check on a new instance named with the hostname of the agent. I am assuming at the moment that it is an issue with getting receptor running within AWX and on the agent. This is where I'm still confused as to how it is supposed to work. I've gotten receptor to run without errors on the host of the k3s agent, but it didn't make a difference from within AWX.
Step to Reproduce
Assuming AWX is running, from the would-be agent run 'curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.30.0+k3s1 K3S_URL=https://<-Server-hostname->:6443 K3S_TOKEN="<-MyToken->" sh -'. Still on the agent, I've modified /etc/receptor/receptor.conf to specify tls-server, tls-client, work-command, and a few other things. I'm still hazy on what this conf should contain, or whether it even matters, since AWX never recognizes the agent even when the receptor service is stable on the k3s agent host. If this is possible at all, I think I'm missing a set of steps somewhere.
Logs
I'd give you logs if I had anything that pointed to anything. Maybe I just haven't looked at the correct log yet?
Files
I'll provide anything you might need, but I'm not sure it applies here yet.