kurokobo / awx-on-k3s

An example implementation of AWX on single node K3s using AWX Operator, with easy-to-use simplified configuration with ownership of data and passwords.
MIT License

Possibility of execution instance(worker node) on another node/host/agent #374

Closed davidmacdonald80 closed 2 weeks ago

davidmacdonald80 commented 2 weeks ago

Environment

Description

I'm aware you may not respond, and you're welcome to just close this if you want. I've searched a great deal here and elsewhere without luck so far. I'm trying to figure out if it is possible to use AWX on k3s on a server node and also have a worker node on an actual separate node running as an agent to help limit resource usage. I can get the k3s server and agent communicating from separate hosts, but I still run into issues getting AWX to run a health check on a new instance named with the hostname of the agent. I am assuming at the moment that it is an issue getting receptor running within AWX and on the agent? This is where I'm still confused as to how it is supposed to work. I've gotten receptor to run without errors on the host of the k3s agent, but it didn't make a difference from within AWX.

Step to Reproduce

Assuming AWX is running, from the would-be agent: 'curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.30.0+k3s1 K3S_URL=https://<-Server-hostname->:6443 K3S_TOKEN="<-MyToken->" sh -'. Still on the agent, I've modified /etc/receptor/receptor.conf to specify the tls-server, tls-client, work-command and a few other things. I'm still unclear on what this conf should contain, or if it even matters, since AWX never recognizes the agent even when the receptor service is stable on the k3s agent host. If it is even possible, I think I'm missing a set of steps somewhere.

Logs

I'd give you logs if I had anything that pointed to anything. Maybe I just haven't looked at the correct log yet?

Files

I'll provide anything you might need, but I'm not sure it applies here yet.

kurokobo commented 2 weeks ago

@davidmacdonald80 Hi, I'm a bit confused by your wording. There are a lot of similar terms used in a mixed and inconsistent way, e.g. server node, worker node, node, agent, host, etc. I could not tell exactly what configuration you were ultimately trying to build.

Please list the servers that appear in your environment and make it a little clearer what is installed on each and what role you are trying to give each one. Are you trying to cluster K3s with two servers, a server and an agent, and run AWX on one and Pod for Job on the other? And you tried to use the K3s agent host for running the Job Pod as an Instance Group as an Instance in AWX, but it doesn't work properly?

davidmacdonald80 commented 2 weeks ago

Apologies for my inconsistency. I'm still getting used to the wording in the k3s environment.

> Are you trying to cluster K3s with two servers, a server and an agent, and run AWX on one and Pod for Job on the other? And you tried to use the K3s agent host for running the Job Pod as an Instance Group as an Instance in AWX, but it doesn't work properly?

I believe this is accurate in what I was trying to get across.

I have a server named AWX.localhost that is the k3s server hosting AWX.
I have a separate computer named Ansible.localhost on which I would like to install the k3s agent, to be used as a worker node for AWX.

I can get the k3s server and agent working on separate computers to the point that I can see both nodes with kubectl get nodes. I just can't figure out what I'm missing to be able to run a health check in the AWX web interface to verify the new instance/worker node.

kurokobo commented 2 weeks ago

Thank you for confirming. You have two options. It seems that you might be confusing Instance Groups with Container Groups.

Option 1 is probably simpler. In this case, remove K3s from your Execution Node, stop receptor and delete /etc/receptor, then delete the Instance in AWX and add it again, download the Install Bundle, and execute it on the Execution Node.
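
The host-side cleanup could be sketched roughly as follows. This is only an illustration, not a tested procedure: it assumes K3s was installed as an agent with the official install script (which places k3s-agent-uninstall.sh under /usr/local/bin) and that receptor runs as a systemd unit named receptor. The run helper and the DRY_RUN variable are hypothetical names introduced here so that the commands are printed rather than executed by default. Deleting and re-adding the Instance and running the Install Bundle are not covered.

```shell
#!/bin/sh
# Sketch: reset a host before re-registering it as an AWX Execution Node.
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reset_execution_node() {
    # Assumption: K3s was installed as an agent via the official install script.
    run /usr/local/bin/k3s-agent-uninstall.sh
    # Assumption: receptor is managed by systemd as receptor.service.
    run systemctl stop receptor
    # Remove the old receptor configuration and certificates.
    run rm -rf /etc/receptor
}

reset_execution_node
```

Review the printed commands first, then run again as root with DRY_RUN=0 if they match your setup.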

davidmacdonald80 commented 2 weeks ago

It sounds so simple when you put it like that. I will test tonight, thank you!

davidmacdonald80 commented 2 weeks ago

OK, it is probably my fault for overlooking something, but I am still not getting it to work. Ansible.localhost is a KVM guest. I restored it to a snapshot from before k3s was installed. The only thing I did was run the bundle from the server on it. I created an instance with the hostname ansible and an instance group. I left everything at 0. On the instance, there is an option to run a health check, but it never succeeds for me. Ignoring that, I tried running a template and it just sits at pending.

The receptor.service on Ansible is complaining about no backends and won't stay running. I've added backends manually before and got the receptor service stable, but I was just taking a shot in the dark on what might be needed from what I could find in the docs.

# receptor --config receptor.conf
INFO 2024/06/20 18:44:16 Running control service control
WARNING 2024/06/20 18:44:16 Nothing to do - no backends are running.
Run receptor --help for command line instructions.

So I checked the receptor.conf on ansible:

---
- node:
    id: ansible

- work-verification:
    publickey: /etc/receptor/work_public_key.pem

- log-level: info

- control-service:
    service: control
    filename: /var/run/receptor/receptor.sock
    permissions: 0660
    tls: tls_server
- tls-server:
    name: tls_server
    cert: /etc/receptor/tls/ansible.crt
    key: /etc/receptor/tls/ansible.key
    clientcas: /etc/receptor/tls/ca/mesh-CA.crt
    requireclientcert: true
    mintls13: False

- tls-client:
    name: tls_client
    cert: /etc/receptor/tls/ansible.crt
    key: /etc/receptor/tls/ansible.key
    rootcas: /etc/receptor/tls/ca/mesh-CA.crt
    insecureskipverify: false
    mintls13: False

- work-command:
    worktype: ansible-runner
    command: ansible-runner
    params: worker
    allowruntimeparams: True
    verifysignature: True

This is where I'm not really sure where to look in AWX to find the receptor.conf, or whether I even need to worry about it. I took a shot in the dark and tried:

kubectl -n awx exec -it pod/awx-task-66b58fbcc9-rdswb -- /bin/bash
bash-5.1$ cat /etc/receptor/receptor.conf
---
- log-level: info
- local-only: null
- node:
   firewallrules:
    - action: reject
      tonode: awx-task-66b58fbcc9-rdswb
      toservice: control
- control-service:
    service: control
    filename: /var/run/receptor/receptor.sock
    permissions: '0660'
- work-command:
    worktype: local
    command: ansible-runner
    params: worker
    allowruntimeparams: true
- work-kubernetes:
    worktype: kubernetes-runtime-auth
    authmethod: runtime
    allowruntimeauth: true
    allowruntimepod: true
    allowruntimeparams: true
- work-kubernetes:
    worktype: kubernetes-incluster-auth
    authmethod: incluster
    allowruntimeauth: true
    allowruntimepod: true
    allowruntimeparams: true
- tls-client:
    cert: /etc/receptor/tls/receptor.crt
    key: /etc/receptor/tls/receptor.key
    name: tlsclient
    rootcas: /etc/receptor/tls/ca/mesh-CA.crt
    mintls13: false
- work-signing:
    privatekey: /etc/receptor/work_private_key.pem
    tokenexpiration: 1m

kurokobo commented 2 weeks ago

We are definitely making progress. You are simply a little short of setting up the instance. Right now, your Execution Node is not configured to peer with AWX.
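
This matches the "Nothing to do - no backends are running" warning: the receptor.conf posted from the Execution Node has no listener or peer stanza. A quick way to check a config for one might look like the sketch below. It assumes receptor's backend stanzas are the top-level list items tcp-listener, tcp-peer, udp-listener, udp-peer, ws-listener and ws-peer; has_backend is a hypothetical helper name, and the default path is the one used in this thread.

```shell
#!/bin/sh
# Sketch: report whether a receptor config declares any backend stanza.

# Hypothetical helper: succeeds if the file has a top-level listener or peer entry.
has_backend() {
    grep -Eq '^- *(tcp|udp|ws)-(listener|peer):' "$1" 2>/dev/null
}

# Config path: first argument, or the path used on the Execution Node in this thread.
conf="${1:-/etc/receptor/receptor.conf}"

if has_backend "$conf"; then
    echo "backend stanza found in $conf"
else
    echo "no backend stanza in $conf"
fi
```

If no backend stanza is found, the config was most likely generated without peering enabled, so regenerating the Install Bundle is a safer fix than adding stanzas by hand.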

See my comment on the forum and follow steps 2 through 8: https://forum.ansible.com/t/creation-of-awx-execution-node-failed/2585/9?u=kurokobo

[image: screenshot]

Please be careful with step 5. If you enter a hostname in the Name field, AWX must be able to resolve it through DNS; if there is no DNS entry, try specifying the IP address instead. Also, don't forget to check Peers from control node. This makes the control node peer with your Execution Node.
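
To test name resolution from the AWX side rather than from the host, something like the sketch below could be run on the K3s server. It is built on assumptions: the awx namespace and the awx-task deployment are inferred from the pod name earlier in this thread, the container name awx-task and the availability of getent inside that container are not confirmed here, and check_from_awx is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: check whether a name resolves from inside the AWX task pod.

# Hypothetical helper. Assumptions: namespace "awx", deployment "awx-task",
# container "awx-task", and getent being present in the container image.
check_from_awx() {
    kubectl -n awx exec deployment/awx-task -c awx-task -- getent hosts "$1"
}

# Only run the check when a name is passed on the command line.
if [ "$#" -gt 0 ]; then
    if check_from_awx "$1"; then
        echo "$1 resolves from AWX"
    else
        echo "$1 does not resolve from AWX; consider using the IP address"
    fi
fi
```

Saved under any file name, it takes the Instance name as its only argument, for example ansible.localhost.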

If you are still facing the issue after following steps 2 through 8, please follow the troubleshooting steps (1 through 5) in the second half of the screenshot above.

davidmacdonald80 commented 2 weeks ago

That did it! Thank you for pointing me in the right direction!