Checkmk / ansible-collection-checkmk.general

The official Checkmk Ansible collection - brought to you by the Checkmk company.
https://galaxy.ansible.com/checkmk/general
GNU General Public License v3.0

[BUG] Host does not get added to Available Agent Configurations in agent bakery; does not get automated configurations #282

Closed timatlee closed 1 year ago

timatlee commented 1 year ago

Describe the bug

I don't believe that hosts are getting added to the list of agents that an agent configuration can be applied to, unless the host has been created first.

I'm open to the idea that I just don't know what I'm doing, and this is expected behaviour.

Component Name

agent role - activation ?

Ansible Version

ansible [core 2.14.1]
  config file = /home/timatlee/Home_Ansible/ansible.cfg
  configured module search path = ['/home/timatlee/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/timatlee/.local/lib/python3.9/site-packages/ansible
  ansible collection location = /home/timatlee/Home_Ansible/ansible_collections
  executable location = /home/timatlee/.local/bin/ansible
  python version = 3.9.2 (default, Feb 28 2021, 17:03:44) [GCC 10.2.1 20210110] (/usr/bin/python3)
  jinja version = 3.1.2
  libyaml = True

Checkmk Version

Checkmk Free Edition 2.1.0p21

Collection Version

$ ansible-galaxy collection list
Collection                    Version
----------------------------- -------
amazon.aws                    5.1.0  
ansible.netcommon             4.1.0  
ansible.posix                 1.4.0  
ansible.utils                 2.8.0  
ansible.windows               1.12.0 
arista.eos                    6.0.0  
awx.awx                       21.10.0
azure.azcollection            1.14.0 
check_point.mgmt              4.0.0  
chocolatey.chocolatey         1.3.1  
cisco.aci                     2.3.0  
cisco.asa                     4.0.0  
cisco.dnac                    6.6.1  
cisco.intersight              1.0.22 
cisco.ios                     4.0.0  
cisco.iosxr                   4.0.3  
cisco.ise                     2.5.9  
cisco.meraki                  2.13.0 
cisco.mso                     2.1.0  
cisco.nso                     1.0.3  
cisco.nxos                    4.0.1  
cisco.ucs                     1.8.0  
cloud.common                  2.1.2  
cloudscale_ch.cloud           2.2.3  
community.aws                 5.0.0  
community.azure               2.0.0  
community.ciscosmb            1.0.5  
community.crypto              2.9.0  
community.digitalocean        1.22.0 
community.dns                 2.4.2  
community.docker              3.3.1  
community.fortios             1.0.0  
community.general             6.1.0  
community.google              1.0.0  
community.grafana             1.5.3  
community.hashi_vault         4.0.0  
community.hrobot              1.6.0  
community.libvirt             1.2.0  
community.mongodb             1.4.2  
community.mysql               3.5.1  
community.network             5.0.0  
community.okd                 2.2.0  
community.postgresql          2.3.1  
community.proxysql            1.4.0  
community.rabbitmq            1.2.3  
community.routeros            2.5.0  
community.sap                 1.0.0  
community.sap_libs            1.4.0  
community.skydive             1.0.0  
community.sops                1.5.0  
community.vmware              3.2.0  
community.windows             1.11.1 
community.zabbix              1.9.0  
containers.podman             1.10.1 
cyberark.conjur               1.2.0  
cyberark.pas                  1.0.14 
dellemc.enterprise_sonic      2.0.0  
dellemc.openmanage            6.3.0  
dellemc.os10                  1.1.1  
dellemc.os6                   1.0.7  
dellemc.os9                   1.0.4  
f5networks.f5_modules         1.21.0 
fortinet.fortimanager         2.1.7  
fortinet.fortios              2.2.1  
frr.frr                       2.0.0  
gluster.gluster               1.0.2  
google.cloud                  1.0.2  
grafana.grafana               1.1.0  
hetzner.hcloud                1.9.0  
hpe.nimble                    1.1.4  
ibm.qradar                    2.1.0  
ibm.spectrum_virtualize       1.10.0 
infinidat.infinibox           1.3.12 
infoblox.nios_modules         1.4.1  
inspur.ispim                  1.2.0  
inspur.sm                     2.3.0  
junipernetworks.junos         4.1.0  
kubernetes.core               2.3.2  
lowlydba.sqlserver            1.2.1  
mellanox.onyx                 1.0.0  
netapp.aws                    21.7.0 
netapp.azure                  21.10.0
netapp.cloudmanager           21.21.0
netapp.elementsw              21.7.0 
netapp.ontap                  22.0.1 
netapp.storagegrid            21.11.1
netapp.um_info                21.8.0 
netapp_eseries.santricity     1.3.1  
netbox.netbox                 3.9.0  
ngine_io.cloudstack           2.3.0  
ngine_io.exoscale             1.0.0  
ngine_io.vultr                1.1.2  
openstack.cloud               1.10.0 
openvswitch.openvswitch       2.1.0  
ovirt.ovirt                   2.4.1  
purestorage.flasharray        1.15.0 
purestorage.flashblade        1.10.0 
purestorage.fusion            1.2.0  
sensu.sensu_go                1.13.1 
splunk.es                     2.1.0  
t_systems_mms.icinga_director 1.31.4 
theforeman.foreman            3.7.0  
vmware.vmware_rest            2.2.0  
vultr.cloud                   1.3.1  
vyos.vyos                     4.0.0  
wti.remote                    1.0.4  

# /home/timatlee/Home_Ansible/ansible_collections
Collection          Version
------------------- -------
ansible.posix       1.5.1  
ansible.utils       2.9.0  
ansible.windows     1.13.0 
community.docker    3.4.1  
community.general   6.3.0  
community.windows   1.12.0 
kubernetes.core     2.4.0  
sbaerlocher.windows 0.0.9  
tribe29.checkmk     0.17.1 

Environment

Debian Linux 5.15.85-1-pve #1 SMP PVE 5.15.85-1 (2023-02-01T00:00Z) x86_64 GNU/Linux

Python 3.9.2

To Reproduce

To re-create, I can:

  1. Start with an empty list of hosts in Agent Configuration and no hosts registered on the CMK site. The agent has been purged from the monitored host.
  2. Run my playbook. The output shows that the GENERIC agent was downloaded and installed:
TASK [tribe29.checkmk.agent : Run OS Family specific Tasks.] ***************************************************************************************************************
included: /home/timatlee/Home_Ansible/ansible_collections/tribe29/checkmk/roles/agent/tasks/Debian.yml for ansible

TASK [tribe29.checkmk.agent : Debian Derivatives: Download host-specific Checkmk CFE Agent.] *******************************************************************************
ok: [ansible]

TASK [tribe29.checkmk.agent : Debian Derivates: Transfer host-specific Checkmk CFE Agent.] *********************************************************************************
skipping: [ansible]

TASK [tribe29.checkmk.agent : Debian Derivatives: Install host-specific Checkmk CFE Agent.] ********************************************************************************
skipping: [ansible]

TASK [tribe29.checkmk.agent : Debian Derivatives: Download GENERIC Checkmk CFE Agent.] *************************************************************************************
changed: [ansible]

TASK [tribe29.checkmk.agent : Debian Derivates: Transfer GENERIC Checkmk CFE Agent.] ***************************************************************************************
skipping: [ansible]

TASK [tribe29.checkmk.agent : Debian Derivatives: Install GENERIC Checkmk CFE Agent.] **************************************************************************************
changed: [ansible]

TASK [tribe29.checkmk.agent : Debian Derivatives: Install Checkmk CRE Agent.] **********************************************************************************************
skipping: [ansible]

TASK [tribe29.checkmk.agent : Create host on server.] **********************************************************************************************************************
[DEPRECATION WARNING]: Alias 'host_name' is deprecated. See the module docs for more information. This feature will be removed from tribe29.checkmk in a release after 
2024-01-01. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
changed: [ansible -> localhost]

TASK [tribe29.checkmk.agent : Check for Agent Updater Binary.] *************************************************************************************************************
ok: [ansible]

TASK [tribe29.checkmk.agent : Check for Agent Controller Binary.] **********************************************************************************************************
ok: [ansible]

TASK [tribe29.checkmk.agent : Register Agent for automatic Upates using User Password.] ************************************************************************************
skipping: [ansible]

TASK [tribe29.checkmk.agent : Register Agent for automatic Upates using Automation Secret.] ********************************************************************************
skipping: [ansible]

TASK [tribe29.checkmk.agent : Trigger Activate Changes to enable TLS registration.] ****************************************************************************************

RUNNING HANDLER [tribe29.checkmk.agent : Activate Changes.] ****************************************************************************************************************
changed: [ansible -> localhost]

TASK [tribe29.checkmk.agent : Register Agent for TLS.] *********************************************************************************************************************
changed: [ansible]

TASK [tribe29.checkmk.agent : Discover services and labels on host.] *******************************************************************************************************
changed: [ansible -> localhost]

RUNNING HANDLER [tribe29.checkmk.agent : Activate Changes.] ****************************************************************************************************************
changed: [ansible -> localhost]

PLAY RECAP *****************************************************************************************************************************************************************
ansible                    : ok=16   changed=7    unreachable=0    failed=0    skipped=9    rescued=0    ignored=0   

Likewise, the Agent configuration screen does not show the newly added host: [screenshot]

  1. Repeat step 1 (delete the host from the CMK UI, purge the agent from the monitored host, etc.).

  2. Add the monitored host to the appropriate folder and activate changes. The list of hosts in the Agent configuration screen now shows the newly added host: [screenshot]

  3. Execute the playbook to install the agent and register it. This time the host-specific agent is installed:

    
    TASK [tribe29.checkmk.agent : Run OS Family specific Tasks.] ***
    included: /home/timatlee/Home_Ansible/ansible_collections/tribe29/checkmk/roles/agent/tasks/Debian.yml for ansible

    TASK [tribe29.checkmk.agent : Debian Derivatives: Download host-specific Checkmk CFE Agent.] ***
    ok: [ansible]

    TASK [tribe29.checkmk.agent : Debian Derivates: Transfer host-specific Checkmk CFE Agent.] ***
    skipping: [ansible]

    TASK [tribe29.checkmk.agent : Debian Derivatives: Install host-specific Checkmk CFE Agent.] ***
    changed: [ansible]

    TASK [tribe29.checkmk.agent : Debian Derivatives: Download GENERIC Checkmk CFE Agent.] ***
    skipping: [ansible]

    TASK [tribe29.checkmk.agent : Debian Derivates: Transfer GENERIC Checkmk CFE Agent.] ***
    skipping: [ansible]

    TASK [tribe29.checkmk.agent : Debian Derivatives: Install GENERIC Checkmk CFE Agent.] ***
    skipping: [ansible]

    TASK [tribe29.checkmk.agent : Debian Derivatives: Install Checkmk CRE Agent.] ***
    skipping: [ansible]

    TASK [tribe29.checkmk.agent : Create host on server.] ***
    [DEPRECATION WARNING]: Alias 'host_name' is deprecated. See the module docs for more information. This feature will be removed from tribe29.checkmk in a release after
    2024-01-01. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
    ok: [ansible -> localhost]

    TASK [tribe29.checkmk.agent : Check for Agent Updater Binary.] ***
    ok: [ansible]

    TASK [tribe29.checkmk.agent : Check for Agent Controller Binary.] ***
    ok: [ansible]

    TASK [tribe29.checkmk.agent : Register Agent for automatic Upates using User Password.] ***
    skipping: [ansible]

    TASK [tribe29.checkmk.agent : Register Agent for automatic Upates using Automation Secret.] ***
    changed: [ansible]

    TASK [tribe29.checkmk.agent : Trigger Activate Changes to enable TLS registration.] ***

    TASK [tribe29.checkmk.agent : Register Agent for TLS.] ***
    changed: [ansible]

    TASK [tribe29.checkmk.agent : Discover services and labels on host.] ***
    changed: [ansible -> localhost]

    RUNNING HANDLER [tribe29.checkmk.agent : Activate Changes.] ***
    changed: [ansible -> localhost]

    PLAY RECAP ***
    ansible                    : ok=15   changed=5    unreachable=0    failed=0    skipped=9    rescued=0    ignored=0


Given enough time, I eventually see the agent plugins get installed (though they are still pending a service scan and activation).

**Expected behavior**

Agent installation and activation should pick up agent bakery rules, if they apply to the host, without needing to have the host created in the UI first.

**Actual behavior**

Agent installation, activation and registration do not pick up the bakery rules unless the host has already been created before the role is executed.

**Screenshots**

Seen throughout, I hope.

**Additional context**
Playbook:
```yaml
---
- name: Test playbook for CMK agent
  hosts: ansible

  vars:
    - checkmk_agent_edition: cfe
    - checkmk_agent_protocol: https
    - checkmk_agent_server: checkmk.home.timatlee.com
    - checkmk_agent_server_validate_certs: true
    - checkmk_agent_port: "{% if checkmk_agent_protocol == 'https' %}443{% else %}80{% endif %}"
    - checkmk_agent_site: monitoring
    - checkmk_agent_user: automation
    - checkmk_agent_secret: "{{ checkmk_agent_pass }}"
    - checkmk_agent_auto_activate: true
    - checkmk_agent_add_host: true
    - checkmk_agent_discover: true
    - checkmk_agent_update: true
    - checkmk_agent_tls: true

  vars_files:
    - ../vars/secrets.yml

  tasks:
    - name: Set the target folder based on virtualization_role
      ansible.builtin.set_fact:
        checkmk_agent_folder: "
          {%- if ansible_facts.virtualization_role == 'host' -%}
            /hosts/physical/
          {%- elif ansible_facts.virtualization_role == 'guest' -%}
            /hosts/virtual/
          {%- else -%}
            /hosts/
          {% endif %}"

    - name: Install the CMK agent
      ansible.builtin.include_role:
        name: tribe29.checkmk.agent
      when: ansible_facts['os_family'] == "Debian"
```

robin-checkmk commented 1 year ago

Hi @timatlee and thanks for your extensive report.

If I got everything correctly, I think there is a misunderstanding at play here. The agent role does not automatically bake agents. When you install the agent and add the host to the monitoring, you have to run "Bake and sign agents" yourself through the UI. Only then will the host show up in the agent configurations overview.

timatlee commented 1 year ago

Yeah, I don't think I expected the role to bake the agent, but I was trying to figure out how to get the role to install the agent and then get updates from the bakery.

  1. Purge the package from my target machine, and remove /var/lib/check_mk_agent/, /var/lib/cmk-agent/ and /etc/check_mk
  2. Remove the host from CMK and find that it's not listed on the Agent configuration screen
  3. Run the playbook. I see the generic agent gets installed. I see the host has a bunch of services inventoried.
  4. Bake and sign agents in the UI
  5. Hosts show up in the agent configuration window
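
For anyone reproducing this, the cleanup in step 1 could be scripted roughly as below. This is only a sketch: the package name `check-mk-agent` and the use of `apt` are assumptions for a Debian target, while the three paths are the ones listed in step 1.

```yaml
# Sketch only: the package name and the apt-based removal are assumptions.
- name: Purge the Checkmk agent package
  ansible.builtin.apt:
    name: check-mk-agent
    state: absent
    purge: true
  become: true

# The paths below are the ones named in step 1.
- name: Remove leftover agent directories
  ansible.builtin.file:
    path: "{{ item }}"
    state: absent
  loop:
    - /var/lib/check_mk_agent/
    - /var/lib/cmk-agent/
    - /etc/check_mk
  become: true
```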

After a long while, I'm not seeing the installed agent pick up configured plugins. Checking the Automatic Updates window in CMK, it tells me that no agents are registered.

Do I need to follow up with a shell call to `cmk-update-agent register`?

I suppose I'm a bit confused: reading the manual, I'm supposed to download the baked agent and install that, but I don't believe I understand how to configure the role to do that.

Thanks!

robin-checkmk commented 1 year ago

Generally, no purging is necessary, as the role is idempotent for the most part. The log shows that the registration for updates was successful, so your host should show up as registered in Checkmk. The generic agent is used for installation because no host-specific agent is available at that time. After a while, the updater should pull the correct agent.

There should be no need for manual intervention.

What does the Check_MK Agent service say after the role has run?
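
If it helps with debugging, the updater's own output can also be captured from the playbook. The tasks below are a minimal sketch, not part of the role: the binary path `/usr/bin/cmk-update-agent` is an assumption, and the names `updater_binary` and `updater_run` are made up for illustration. Note that `cmk-update-agent -v` performs a real update attempt rather than a read-only check.

```yaml
# Sketch only: the binary path is an assumption and may differ on your system.
- name: Check whether the agent updater is installed
  ansible.builtin.stat:
    path: /usr/bin/cmk-update-agent
  register: updater_binary

# This triggers an actual update run and prints what the updater does.
- name: Run the agent updater once in verbose mode
  ansible.builtin.command: cmk-update-agent -v
  register: updater_run
  failed_when: false
  become: true
  when: updater_binary.stat.exists

- name: Show the updater output
  ansible.builtin.debug:
    var: updater_run.stdout_lines
  when: updater_binary.stat.exists
```

If the first task reports that the binary is missing, the installed package does not contain the updater plugin, which would explain why no agents appear as registered.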

github-actions[bot] commented 1 year ago

This issue has been stale for 60 days. It will close in 7 days.