dj-wasabi / ansible-telegraf

Installing and configuring Telegraf via Ansible for RedHat/Debian/Ubuntu/Windows/Suse.
MIT License

Failing on MacOS (Catalina) as control machine #115

Open bengoa opened 4 years ago

bengoa commented 4 years ago

Describe the bug

When I run this role from a macOS (Catalina) control machine, it fails with this error:

fatal: [hostname]: FAILED! => {"msg": "The conditional check 'telegraf_agent_version is version('0.10.0', '<')' failed. The error was: Version comparison: '<' not supported between instances of 'str' and 'int'\n\nThe error appears to be in '/path/to/ansible/roles/external/telegraf/tasks/configure_linux.yml': line 24, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: \"Copy the template for versions < 0.10.0\"\n  ^ here\n"}

Installation method/version

Ansible Version

ansible 2.9.4
  config file = /Users/alberto/src/ansible.cfg
  configured module search path = ['/Users/alberto/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/Cellar/ansible/2.9.4/libexec/lib/python3.8/site-packages/ansible
  executable location = /usr/local/bin/ansible
  python version = 3.8.1 (default, Dec 27 2019, 18:06:00) [Clang 11.0.0 (clang-1100.0.33.16)]

Targeted hosts

Concerns the following OS(es):

Expected behavior

The role should run without failing.

Additional context

It also fails with Ansible 2.8 on Catalina. I'm not 100% sure, but I suspect it is related to the Python version (3.x) rather than Ansible itself.

I have another machine running Linux and Ansible 2.8/2.9 and it works fine with Python 2.7.

dj-wasabi commented 4 years ago

Hi,

I have merged PR #135, which should fix this issue. Can you please check?

Kind regards, Werner

bengoa commented 4 years ago

Hi Werner.

Not yet I'm afraid.

TASK [external/telegraf : Copy the template for versions < 0.10.0] *********************
fatal: [hostname]: FAILED! => {"msg": "The conditional check 'telegraf_agent_version is version('0.10.0', '<')' failed. The error was: Version comparison: '<' not supported between instances of 'str' and 'int'\n\nThe error appears to be in '/path/to/ansible/roles/external/telegraf/tasks/configure_linux.yml': line 24, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: \"Copy the template for versions < 0.10.0\"\n  ^ here\n"}

I'm testing against master branch code. Should I try another one?

Cheers, Alberto

dj-wasabi commented 4 years ago

Hi @bengoa

Can you provide the version you have configured in the telegraf_agent_version property? It may not contain latest.

Kind regards, Werner

bengoa commented 4 years ago

Hi Werner.

I have been using '*' as my telegraf_agent_version to avoid unwanted upgrades:

telegraf_agent_version: '*'

Maybe there's a better way to do that?

Cheers, Alberto
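
A minimal sketch of why a value like '*' can trigger this error under Python 3 but not Python 2.7. It assumes the version test parses both sides loosely, in the style of distutils' LooseVersion; `parse_loose` and `is_older` are hypothetical names written for illustration and are not the role's or Ansible's code. Numeric components become integers while '*' stays a string, and Python 3 refuses to order a string against an integer.

```python
import re

# Simplified loose version parsing in the style of distutils' LooseVersion.
# Assumption: this mirrors what the version test does; it is not Ansible's code.
_COMPONENT_RE = re.compile(r"(\d+|[a-z]+|\.)")


def parse_loose(vstring):
    """Split a version string into components, turning digit runs into ints."""
    parts = [p for p in _COMPONENT_RE.split(vstring) if p and p != "."]
    parsed = []
    for part in parts:
        try:
            parsed.append(int(part))
        except ValueError:
            parsed.append(part)  # non-numeric components stay as strings
    return parsed


def is_older(version, reference):
    """Return True if `version` sorts before `reference`."""
    return parse_loose(version) < parse_loose(reference)


print(parse_loose("0.10.0"))         # [0, 10, 0]
print(parse_loose("*"))              # ['*']
print(is_older("0.9.1", "0.10.0"))   # True

try:
    is_older("*", "0.10.0")
except TypeError as exc:
    # Python 3 cannot order 'str' against 'int'; Python 2 allowed it silently.
    print("comparison failed:", exc)
```

Under these assumptions, any value that does not start with a number would hit the same comparison error on a Python 3 control machine.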

dj-wasabi commented 4 years ago

Hi @bengoa

The property needs a valid version if you don't want upgrades. telegraf_agent_package_state is already set to present, so once Telegraf is installed it isn't updated anymore.

Kind regards, Werner

bengoa commented 4 years ago

Hi Werner.

Let me give it a try.

I remember that in the past (2 years ago) packages were upgraded or downgraded to the version set in telegraf_agent_version.

Cheers. Alberto

bengoa commented 4 years ago

Hi Werner.

I ran a test with the parameters set like this:

telegraf_agent_version: 1.10.0
telegraf_agent_package_state: present

When I run it against a server with version 1.15.3-1 installed, it tries to downgrade to 1.10.0:

TASK [external/telegraf : Debian | Install Telegraf package] ***************************
FAILED - RETRYING: Debian | Install Telegraf package (3 retries left).
FAILED - RETRYING: Debian | Install Telegraf package (2 retries left).
FAILED - RETRYING: Debian | Install Telegraf package (1 retries left).
fatal: [hostname]: FAILED! => {"attempts": 3, "cache_update_time": 1603122036, "cache_updated": false, "changed": false, "msg": "'/usr/bin/apt-get -y -o \"Dpkg::Options::=--force-confdef\" -o \"Dpkg::Options::=--force-confold\"      install 'telegraf=1.10.0-1'' failed: E: Version '1.10.0-1' for 'telegraf' was not found\n", "rc": 100, "stderr": "E: Version '1.10.0-1' for 'telegraf' was not found\n", "stderr_lines": ["E: Version '1.10.0-1' for 'telegraf' was not found"], "stdout": "Reading package lists...\nBuilding dependency tree...\nReading state information...\n", "stdout_lines": ["Reading package lists...", "Building dependency tree...", "Reading state information..."]}

I have a heterogeneous environment here, running CentOS 6 to 8 and Debian 9 and 10. Using '*' doesn't force a downgrade or upgrade to the version set in telegraf_agent_version, so I can keep a single vars file for all these operating system versions and get a consistent deploy.

Maybe telegraf_agent_package_state is being ignored?

Cheers, Alberto

dj-wasabi commented 4 years ago

Hi Alberto,

Ah yes, it seems that the version is part of the package name, so the package manager sees that exact version as not installed. I'm not really sure yet how this should be fixed.

Kind regards, Werner
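
The effect described above can be sketched as follows. This is only an illustration: `apt_package_spec` and `UNPINNED` are hypothetical names, the "-1" release suffix is inferred from the apt-get command in the log, and treating '*' or an empty value as "no pin" is an assumption about one possible approach, not the role's actual behavior.

```python
# Hypothetical helper, not the role's code. The "-1" release suffix is an
# assumption taken from the 'telegraf=1.10.0-1' spec seen in the apt-get log.
UNPINNED = {"", "*"}


def apt_package_spec(name, version, release="1"):
    """Build the package argument that would be handed to apt."""
    if version is None or version in UNPINNED:
        # No pin: a bare name is satisfied by whatever version is installed.
        return name
    # Pinned: apt is asked for exactly this version, whatever is installed.
    return f"{name}={version}-{release}"


print(apt_package_spec("telegraf", "1.10.0"))  # telegraf=1.10.0-1
print(apt_package_spec("telegraf", "*"))       # telegraf
```

With a pinned spec, a host that already runs 1.15.3-1 does not satisfy the request, which would explain why a package state of present still results in an attempted downgrade.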