threatstack / threatstack-ansible

Ansible for installing Threatstack Agent
https://www.threatstack.com
MIT License

Using --check on first-run fails hard #69

Open patrickjahns opened 4 years ago

patrickjahns commented 4 years ago

Description

When using the role and running it with --check on the first run, it fails with:

TASK [threatstack.threatstack-ansible : Ensure ThreatStack is installed] *******
fatal: [redacted]: FAILED! => {"cache_update_time": 1588692391, "cache_updated": false, "changed": false, "msg": "'/usr/bin/apt-get -y -o \"Dpkg::Options::=--force-confdef\" -o \"Dpkg::Options::=--force-confold\"     --simulate install 'threatstack-agent=2*'' failed: E: Version '2*' for 'threatstack-agent' was not found\n", "rc": 100, "stderr": "E: Version '2*' for 'threatstack-agent' was not found\n", "stderr_lines": ["E: Version '2*' for 'threatstack-agent' was not found"], "stdout": "Reading package lists...\nBuilding dependency tree...\nReading state information...\n", "stdout_lines": ["Reading package lists...", "Building dependency tree...", "Reading state information..."]}
fatal: [redacted]: FAILED! => {"cache_update_time": 1588692392, "cache_updated": false, "changed": false, "msg": "'/usr/bin/apt-get -y -o \"Dpkg::Options::=--force-confdef\" -o \"Dpkg::Options::=--force-confold\"     --simulate install 'threatstack-agent=2*'' failed: E: Version '2*' for 'threatstack-agent' was not found\n", "rc": 100, "stderr": "E: Version '2*' for 'threatstack-agent' was not found\n", "stderr_lines": ["E: Version '2*' for 'threatstack-agent' was not found"], "stdout": "Reading package lists...\nBuilding dependency tree...\nReading state information...\n", "stdout_lines": ["Reading package lists...", "Building dependency tree...", "Reading state information..."]}

Expectation

I want to be able to run the role in --check mode first to see the changes before I apply them. I expect no failure to occur.

olhado commented 3 years ago

Hello @patrickjahns,

Sorry for the delayed response. I've tried running this locally, and I don't get this error.

TASK [threatstack-ansible : Ensure ThreatStack is installed] *********************************************************************************************************************************************
task path: /etc/ansible/roles/threatstack-ansible/tasks/apt_install.yml:30
ok: [localhost] => {"cache_update_time": 1605625886, "cache_updated": false, "changed": false}

Which Linux distribution and which version of this role are you running? Are you still seeing this error?

olhado commented 3 years ago

I take that back. I reworked my test environment, and can replicate an error on the same task. I don't get the full output though. Will look into this more.

olhado commented 3 years ago

So the error I encountered is a little different, @patrickjahns. In a clean environment, if I run just the role with --check, I get an error that it can't find the package itself, not the version. I ran with -vvv. I ran on Ubuntu 20.04, agent 2.3.0, and ansible 2.10.3.

TASK [threatstack-ansible : Ensure ThreatStack is installed] *********************************************************************************************************************************************
task path: /etc/ansible/roles/threatstack-ansible/tasks/apt_install.yml:30

[...]

fatal: [localhost]: FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "allow_unauthenticated": false,
            "autoclean": false,
            "autoremove": false,
            "cache_valid_time": 0,
            "deb": null,
            "default_release": null,
            "dpkg_options": "force-confdef,force-confold",
            "force": false,
            "force_apt_get": false,
            "install_recommends": null,
            "name": "threatstack-agent=2*",
            "only_upgrade": false,
            "package": [
                "threatstack-agent=2*"
            ],
            "policy_rc_d": null,
            "purge": false,
            "state": "present",
            "update_cache": null,
            "update_cache_retries": 5,
            "update_cache_retry_max_delay": 12,
            "upgrade": null
        }
    },
    "msg": "No package matching 'threatstack-agent' is available"
}

It appears that when running in check mode, because Ansible doesn't execute anything, a task fails if it depends on an artifact that an earlier task would have produced (like the apt cache being updated to include the threatstack-agent package). If I update the cache outside the role, check mode succeeds. This seems to be a limitation of Ansible itself.
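
A minimal sketch of the dependency described above, assuming the role first registers an apt repository and then installs from it. The task names and the repository URL are placeholders for illustration, not copied from the role:

```yaml
# Hypothetical two-task sequence; the repo URL is a placeholder.

# Under --check this task only reports "changed": the source list is not
# written and the apt cache is not refreshed.
- name: Add the agent apt repository
  ansible.builtin.apt_repository:
    repo: "deb https://example.invalid/apt focal main"
    state: present
    update_cache: true

# Under --check on a fresh host, apt has no metadata for the package yet,
# so this lookup fails even though a real run would succeed.
- name: Ensure the agent is installed
  ansible.builtin.apt:
    name: "threatstack-agent=2*"
    state: present
```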

So I'll need to know a bit more about your environment, @patrickjahns, before I can go much further, since I wasn't able to exactly replicate your issue.

patrickjahns commented 3 years ago

Thank you for looking into the issue. I have moved on from the project where the error occurred.

It was an Ubuntu 18.04 system. To trigger the error, the system needs to be "fresh", i.e. never having had the Threat Stack agent installed. The issue mentioned above appears because the apt repository is not added during a check run, but a later task tries to fetch the version from it.

There are two ways forward:

olhado commented 3 years ago

Being somewhat new to Ansible, it seemed to me that --check is supposed to keep things immutable. So the first seems not a good practice. Not sure what you mean about the second step, or how that is different from the first.

I could mark certain tasks to ignore errors in checkmode, but that still outputs errors, just doesn't fail, so I don't see how that would help your use case.

olhado commented 3 years ago

Going to close this for now. Please feel free to re-open.

patrickjahns commented 3 years ago

Being somewhat new to Ansible, it seemed to me that --check is supposed to keep things immutable.

Immutability is IMHO a different concept: it means that you do not mutate the state of your infrastructure but replace the infrastructure instead. In terms of virtual machines, this means that you build an image (e.g. Amazon AMI, cloud image, ISO) and run it; whenever you need to make a change, you build a new image and spin up the virtual machine with that image. Container-based concepts work similarly.

As written upstream:

In check mode, Ansible runs without making any changes on remote systems.

https://docs.ansible.com/ansible/latest/user_guide/playbooks_checkmode.html

I've always seen --check as a form of dry run: you can run a playbook (including this role) at any given point in time with this option set, and the playbook/role should finish without failing on any errors and report any possible changes.

Why is this important? Even when you provision your infrastructure for the first time, you want to see a list of changes before you apply them. Furthermore, at any given point in time, you can run the playbook in check mode to see if there is state drift in your infrastructure (e.g. someone manually changed a file).

olhado commented 3 years ago

I see.

I'm not sure I see a way to have --check work then, since the attempt to install needs the info. I could play with this:

ignore_errors: "{{ ansible_check_mode }}"

-- OR --

check_mode: false

...but it seems like I'd have to do that to a lot of tasks, rendering --check confusing (actually effect changes in what intuitively feels like a dry-run mode) or useless (swallow errors).
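
A rough sketch of how the two options might look on concrete tasks, assuming the second option refers to Ansible's `check_mode` task keyword (where `check_mode: false` makes a task run normally even under --check). The task names and the repository URL are placeholders, not taken from the role:

```yaml
# Option A: under --check, report the failure but keep the play going.
# ansible_check_mode is true only when the play runs with --check.
- name: Ensure the agent is installed
  ansible.builtin.apt:
    name: "threatstack-agent=2*"
    state: present
  ignore_errors: "{{ ansible_check_mode }}"

# Option B: force a prerequisite to really run, even under --check.
# This changes the host during what is nominally a dry run.
- name: Add the agent apt repository
  ansible.builtin.apt_repository:
    repo: "deb https://example.invalid/apt focal main"
    state: present
  check_mode: false
```

Option A still prints the error in the output, and Option B makes real changes during a check run, which are the two trade-offs raised above.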

Combine the check and install step so the refresh is implicit

I'm still unclear on what this means in this context.