ansible-collections / community.general

Ansible Community General Collection
https://galaxy.ansible.com/ui/repo/published/community/general/
GNU General Public License v3.0
823 stars 1.52k forks source link

capabilities module is not idempotent #4067

Open exploide opened 2 years ago

exploide commented 2 years ago

Summary

The capabilities module is not idempotent. Two consecutive runs of the same task always result in a reported change.

Issue Type

Bug Report

Component Name

capabilities

Ansible Version

$ ansible --version
ansible [core 2.12.1]
  config file = None
  configured module search path = ['/home/user/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/user/Desktop/venv/lib/python3.10/site-packages/ansible
  ansible collection location = /home/user/.ansible/collections:/usr/share/ansible/collections
  executable location = /home/user/Desktop/venv/bin/ansible
  python version = 3.10.1 (main, Jan 10 2022, 00:00:00) [GCC 11.2.1 20211203 (Red Hat 11.2.1-7)]
  jinja version = 3.0.3
  libyaml = True

Community.general Version

$ ansible-galaxy collection list community.general

# /home/user/Desktop/venv/lib/python3.10/site-packages/ansible_collections
Collection        Version
----------------- -------
community.general 4.3.0  

Configuration

$ ansible-config dump --only-changed

OS / Environment

Fedora 35

Steps to Reproduce

$ ansible -i localhost, -u root -m capabilities -a 'path=/tmp/foo capability=cap_sys_chroot+ep state=present' all
localhost | CHANGED => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": true,
    "msg": "capabilities changed",
    "state": "present",
    "stdout": "",
    "stdout_lines": []
}

$ ansible -i localhost, -u root -m capabilities -a 'path=/tmp/foo capability=cap_sys_chroot+ep state=present' all
localhost | CHANGED => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": true,
    "msg": "capabilities changed",
    "state": "present",
    "stdout": "",
    "stdout_lines": []
}

Expected Results

After the first run, the capability is already set, so the second run should not report changed.

Actual Results

All runs report a changed state.

Code of Conduct

ansibullbot commented 2 years ago

Files identified in the description:

If these files are incorrect, please update the component name section of the description or use the !component bot command.

click here for bot help

ansibullbot commented 2 years ago

cc @natefoo click here for bot help

felixfontein commented 2 years ago

On my system, passing capability=cap_sys_chroot=ep fixed the issue. Considering https://github.com/ansible-collections/community.general/blob/main/plugins/modules/system/capabilities.py#L36-L39 the behavior you are experiencing seems to be expected (if I run getcap /tmp/foo I see = instead of +).

exploide commented 2 years ago

This indeed solves the problem, but only for this particular example.

Concerning the sentence of the documentation you mentioned, I'm not sure if this is generally comprehensible. Maybe the examples section should use = instead of + to point people towards writing idempotent tasks.

But back to the actual issue. It does not fix the issue in the following case (maybe it is because of multiple capabilities being given?):

$ ansible -i localhost, -u root -m capabilities -a 'path=/tmp/foo capability=cap_chown,cap_fowner,cap_dac_override=ip state=present' all
localhost | CHANGED => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": true,
    "msg": "capabilities changed",
    "state": "present",
    "stdout": "",
    "stdout_lines": []
}

$ ansible -i localhost, -u root -m capabilities -a 'path=/tmp/foo capability=cap_chown,cap_fowner,cap_dac_override=ip state=present' all
localhost | CHANGED => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": true,
    "msg": "capabilities changed",
    "state": "present",
    "stdout": "",
    "stdout_lines": []
}
felixfontein commented 2 years ago

Concerning the sentence of the documentation you mentioned, I'm not sure if this is generally comprehensible. Maybe the examples section should use = instead of + to point people towards writing idempotent tasks.

It probably depends on the version of getcap which form results in idempotent behavior.

But back to the actual issue. It does not fix the issue in the following case (maybe it is because of multiple capabilities being given?):

What does getcap /tmp/foo return in this case?

exploide commented 2 years ago

What does getcap /tmp/foo return in this case?

The following:

$ getcap /tmp/foo

$ ansible -i localhost, -u root -m capabilities -a 'path=/tmp/foo capability=cap_chown,cap_fowner,cap_dac_override=ip state=present' all
localhost | CHANGED => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": true,
    "msg": "capabilities changed",
    "state": "present",
    "stdout": "",
    "stdout_lines": []
}

$ getcap /tmp/foo
/tmp/foo cap_chown,cap_dac_override,cap_fowner=ip

$ ansible -i localhost, -u root -m capabilities -a 'path=/tmp/foo capability=cap_chown,cap_fowner,cap_dac_override=ip state=present' all
localhost | CHANGED => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": true,
    "msg": "capabilities changed",
    "state": "present",
    "stdout": "",
    "stdout_lines": []
}

$ getcap /tmp/foo
/tmp/foo cap_chown,cap_dac_override,cap_fowner=ip
felixfontein commented 2 years ago

Basically that shows that you have to pass cap_chown,cap_dac_override,cap_fowner=ip to the module for it to be idempotent. I guess it orders the capabilities alphabetically. That's pretty much what https://github.com/ansible-collections/community.general/blob/main/plugins/modules/system/capabilities.py#L38-L39 says.

I guess it would be great if the module could do some comparison / normalization to avoid such issues, but I know way too little to say whether this is feasible or not.

exploide commented 2 years ago

Changing the ordering does also not fix the issue:

$ getcap /tmp/foo

$ ansible -i localhost, -u root -m capabilities -a 'path=/tmp/foo capability=cap_chown,cap_dac_override,cap_fowner=ip state=present' all
localhost | CHANGED => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": true,
    "msg": "capabilities changed",
    "state": "present",
    "stdout": "",
    "stdout_lines": []
}

$ getcap /tmp/foo
/tmp/foo cap_chown,cap_dac_override,cap_fowner=ip

$ ansible -i localhost, -u root -m capabilities -a 'path=/tmp/foo capability=cap_chown,cap_dac_override,cap_fowner=ip state=present' all
localhost | CHANGED => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": true,
    "msg": "capabilities changed",
    "state": "present",
    "stdout": "",
    "stdout_lines": []
}

$ getcap /tmp/foo
/tmp/foo cap_chown,cap_dac_override,cap_fowner=ip
felixfontein commented 2 years ago

Ok, I looked in more detail at the module code, it looks like the module does allow you to set a capability (as in: singluar), while you are asking it to set three capabilities at once. The idempotency check really assumes that you pass a single capability in.

So probably if you run the module three times with cap_chown=ip, cap_dac_override=ip and cap_fowner=ip, it should work fine.

exploide commented 2 years ago

Thanks for looking into it. You are right, running this module three times works and successive runs do not report a change afterwards.

However, I'm wondering if this is reasonable behavior. At least it is a bit cumbersome, given that setcap supports setting multiple capabilities at once. Maybe it can be improved in the future.

felixfontein commented 2 years ago

I think it definitely should be improved, the current behavior is pretty basic.

joelnb commented 2 years ago

I also have a problem with the idempotence of this module which i am not sure how to resolve so would be interested in any suggestions.

I maintain a role which needs to support running against multiple OS versions. Currently I have this task:

- name: Set capability on the binary file to be able to bind to TCP port <1024
  community.general.capabilities:
    path: "{{ caddy_bin }}"
    capability: cap_net_bind_service+eip
    state: present
  when: caddy_setcap

This currently reports as changed when running a second time on Ubuntu 22.04 (which has the version of getcap with different output). I have tried changing it to capability: cap_net_bind_service=eip which fixes the issue on Ubuntu 22.04 but then the same issue is present for older OS versions.

Example output from getcap on 22.04:

root@82e91ce85fa1:/# setcap cap_net_bind_service+eip /usr/bin/pinky
root@82e91ce85fa1:/# getcap /usr/bin/pinky
/usr/bin/pinky cap_net_bind_service=eip

Example output from older Ubuntu:

root@24d99943cecc:/# setcap cap_net_bind_service+eip /usr/bin/pinky
root@24d99943cecc:/# getcap /usr/bin/pinky
/usr/bin/pinky = cap_net_bind_service+eip

Also true when providing the = form to setcap:

root@24d99943cecc:/# setcap cap_net_bind_service=eip /usr/bin/pinky
root@24d99943cecc:/# getcap /usr/bin/pinky
/usr/bin/pinky = cap_net_bind_service+eip

The issue is pretty clear from the code - it is building a tuple from the getcap output & comparing it to what I specified. Because the built tuple differs based on the getcap output on each platform there is nothing I can specify which will always match.

I am happy to have a go at making the required changes to the module but would be interested in people's views on what those might be.

ansibullbot commented 1 year ago

Files identified in the description:

If these files are incorrect, please update the component name section of the description or use the !component bot command.

click here for bot help

natefoo commented 1 year ago

I keep forgetting to find the time to come back to this. I did do a bit of an overhaul a while back that I never finished:

https://github.com/natefoo/ansible/tree/capabilities-module-rework

The gist of it is that a "changed" state is determined if the getcap output from after setcap is run differs from the getcap output from before.

gilou commented 10 months ago

Hi, I stumbled upon this once again in 2024… found https://github.com/ansible/ansible/issues/18911 again, then this. And found that I had workaround for older use of that module… which is still not idempotent and registers a change on core 2.16.2, collection 8.1.0.

sverrehu commented 1 week ago

Just for the record, here's the output of setcap cap_net_raw+ep ./foo; getcap ./foo for various distros/versions. (The result is exactly the same for setcap cap_net_raw=ep, i.e. using an = instead of a + when setting.)

=== ubuntu:20.04 ===
./foo = cap_net_raw+ep
=== ubuntu:22.04 ===
./foo cap_net_raw=ep
=== ubuntu:24.04 ===
./foo cap_net_raw=ep

=== debian:sid ===
./foo cap_net_raw=ep
=== debian:bullseye ===
./foo cap_net_raw=ep
=== debian:bookworm ===
./foo cap_net_raw=ep

=== alpine:3.19 ===
./foo cap_net_raw=ep
=== alpine:3.20 ===
./foo cap_net_raw=ep

=== almalinux:8 ===
./foo cap_net_raw=ep
=== almalinux:9 ===
./foo cap_net_raw=ep

=== centos:7 ===
./foo = cap_net_raw+ep
=== centos:8 ===
./foo = cap_net_raw+ep

=== fedora:32 ===
./foo = cap_net_raw+ep
=== fedora:33 ===
./foo cap_net_raw=ep
=== fedora:40 ===
./foo cap_net_raw=ep

It appears that versions of getcap that inject an = after the file name (identified as "older version of libcap" in the code), uses + instead of = for the flags.

sverrehu commented 3 days ago

Maybe it would be easier to implement the change prediction by using the -v (verify) flag to setcap.

$ sudo setcap = ./foo

# before setting caps
$ setcap -v "cap_net_raw=epi cap_net_bind_service=ep" ./foo
./foo differs in [pie]

# set caps
$ sudo setcap "cap_net_raw=epi cap_net_bind_service=ep" ./foo

# after setting caps
$ setcap -v "cap_net_raw=epi cap_net_bind_service=ep" ./foo
./foo: OK
sverrehu commented 2 days ago

I don't think this module is easily fixable as long as it has support for state: absent. Getting the absent/undo logic right for combinations of +, - (in particular) and = is non-trivial, and I suspect there is no consensus for how it should be done.