linux-system-roles / firewall

Configure firewalld and system-config-firewall
https://linux-system-roles.github.io/firewall/
GNU General Public License v2.0

Provide example or method to show differences between defined vs in-use configuration #140

Open myllynen opened 1 year ago

myllynen commented 1 year ago

Since replacing previous rules causes a firewalld restart, it is not suitable for most production environments. When previous rules are not replaced, the defined configuration may not match what is in use, due to local changes or previous playbook runs with a different configuration. An organization might be adhering to security standards that require listing or verifying that only certain ports and services are open in the firewall.

There should be a way to display the differences between the currently defined configuration and the currently in-use configuration. Preferably this would be a configurable step during firewall configuration that shows the differences as part of a playbook run, after the firewall has been configured. This should also be doable in check mode. Alternatively, a separate playbook or at least a documented example should be available so that organizations can avoid reinventing the wheel here.

Thanks.

richm commented 1 year ago

That's what I thought https://github.com/linux-system-roles/firewall#available-ansible-facts was supposed to be used for. Is that not suitable for your purposes?

myllynen commented 1 year ago

Those facts are definitely helpful, however here I'd consider them more as a necessary building block than a complete solution.

An example could be a case where a large number of systems have two zones configured and a few rich rules in use. Having readily available Ansible content that prints the differences, in a readable manner, for those servers that have any would be very helpful. For instance, if someone opened the HTTP service in addition to the currently configured and expected HTTPS service in the public zone and created an extra rich rule in another zone, how can that information be provided "easily" and in a compact manner? Thanks.

richm commented 1 year ago

It seems to me that this is what check mode is intended for. Does the firewall role check mode not work for this situation? Or is it that the output of check mode is not easily consumable, and that you would like the data formatted in a more "user friendly" format?

myllynen commented 1 year ago

Sorry for the delay with this one, I somehow missed your questions earlier.

Wrt check mode, I noticed that it doesn't work when replacing previous rules, and I filed #151 about that.

The check mode works very well when adding something. For instance, consider this simple example:

    firewall:
      - zone: allow-ssh
        permanent: true
        state: present
      - zone: allow-ssh
        service:
          - ssh
        source:
          - 192.168.122.1
        target: DROP
        permanent: true
        state: enabled
      - set_default_zone: drop

If I run a playbook with this configuration, and then add 192.168.1.1 as another source and run the playbook again in check mode, then it works as expected, showing that the zone configuration would change.

However, if I actually run the playbook with both sources (192.168.1.1 and 192.168.122.1) enabled and afterwards remove 192.168.1.1 from the variable, the next playbook run will report everything as ok. At that point I have no way of knowing whether any sources other than the currently defined 192.168.122.1 are still allowed.
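
To make the gap concrete, here is a minimal Python sketch of the behaviour described above. It is a model only, not the role's code: `apply_additive` is a hypothetical helper that assumes a run without `previous: replaced` adds whatever is missing and never removes anything.

```python
def apply_additive(in_use: set[str], defined: set[str]) -> tuple[set[str], bool]:
    """Model an additive run: add what is missing, never remove anything."""
    missing = defined - in_use
    return in_use | missing, bool(missing)


# First run with both sources defined in the variables.
state, changed = apply_additive(set(), {"192.168.122.1", "192.168.1.1"})
print(changed)  # True: both sources were added

# 192.168.1.1 is removed from the variables, then the playbook runs again.
defined = {"192.168.122.1"}
state, changed = apply_additive(state, defined)
print(changed)  # False: the run reports ok

# The stale source is still in use, and nothing in the run output says so.
print(sorted(state - defined))  # ['192.168.1.1']
```

The last line is the information that is currently missing from a playbook run: what is in use but no longer defined.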

Hopefully this makes it clearer what I was after initially. Thanks.

BrennanPaciorek commented 1 year ago

To confirm, you would like a method to get the firewall's runtime configuration (in-use) vs its permanent (defined) as a way to verify that the in-use configuration is the expected configuration, as well as see these differences in check mode?

For instance, if you used the following instructions:

firewall:
  - zone: public
    source: 192.168.122.1
    target: DROP
    state: enabled

You would want to see an output similar to below, showing the other dropped sources (or even a full summary of the modified zone's active settings):

"public": {
target: DROP
sources: [192.168.122.1,  192.168.122.2, 192.168.122.3, ...]
}

Is this a correct understanding of the issue?

juliaschindler commented 1 year ago

The basis of this issue is the intention to have the full firewall configuration represented by Ansible (inventory) variables. In cases where that configuration cannot be enforced at any time with "previous: replaced", as in production environments, it would be good to have a mechanism that compares the configuration represented by the variables with the actual configuration on managed hosts and reports the differences. When that report shows differences, a maintenance window for adjusting the firewall configuration could be scheduled.

The "currently defined configuration" thus refers to the configuration as defined by the ansible variables. The "currently in-use configuration" refers to the actual (runtime + permanent) configuration on the managed hosts.

For example, let's say this playbook contains the full desired configuration for the firewall:

---
- hosts: all
  become: true
  gather_facts: false
  vars:
    firewall:
      - previous: replaced
      - set_default_zone: public
      - service: netbackup
        port: [13724/tcp, 1556/tcp, 13782/tcp]
        short: NetBackup
        description: 'NetBackup backs up and restores files, directories and raw partitions on a server. A server protected by NetBackup is known as a NetBackup client. Enable this option if you use configure netbackup client in the server'
        state: present
        permanent: true
      - service:
        - netbackup
        state: enabled
        permanent: true

  roles:
    - firewall

Running the playbook, the resulting firewall configuration will be

[root@rhel92 ~]# firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: enp1s0
  sources: 
  services: cockpit dhcpv6-client netbackup ssh
  ports: 
  protocols: 
  forward: yes
  masquerade: no
  forward-ports: 
  source-ports: 
  icmp-blocks: 
  rich rules:

Let's say somebody manually changes the configuration of a managed host:

[root@rhel92 ~]# firewall-cmd --add-service=https --permanent
success
[root@rhel92 ~]# firewall-cmd --reload
success
[root@rhel92 ~]# firewall-cmd --add-service=http
success
[root@rhel92 ~]# firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: enp1s0
  sources: 
  services: cockpit dhcpv6-client http https netbackup ssh
  ports: 
  protocols: 
  forward: yes
  masquerade: no
  forward-ports: 
  source-ports: 
  icmp-blocks: 
  rich rules:
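
As a rough illustration of how such drift could be detected outside the role, the following Python sketch parses `firewall-cmd --list-all` text and compares the services with an expected set. `parse_list_all` is a hypothetical helper, the sample text is abridged from the listing above, and the expected set is an assumption taken from the first listing (the state right after the playbook run).

```python
def parse_list_all(text: str) -> dict[str, list[str]]:
    """Parse `firewall-cmd --list-all` text into {setting: [values]}."""
    settings = {}
    for line in text.splitlines():
        # Setting lines are indented "key: values"; the zone header is not.
        if line.startswith(" ") and ":" in line:
            key, _, value = line.strip().partition(":")
            settings[key] = value.split()
    return settings


# Abridged copy of the second listing above.
LIST_ALL = """\
public (active)
  target: default
  interfaces: enp1s0
  services: cockpit dhcpv6-client http https netbackup ssh
  ports:
  rich rules:
"""

# Assumption: the services from the first listing are the expected state.
expected = {"cockpit", "dhcpv6-client", "netbackup", "ssh"}

in_use = set(parse_list_all(LIST_ALL)["services"])
print(sorted(in_use - expected))  # ['http', 'https']
```

This only handles simple whitespace-separated settings. Rich rules contain spaces and are listed on their own lines, so they would need separate handling.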

Running the playbook again using --check and --diff, there is no change reported,

[..]

TASK [firewall : Configure firewall] ********************************************************************************************
ok: [192.168.122.118] => (item={'set_default_zone': 'public'})
ok: [192.168.122.118] => (item={'service': 'netbackup', 'port': ['13724/tcp', '1556/tcp', '13782/tcp'], 'short': 'NetBackup', 'description': 'NetBackup backs up and restores files, directories and raw partitions on a server. A server protected by NetBackup is known as a NetBackup client. Enable this option if you use configure netbackup client in the server', 'state': 'present', 'permanent': True})
ok: [192.168.122.118] => (item={'service': ['netbackup'], 'state': 'enabled', 'permanent': True})

[...]

TASK [firewall : Calculate what has changed] ************************************************************************************
skipping: [192.168.122.118]

TASK [firewall : Show diffs] ****************************************************************************************************
skipping: [192.168.122.118]

PLAY RECAP **********************************************************************************************************************
192.168.122.118            : ok=9    changed=0    unreachable=0    failed=0    skipped=9    rescued=0    ignored=0

Running the playbook without --check, it reports

[..]

TASK [firewall : Configure firewall] ********************************************************************************************
ok: [192.168.122.118] => (item={'set_default_zone': 'public'})
ok: [192.168.122.118] => (item={'service': 'netbackup', 'port': ['13724/tcp', '1556/tcp', '13782/tcp'], 'short': 'NetBackup', 'description': 'NetBackup backs up and restores files, directories and raw partitions on a server. A server protected by NetBackup is known as a NetBackup client. Enable this option if you use configure netbackup client in the server', 'state': 'present', 'permanent': True})
ok: [192.168.122.118] => (item={'service': ['netbackup'], 'state': 'enabled', 'permanent': True})

[..]

TASK [firewall : Calculate what has changed] ************************************************************************************
changed: [192.168.122.118]

TASK [firewall : Show diffs] ****************************************************************************************************
skipping: [192.168.122.118]

PLAY RECAP **********************************************************************************************************************
192.168.122.118            : ok=11   changed=1    unreachable=0    failed=0    skipped=7    rescued=0    ignored=0

but there is no information given of what has changed. Also, running the same playbook without "previous: replaced" will not show any differences, because the configuration defined by the variables is part of the actual configuration.

A potential output for the proposed comparison using "previous: replaced" in check mode could be something like

ok: [192.168.122.118] => (item={'set_default_zone': 'public'})
ok: [192.168.122.118] => (item={'service': 'netbackup', 'port': ['13724/tcp', '1556/tcp', '13782/tcp'], 'short': 'NetBackup', 'description': 'NetBackup backs up and restores files, directories and raw partitions on a server. A server protected by NetBackup is known as a NetBackup client. Enable this option if you use configure netbackup client in the server', 'state': 'present', 'permanent': True})
ok: [192.168.122.118] => (item={'service': ['netbackup'], 'state': 'enabled', 'permanent': True})
changed: [192.168.122.118] => (item={'service': ['https'], 'state': 'absent', 'permanent': True})
changed: [192.168.122.118] => (item={'service': ['http'], 'state': 'absent', 'permanent': False})

or another mechanism to show that the https and http services are not part of the configuration as defined by the variables, but are present in the actual configuration on the host.
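
One way to derive such items, sketched in Python under stated assumptions: the permanent and runtime service sets are already known (for example from `firewall-cmd --list-services` with and without `--permanent`), and `defaults` stands for the services the zone has before any variables are applied. `report_extras` is a hypothetical helper, not something the role provides.

```python
def report_extras(defined: set[str], permanent: set[str], runtime: set[str]) -> list[dict]:
    """Return proposal-style items for services on the host but not in the variables."""
    items = []
    for service in sorted((permanent | runtime) - defined):
        items.append({
            "service": [service],
            "state": "absent",
            # True if the service is in the permanent configuration,
            # False if it exists only at runtime.
            "permanent": service in permanent,
        })
    return items


defined = {"netbackup"}
# Assumed baseline: services present in the zone after a reset.
defaults = {"cockpit", "dhcpv6-client", "ssh"}
permanent = defaults | {"netbackup", "https"}  # https was added with --permanent
runtime = permanent | {"http"}                 # http was added at runtime only

for item in report_extras(defined | defaults, permanent, runtime):
    print(f"changed: (item={item})")
```

With the example data this reports http with 'permanent': False and https with 'permanent': True, matching the two extra lines in the proposed output.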

richm commented 1 year ago

When using previous: replaced - should it show you all of the settings that were erased/reverted to the default value? If so, how?

In cases where the configuration represented by the variables cannot be enforced anytime by using "previous: replaced", as in production environments,

Why can you not use previous: replaced in production environments?

get a report of the differences

Can you provide an example of how you would like the differences to be reported? Because of the way that previous: replaced is implemented, it may be very difficult to report using the typical ansible

ok: [192.168.122.118] => (item={'set_default_zone': 'public'})
ok: [192.168.122.118] => (item={'service': 'netbackup', 'port': ['13724/tcp', '1556/tcp', '13782/tcp'], 'short': 'NetBackup', 'description': 'NetBackup backs up and restores files, directories and raw partitions on a server. A server protected by NetBackup is known as a NetBackup client. Enable this option if you use configure netbackup client in the server', 'state': 'present', 'permanent': True})
ok: [192.168.122.118] => (item={'service': ['netbackup'], 'state': 'enabled', 'permanent': True})
changed: [192.168.122.118] => (item={'service': ['https'], 'state': 'absent', 'permanent': True})
changed: [192.168.122.118] => (item={'service': ['http'], 'state': 'absent', 'permanent': False})

output format.

Also - when using previous: replaced - let's say you have something like this, and the value of some_setting is a_non_default_value:

firewall:
  - previous: replaced
  - some_setting: default_value

When you apply this, the first thing the role does is to process the previous: replaced by resetting all settings, which includes reverting some_setting to its default value of default_value. Then it processes the next item in the list, which is the some_setting: default_value - what should be reported in normal mode here? The way the role works now, it will report ok: changed: false for that item.

juliaschindler commented 1 year ago

When using previous: replaced - should it show you all of the settings that were erased/reverted to the default value? If so, how?

That would be ideal for understanding which settings have changed. The example I gave in my previous comment would be a nice option, as Ansible users are used to this way of reporting. However, since you say it may be very difficult to report using the typical Ansible output format, maybe a modified version of the facts described at https://github.com/linux-system-roles/firewall#available-ansible-facts could be output instead.

Why can you not use previous: replaced in production environments?

Firstly, as myllynen commented, replacing previous rules causes a firewalld restart, which is not suitable for most production environments: during the short restart period no new connections are allowed, which can cause issues for sensitive applications. Secondly, to be in full control of the firewall configuration it is vital to know about any pending changes in order to ensure proper functionality. For example, in the (hopefully rare) case that an administrator made a manual change to the configuration and did not update the Ansible variables for the firewall configuration, running the role with previous: replaced could cause issues due to the changed configuration. Running previous: replaced in check mode in production would be a potential option, but currently it would not show which configuration changes would be made.

Would it be possible to add the runtime configuration to the facts as given by https://github.com/linux-system-roles/firewall#available-ansible-facts ? Then the custom facts could be compared to the ansible variables to get a full picture of what would change.

When you apply this, the first thing the role does is to process the previous: replaced by resetting all settings, which includes reverting some_setting to its default value of default_value. Then it processes the next item in the list, which is the some_setting: default_value - what should be reported in normal mode here?

To understand what configuration the role replaces when using "previous: replaced", I would say it is more informative to compare the value before running the role, a_non_default_value, with the value defined in the Ansible variables than to compare the value defined in the Ansible variables with the default settings. In this case it would therefore report changed: true (and ideally what was replaced).
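
The two reporting choices can be contrasted in a small Python sketch. It is a model of the discussion only, not the role's implementation: `report` and its `baseline` argument are hypothetical names.

```python
def report(before: str, default: str, desired: str, baseline: str) -> str:
    """Return 'changed' or 'ok' for one setting under `previous: replaced`.

    baseline="after_reset": compare the desired value with the value left by the reset.
    baseline="before_run":  compare the desired value with the value found before the run.
    """
    reference = default if baseline == "after_reset" else before
    return "changed" if desired != reference else "ok"


before = "a_non_default_value"
default = "default_value"
desired = "default_value"

# Behaviour described earlier in the thread: the reset already restored the default.
print(report(before, default, desired, "after_reset"))  # ok

# Behaviour suggested here: the value on the host did change during the run.
print(report(before, default, desired, "before_run"))  # changed
```

Comparing against the pre-run value requires capturing the host state before the reset is processed.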

richm commented 1 year ago

@juliaschindler Thanks for the info.

Can you provide an example of how you would like the differences to be reported?

juliaschindler commented 10 months ago

Hello and sorry for the very late response!

Since, because of the way that previous: replaced is implemented, it may be very difficult to report using the typical Ansible output format, I thought of comparing the defined variables with the firewall configuration found on the host and printing the differences, if any. In the following example, the work zone was first configured by the role. Then the http service was manually added to the work zone permanently. Below is an example of how differences could be shown for services that are enabled in the work zone but not defined via Ansible variables. It could be expanded to 1. also show services that are defined via Ansible variables for a zone but missing from the actual configuration, and 2. cover all zones and other configuration items such as sources. Do you think that might be a viable approach to include in the role to show differences even with previous: replaced?

    - name: Gather firewall facts using the firewall role
      ansible.builtin.include_role:
        name: firewall
      vars:
        firewall_config:

    - name: Set the variable that defines what the firewall configuration should look like
      ansible.builtin.set_fact:
        firewall:
          - set_default_zone: work
            state: enabled
          - zone: work
            service:
              - ssh
              - dhcpv6-client
              - https
            state: enabled
            permanent: true
          - zone: work
            service:
              - cockpit
            state: disabled
            permanent: true
          - previous: replaced

    - name: Show differences in actual vs. defined firewall configuration
      ansible.builtin.debug:
        msg: "The following services are enabled for work zone, but not configured via variables: {{ __found_firewall_config | difference(__defined_firewall_config) }}"
      vars:
        __found_firewall_config: "{{ firewall_config['custom']['zones']['work']['services'] }}"
        __defined_firewall_config: "{{ firewall | select('contains', 'zone') | selectattr('zone', 'equalto', 'work') | selectattr('state', 'equalto', 'enabled') | map(attribute='service') | flatten }}"
      changed_when: true
      when:
        - __found_firewall_config | difference(__defined_firewall_config) | length > 0
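
The two expansions mentioned above (both directions, all zones) could look roughly like this in Python. It is a sketch of the comparison logic only: `diff_zones` is a hypothetical helper, and the `{zone: set_of_services}` shape is an assumption about how the variables and the firewall_config facts would be flattened first.

```python
def diff_zones(defined: dict[str, set[str]], found: dict[str, set[str]]) -> dict[str, dict]:
    """Compare {zone: services} mappings in both directions."""
    report = {}
    for zone in sorted(defined.keys() | found.keys()):
        want = defined.get(zone, set())
        have = found.get(zone, set())
        extra, missing = have - want, want - have
        # Only zones with at least one difference are reported.
        if extra or missing:
            report[zone] = {"extra": sorted(extra), "missing": sorted(missing)}
    return report


# Services defined via variables for the work zone in the example above.
defined = {"work": {"ssh", "dhcpv6-client", "https"}}
# Found configuration after http was added manually.
found = {"work": {"ssh", "dhcpv6-client", "https", "http"}}

print(diff_zones(defined, found))
# {'work': {'extra': ['http'], 'missing': []}}
```

The same function could be applied to sources or ports by building the two mappings from those settings instead.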

richm commented 10 months ago

Thanks. Not sure when we can get someone to work on this, but this is helpful.