evrardjp / ansible-keepalived

Keepalived role for ansible deployment
Apache License 2.0

use reload-or-restart for service reload #198

Closed zerwes closed 2 years ago

zerwes commented 2 years ago

Use reload-or-restart for the service reload, as this will start the service if it is not running. This avoids errors like the following when the service was stopped before the playbook run and a config change triggers a reload:

RUNNING HANDLER [keepalived : reload keepalived] ******************************************************************************************************************************************************************
fatal: [...]: FAILED! => {"changed": true, "cmd": ["systemctl", "reload", "keepalived"], "delta": "0:00:00.006364", "end": "2022-05-02 20:45:26.891074", "msg": "non-zero return code", "rc": 1, "start": "2022-05-02 20:45:26.884710", "stderr": "keepalived.service is not active, cannot reload.", "stderr_lines": ["keepalived.service is not active, cannot reload."], "stdout": "", "stdout_lines": []}
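A minimal sketch of the proposed handler change (the task name and module choice are illustrative, not taken from the role): systemctl reload-or-restart reloads a running service and starts it if it is stopped, so the handler no longer fails on an inactive service.

- name: reload keepalived
  ansible.builtin.command: systemctl reload-or-restart keepalived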
evrardjp commented 2 years ago

Oh that's a great addition!

evrardjp commented 2 years ago

I will fix the broken tests. Edit: Oh wait, you just did! You're amazing!

zerwes commented 2 years ago

There is one caveat with the flush handler task in the changes ... If you include your role conditionally, like with

- name: run tasks
  hosts: allmyhosts
  any_errors_fatal: true
  roles:
    - role: xyz
    - role: keepalived
      when: "'keepalived' in group_names"

ansible issues a warning

[WARNING]: flush_handlers task does not support when conditional

due to ansible issue #41313. It is sad that that issue was closed despite the fact that nearly everyone considered it a bug ...
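For context, a minimal sketch of the kind of task inside the role that triggers this warning (assumed, not quoted from the role): when the role is included with a when: condition as above, ansible cannot apply that condition to a flush_handlers meta task and emits the warning instead.

- name: flush handlers so keepalived is reloaded before later tasks
  ansible.builtin.meta: flush_handlers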

evrardjp commented 2 years ago

Ha! Maybe we have to rethink this... I know that there are people using this role conditionally.

Maybe it's time to use the service/systemd module now?

zerwes commented 2 years ago

I use roles conditionally too, which is why things like simply closing https://github.com/ansible/ansible/issues/41313 make me kind of angry ... In my understanding, a configuration management and provisioning tool should be able to leave a host in the desired and described state, and a role should be completely conditional ... but that is another story ... maybe I am too puppetized ...

So commit e14b518c925d63772753e5ff3d356413244d0f4b is IMHO safe, as it just switches from systemctl restart to systemctl reload-or-restart in the handler ...

Commit c29d67af08664661a49c194b15fade0ea635410a covers the case where the service is not running and the playbook makes no changes, so the handler is never triggered. As the playbook is about having a running keepalived, it ensures a started service as the last task.
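A sketch of what such a final task could look like (a minimal assumption, not the actual commit contents): it starts a previously stopped service even when no config change notified the reload handler, and is a no-op if the service is already running.

- name: ensure keepalived is started
  ansible.builtin.service:
    name: keepalived
    state: started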

Without the flush_handlers call, the worst case is that the service might be restarted twice.

Maybe it's time to use the service/systemd module now?

AFAICS there will be no difference, as only the started/stopped states are idempotent ... except that the playbook will look more ansiblish ...
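For illustration, a hedged sketch of what the handler would look like with the systemd module (task name illustrative): note that state: reloaded fails on a stopped service just like systemctl reload does, and the module has no direct equivalent of reload-or-restart among its documented states, which is why switching modules alone would not solve the original problem.

- name: reload keepalived
  ansible.builtin.systemd:
    name: keepalived
    state: reloaded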

So I can:

evrardjp commented 2 years ago

I will take this over myself and use this time to refactor the role a bit.

zerwes commented 2 years ago

As I found the problem and question quite general and interesting, I started a short example on it: https://github.com/zerwes/ansible-ensure-service-running (My intention was to implement some checks and sample runs using GitHub Actions too, but this is not yet done due to lack of time ...) My conclusion is that registering the desired state of the service and applying it at the end is in fact the best, but at the same time the most complicated, way ... Sorry, I did not find the time to implement this in a new PR ...
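A rough sketch of that "register the desired state and apply it" approach (all names are illustrative assumptions, not taken from the linked repository): record whether the service was active before any changes, then enforce the desired end state in a final task instead of relying only on handlers.

- name: gather current service states
  ansible.builtin.service_facts:

- name: remember whether keepalived was active before any changes
  ansible.builtin.set_fact:
    keepalived_was_active: "{{ ansible_facts.services['keepalived.service'].state | default('unknown') == 'running' }}"

# ... configuration tasks notifying the reload handler ...

- name: ensure keepalived ends up in the desired state
  ansible.builtin.service:
    name: keepalived
    state: started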

evrardjp commented 2 years ago

@zerwes no worries, I didn't ask you to do that much; you already did quite a lot. Great work!

Your findings closely match my experience.

At some point, I thought of just abandoning the handlers due to their lack of flexibility. Sadly, handlers had a great use for some customers, where a play would trigger the role multiple times, targeting different configuration elements (different sub-configs), and triggering a single restart at the end. But as this is sub-optimal anyway, maybe there is something to do there.

I will need to evaluate this and come up with creative but simple ideas.

zerwes commented 2 years ago

should be covered by #212