dmsimard / ara-archive

This repository is a historical archive of https://github.com/dmsimard/ara; please use https://github.com/openstack/ara instead.
https://github.com/openstack/ara
GNU General Public License v3.0

Change request: make ara fail-safe for playbooks #124

Open dbazhal opened 6 years ago

dbazhal commented 6 years ago

Hi there. Here are my thoughts. There should be a config option to let ara fail silently: if it fails to do or log something, it should not break the playbook it is used in. The option would be useful in production-critical playbooks, where ara problems (like insufficient disk space for the ara db) should not stop my playbook's main functionality, which is configuring production systems.
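To make the request concrete, here is a minimal sketch of what "fail silently" could look like. It is only an illustration: the `fail_safe` decorator, its `enabled` flag, the logger name and the `RecordingCallback` class are all hypothetical and are not part of ARA or Ansible.

```python
import functools
import logging

# Hypothetical logger name, used only for this sketch.
log = logging.getLogger("ara.failsafe")


def fail_safe(enabled=True):
    """Decorator factory: when enabled, log exceptions instead of raising them."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception:
                if not enabled:
                    # Fail-safe mode is off: keep the normal behaviour.
                    raise
                # Fail-safe mode is on: record the traceback and carry on.
                log.exception("%s failed; continuing", func.__name__)
                return None
        return wrapper
    return decorator


class RecordingCallback:
    """Stand-in for a callback plugin; not ARA's real class."""

    @fail_safe(enabled=True)
    def on_task_ok(self, result):
        # Simulated database failure, e.g. a full disk.
        raise OSError("no space left on device")
```

With `enabled=True` the call returns `None` and the traceback goes to the log; with `enabled=False` the exception propagates as it does today.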

sca- commented 6 years ago

+1 must-have

dmsimard commented 6 years ago

Hi, thanks for the feedback.

I feel like this would be a shared responsibility between Ansible itself and ARA. Ansible shouldn't let a misbehaving callback interrupt a running playbook.

As far as I am aware, this is currently the case -- throughout the development of ARA, there have been multiple times where ARA was erroring out with a stack trace and it didn't prevent Ansible from running. You can see an example of this here: http://logs.openstack.org/37/481837/13/check/gate-ara-integration-py35-latest-ubuntu-xenial-nv/5f02d97/console.html.gz#_2017-07-20_01_00_09_924097

In the case of timeouts or latency-related issues, it's a bit more nuanced. I think Ansible will not give up until the callback has returned something. That's worth exploring.

In the context of ARA, we're not enforcing any kind of timeout right now, so I guess we are essentially using the defaults provided by SQLAlchemy, whatever they are. Setting timeouts too low is something we have to be wary of, but we could surely do a better job at exception handling.
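As a rough sketch of what bounding the wait on a slow operation could look like, using only the Python standard library: the `call_with_timeout` helper and the `slow_write` function below are hypothetical, and this is not how ARA, SQLAlchemy or Ansible handle timeouts today.

```python
import concurrent.futures
import time


def call_with_timeout(func, timeout, *args, **kwargs):
    """Run func in a worker thread and wait at most `timeout` seconds.

    Returns (finished, result). Exceptions raised by func still propagate.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, *args, **kwargs)
    try:
        return True, future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        return False, None
    finally:
        # The worker thread cannot be killed, only abandoned, so do not
        # block here waiting for it to finish.
        executor.shutdown(wait=False)


def slow_write(record):
    """Simulates a database write that takes half a second."""
    time.sleep(0.5)
    return record
```

The caller stops waiting after the timeout, but the abandoned thread keeps running until the operation returns, which is one reason a timeout that is too low can do more harm than good.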

dmsimard commented 6 years ago

I've created an issue for this upstream: https://github.com/ansible/ansible/issues/27705