roles: don't fail on SELinux failures; record them to files

projectatomic / atomic-host-tests

A collection of single-host tests for Atomic Host

GNU General Public License v3.0


roles: don't fail on SELinux failures; record them to files #384

Closed by miabbott 6 years ago

miabbott commented 6 years ago

A common problem when running the 'sanity' test is that it can fail very early because a stream has an SELinux denial immediately upon booting the host. While it is helpful to detect this kind of denial, failing the whole test suite near the beginning could let other problems slip by, since the later tests never run.

This change attempts to avoid the 'early failure' scenario by capturing the various SELinux failures and dumping them to files on the host where the playbook is running. I'm open to changing how the files are named or other details.
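To illustrate the "record, don't fail" idea described above, here is a minimal Python sketch. It is not the code from this PR (the roles are Ansible tasks); the function name `record_denials`, the report path, and the sample journal line are all hypothetical. The only assumption about SELinux itself is that denial messages in the journal contain the text `avc:  denied`.

```python
import re
from pathlib import Path

# SELinux denials show up in audit/journal output as "avc:  denied ...".
AVC_DENIED = re.compile(r"avc:\s+denied")


def record_denials(journal_text: str, report: Path) -> int:
    """Write any AVC denial lines to `report` and return how many were found.

    The function never raises on a denial; the caller decides what to do
    with the count, so the remaining tests can keep running.
    """
    denials = [line for line in journal_text.splitlines() if AVC_DENIED.search(line)]
    if denials:
        # Only create the report when there is something to record.
        report.write_text("\n".join(denials) + "\n")
    return len(denials)
```

Because the report file is only created when denials exist, its presence alone can later serve as the signal that something went wrong.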

I've made a few assumptions about how this works because we are basically the only ones using these roles/tests. To everyone else, YMMV.

miabbott commented 6 years ago

@jlebon Here's the first shot at not failing when we hit SELinux problems.

miabbott commented 6 years ago

That looks sane to me. I assume you've tried it out by e.g. having a mislabeled file? Also, we could mark the playbook as failed still if we just make it the last task, right? (By just checking if the files exist). That way, if we do fail there, we know everything else otherwise was OK.
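The end-of-run check suggested here could look roughly like the following Python sketch. It is only an illustration of "fail last, by checking whether the files exist"; the helper names, the `*.log` naming pattern, and the report directory are assumptions, not anything defined by these roles.

```python
from pathlib import Path
from typing import List


def failed_reports(report_dir: Path, pattern: str = "*.log") -> List[Path]:
    """Return the non-empty report files found in `report_dir`."""
    return sorted(
        p for p in report_dir.glob(pattern) if p.is_file() and p.stat().st_size > 0
    )


def final_check(report_dir: Path) -> int:
    """Return 1 if any SELinux report was recorded during the run, else 0."""
    reports = failed_reports(report_dir)
    for report in reports:
        print(f"SELinux problems recorded in {report}")
    return 1 if reports else 0
```

Run as the very last step, a non-zero result would mean every other test passed and only the recorded SELinux problems remain.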

Yeah, I've tested a mislabeled file and SELinux denials in the journal.

As for marking failure at the end, I don't want to go too far down that path, as @mike-nguyen has been working on a big revamp of how the playbooks will execute which should solve the overall problem of making any failure fatal. To that point, this work might not even survive his revamp, depending on his implementation.

Thanks for the review!

mike-nguyen commented 6 years ago

As for marking failure at the end, I don't want to go too far down that path, as @mike-nguyen has been working on a big revamp of how the playbooks will execute which should solve the overall problem of making any failure fatal. To that point, this work might not even survive his revamp, depending on his implementation.

Hmm, I might have to put more thought into how we will handle this one case. In my implementation, the improved sanity test still fails early because it doesn't make sense to continue on to test the upgrade and rollback. The new implementation is more flexible, though, so I don't think there will be a technical reason why we couldn't do it.

mike-nguyen commented 6 years ago

LGTM. I like the idea of checking for failures at the end, but I think the revamp will open up more options for looking at failures. I will merge for now if there are no objections, and we can continue the discussion when the revamp PR comes out.

mike-nguyen commented 6 years ago

Merging!