MikeCaspar / playbook_test_framework

ansible test playbook framework to be used with _test/_maintain loop
MIT License

Some questions and thoughts about the project #1

Open relaxdiego opened 8 years ago

relaxdiego commented 8 years ago

Thanks again for presenting this at AnsibleFest16, Mike! I have a couple of questions.

The model I will be using for testing in the Ansible world is as follows. Note how it's a direct analogy of testing from the software development world. Based on what I heard from your talk at AnsibleFest, I believe we hold the same model.

https://dl.dropboxusercontent.com/u/1355795/ansible-testing.JPG

My concerns are as follows:

POSSIBLE DUPLICATED LOGIC? Initially, I liked the clean analogy from the software dev world to the Ansible world in the Developer column. However, after chewing on it a bit more, I started having doubts. Since Ansible is a declarative DSL, the primary responsibility for checking that a task has been successfully satisfied falls on the Ansible module being used for that task. Assuming that the module is working as expected and we're not abusing the command/shell module (which has very rudimentary success/fail checking), wouldn't these separate test playbooks be duplicating the effort?
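
To make the worry concrete, here's a hypothetical illustration (the paths are made up): a _maintain task and a _test check that restate the same internal detail.

# _maintain playbook: the file module already guarantees this state
- file:
    path: /etc/app/conf.d
    state: directory

# _test playbook: re-asserting the exact same internal detail
- stat:
    path: /etc/app/conf.d
  register: conf_dir

- debug:
    msg: "TEST_FAILED: expected directory /etc/app/conf.d"
  when: not (conf_dir.stat.isdir | default(false))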

So if we were to take the software dev to Ansible world analogy (Developer column only), it would be like the developer writing unit tests against the language's standard library, which is just duplicated effort.

WHAT IF WE ONLY WRITE "ACCEPTANCE PLAYBOOKS"? What if we adjust the tests' scope and treat the subject under test as a black box? What if, instead of checking for the existence of internal files and so on, we check only for things that a consumer of the subject's service would be able to confirm? In this case, the test playbooks avoid duplicating logic while still providing value to the project.
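
As a sketch of what I mean (the host group and port are made up), a consumer-facing check could use the stock wait_for module instead of poking at internal files:

- hosts: dbservers
  tasks:
    - name: Confirm the database answers on its published port
      wait_for:
        host: "{{ inventory_hostname }}"
        port: 5432       # the only thing a consumer of the service sees
        timeout: 10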

WHAT IF WE BORROW IDEAS FROM OTHER DECLARATIVE LANGUAGES? I couldn't find a lot of resources on this, admittedly. However, deriving analogies from CSS testing tools might be a good start. Here's one that talks about it: http://csste.st/guides/declarative-languages.html

relaxdiego commented 8 years ago

Something seems to be wrong with that dropbox link. If that's still the case, try: http://imgur.com/02QOGHv

MikeCaspar commented 8 years ago

Hi there.

I am glad you enjoyed the session. Thanks.

I appreciate your detailed and thoughtful questions and I'll do my best to respond.

I can see your "duplicated logic" point. I wish, however, that it were that simple.

As we know, the modules are idempotent. This means they will happily do the wrong thing over and over again (based on what was coded). More importantly, what we need to know is that if the developer does change the playbook, it still "leaves the system passing previous tests". This is where the true benefit is.

As you recognized, they may be better considered "Acceptance Playbooks". A Test First approach still works here. This is why the sample modules don't fail on the first test; they list all the tests that fail. This way, one or many tests (whatever they do) can be created ahead of time (or in parallel).

This is also why, in the loop, I have called them governance in the slides. You could theoretically call them _acceptance tests if that makes you more comfortable in your environment. For me, it's about the concept of separating confirmation from code, and also about someone KNOWING when they break something. This implies a certain level of duplication.
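
As a rough sketch of what I mean (testForUser is a hypothetical role, and the paths are placeholders), a _test playbook could chain several test roles, each reporting TEST_PASSED/TEST_FAILED via debug and carrying on:

- hosts: all
  roles:
    - { role: testForFolder, path: /opt/app,      expected: true }
    - { role: testForFolder, path: /opt/app/logs, expected: true }
    - { role: testForUser,   user: deploy,        expected: true }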

I too struggled a bit with not being able to use the existing modules with perhaps another attribute or flag of some sort. I am considering a PR to one of the modules to see if I can get a "read-only" or "test-only" or "expect" type parameter, but I think we're too early for that. Also, I wanted to make sure that anyone could use this approach without needing to learn Python, using just the simple DSL.
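
Purely hypothetical syntax, just to show the shape of that idea (no such flag exists in the stock file module today):

- file:
    path: /var/www/app
    state: directory
    expect: true    # hypothetical "verify only, never change" parameter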

The way the testModules work right now, anyone can modify or create one for their specific need. I envision that someday (if it makes sense and can be done), someone could simply change the work of that role to use an existing module for real (with an additional parameter of some sort). This would not have to break anything and would auto-inherit all changes made to modules. That would be a big relief in the future for sure.

I am a bit curious about your distinction between QA and developer. This is a little different from my ideal world. I see devs and QAs working together, with the goal being tests that can replace the need to manually check a system after the playbooks are written. Not a big deal, but I thought I'd mention it. I can see, however, that this would be a pretty common model for many people. I could ask you this: if this is your model today, would those same people not have separate tests to run, which would be a duplication? Sorry, I'm not 100% sure if you are concerned about duplication of effort within the team or duplication of effort to write these test roles. Hopefully I haven't gone totally out of line on you.

Let's assume I keep the diagram as-is. I would say that "verifies internal system state" would really be covered by the _test playbook (ideally created together between dev and QA, or in QA as you say). We know the playbooks will do exactly what is coded, and they will be idempotent. So, based on the diagram, I might (I say might) move those last tests in column A to both column A and column B, since if the tests are accurate they would already cover both columns. Does that make sense?

Either way, I DEFINITELY see roles under this model doing all kinds of things that go beyond what is in the playbooks themselves.

Examples of things that I have seen go wrong and ideas that follow your thinking.

With both of these last two examples, these _test playbooks could be created before, during, or after development. Ideally, however, before, and in a collaborative way somehow. The intent is about mindset.

No matter how detailed we get, we need to know that when we DO change a playbook, we haven't broken something else from before. Unfortunately, we could idempotently do something while breaking something else.

As we know, every environment and situation is different. I think what makes sense for you is the right thing to do. I struggled with deciding among TDD, BDD, ATDD, ... none seems to fit perfectly. The sample playbook in this base framework seems to imply a BDD-oriented approach more than TDD, but that's not totally appropriate either ;->

In my case, I have some public (to share) and some private testRoles. Some of them use command to do things that are simply not available in the standard modules. The goal is "testing": not reproducing the idempotency of existing modules, but testing based on our specific needs.

I am hoping (in agreement with the idea of not reproducing Ansible modules) that someday we might issue a PR which would allow "commands" to be replaced with smarter usage of Ansible's actual modules via a "test-only" type flag. That being said, there's no reason we'd need to change the overall mental design.

For now, I think I'm going to get some sample modules in, and I could use help with awesome ideas like yours. I even hope that some day, version 5 or something will have a Cucumber-type language format. This was as close as I could get for now.

One last comment: before the code makes it to the repo for a ROLE, there is --syntax-check.
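
For example, with the standard CLI (the playbook path here is just illustrative):

ansible-playbook tests/role.yml --syntax-check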

Based on the link you sent, I would suggest we're in between worlds.

The _test playbook is declarative in nature.

The role itself is closer to TDD (and might even be written by a dev for a QA person, though in your diagram it would be written by QA). This is the thing that would be in both columns.

If someone is doing an ATDD-only approach, or a governance approach, then the _test playbook would also serve that purpose, which was kind of the point: the same tests serving the developer, QA, and governance all-in-one.

Geez... I hope I did your questions justice. If not, feel free to reach out; we could perhaps do an in-person call, and then I could follow it up with a post for others to read about where we ended up.

If I have misunderstood in some way or missed an important point, again, please feel free to reach out.

MikeCaspar commented 8 years ago

@relaxdiego Should you wish to help with documentation (i.e., adding your diagram with some notes), let me know. Perhaps you could submit a PR outlining and describing what you learned here (or even just cut-paste what's here if you like). I think you had some great questions and observations. It would be great to talk about "context" in the docs somewhere.

MikeCaspar commented 8 years ago

@relaxdiego Hey, if you'd like to give it a try, perhaps you could write a role (just grab one that works and tests something). It could be something as simple as testForMachineIsReachable (name:) ... or whatever makes sense for you. It could even be a role to make sure certain versions of libraries exist on a server. This way, if a playbook accidentally changes something, everyone would know right away (including the developer). The idea is to know as quickly as possible that you broke something from the past.
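
A library-version check might look roughly like this (a sketch only; the command, version string, and pattern are placeholders):

- name: Read the installed OpenSSL version (a read-only probe)
  command: openssl version
  register: openssl_out
  changed_when: false    # a pure query; never reports a change

- debug:
    msg: "TEST_FAILED: expected OpenSSL 1.0.x, got {{ openssl_out.stdout }}"
  when: "'1.0.' not in openssl_out.stdout"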

MikeCaspar commented 8 years ago

Hi, not sure if I answered everything. In some cases, a developer might write "verifies system internal state" using a role; that same role would later be run to ensure they don't mess something up, and also to do governance for the future state.

MikeCaspar commented 8 years ago

@relaxdiego One more thought. Hey, these are all great questions! Obviously a bit of work is needed on the docs. We don't want to execute the _test using the same modules. As you said, we don't want to confirm that Ansible modules work as written. What we DO want to check is that we've asked the Ansible module to do the right thing.

I can think of an instance where I watched a system engineer spend about 3 hours figuring out why a system didn't work, only to discover they had accidentally changed a playbook to create a user called 'depploy' instead of 'deploy'. That's the type of thing I am hoping to help with, with an approach like this (not really BDD, not really TDD). Maybe we should call it APTDD (Ansible Playbook Test Driven Development).
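
A check that would catch exactly that typo could be as small as this sketch (mine, not from the repo), using the stock getent module:

- name: Look up the expected deploy user
  getent:
    database: passwd
    key: deploy
  register: deploy_user
  ignore_errors: true    # report the failure but keep the test run going

- debug:
    msg: "TEST_FAILED: expected user 'deploy' to exist"
  when: deploy_user is failed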

MikeCaspar commented 8 years ago

Mark, based on some of your feedback, I've added this page to the WIKI (and also a link to it from the wiki home page).

https://github.com/MikeCaspar/playbook_test_framework/wiki/Alternate_Test_File_Names
https://github.com/MikeCaspar/playbook_test_framework/wiki

I would encourage you to use whatever name works for you and your environment.

MikeCaspar commented 8 years ago

No matter what you call your playbooks, please consider the design guidelines so that we don't see hundreds of variations of these two small pieces (at least for now).

relaxdiego commented 8 years ago

Sorry for not responding sooner, @MikeCaspar. I just did an internal brownbag session talking about your project and the response from the team was positive. My manager also wanted to know how we can adopt this going forward. I'll talk about that some more later in this response. For now, my response to your question:

I could ask you this: if this is your model today, would those same people not have separate tests to run, which would be a duplication? Sorry, I'm not 100% sure if you are concerned about duplication of effort within the team or duplication of effort to write these test roles. Hopefully I haven't gone totally out of line on you.

In my case, my concern was about writing a "unit test" playbook that duplicates the logic in the playbook being tested. Actually, when I think about it now, it's more than just the concern of duplicated effort. Let me explain further through my experience writing and testing Rails models. Take, for example, this model:

class Person < ActiveRecord::Base
  validates :first_name, length: { minimum: 2 }
end

First timers will usually test it with:

describe Person do
  it { should ensure_length_of(:first_name) }
end

Which is not a good thing because 1) you're testing whether you actually wrote the validation in the model (duplicated effort due to testing for things internal to the subject), and 2) your test is tightly coupled with the implementation. If you were to use a different (perhaps improved) validator in the future, you would unnecessarily have to modify the test. However, if the test were written as:

describe Person do
  it "doesn't allow creation of a person without a name" do
    expect(Person.new(first_name: "").save).to eq(false)
  end
end

While you're still testing the same feature, your test is now descriptive and also does not make any assumptions about how the feature is implemented.

Translating this to Ansible playbooks, a feature we might test is whether the web UI is responding with the correct status code or the correct content. We would not test whether the web server has a certain configuration file in a certain path, since that would be equivalent to the first test code above, which tests internal implementation.
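
In playbook form, that kind of black-box check might look something like this (my illustration; the host group, URL, and expected content are made up):

- hosts: webservers
  tasks:
    - name: Verify the web UI responds with HTTP 200
      uri:
        url: "http://{{ inventory_hostname }}/"
        status_code: 200      # the task fails unless this code comes back
        return_content: true
      register: webui

    - name: Verify the page carries the expected content
      assert:
        that:
          - "'Welcome' in webui.content"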

That's really the bulk of my concern. I just want to make sure that, when I write a test/acceptance/governance playbook, I write it with the correct abstraction in mind so that it doesn't end up being brittle/flaky.

Moving to the comments from today's brownbag session:


# Suggested at the brownbag: mark the task as failed in the output but
# let the play continue, via failed_when plus ignore_errors.
- debug:
    msg: "TEST_FAILED: with path {{ path }}, expected {{ expected }} "
  failed_when: ansibletestFolderTestPassed == false
  ignore_errors: true
  when:
    - ansibletestFolderTestPassed == false

MikeCaspar commented 8 years ago

Wow... Great observations. The initial SFO presentation was at the most simple level.

I realized that people would quickly need something easier to use.

In reality, what we really want to know is that whatever happens in "the playbooks" (not the test ones), the system is in the desired end state (tested, not changed; read-only).

I had wondered if some people would think we are testing modules. I think the module vs. role concept starts some of that confusion... We're not trying to test the modules themselves. This is where it gets weird though: we are creating roles (for testing). They may be public test roles (used by many), such as the ones I created already. I envision someone creating a private testOurSecurity role some day (so the abstraction-level discussion gets murky).

I agree 100%... No, the idea is not to test modules. We don't want to reproduce the fine work done by the people creating those modules. Also, they have plenty of testing already.

I do like your third code example (describe Person), as it is likely how people should be using this approach in most cases. In this case you are checking that a person exists. I can see a different test, for those that think BDD or ATDD, that says "we need to make sure there are X # of people". Do you write a separate role that calls the base test role? I am hoping things don't get that complicated, but nothing is stopping you from trying. I do share your concern though. As with anything new, it can morph into anything. I would prefer to keep it simple for now and try to keep the test roles as simple as possible. They could expand over time. Hopefully the ones I have created are good representations.

I imagine that if we don't check the results of a private playbook we created, people will get burned by things like --then and -import without realizing it. We will always need to remember the original vision/purpose.

You'll notice the sample roles I created need to know only minimal details to work. None of the internal details are known (and their implementation might even change); that would not make the test invalid.

As for the debug: snippet at the end of your comment:

I had started with this fail-fast approach, but found it really didn't do the right thing (for a team). Let me explain. If one thinks test FIRST (not test after), there's no way to make sure your tests are going to work in advance or to build them out (or it's VERY difficult).

I found that when I had this, I would create 10 tests, the first one would fail, and then the _test playbook would stop. Then I'd go and write the _maintain to deliver that test to a pass state. Then I needed to work through the next test.

Ironically, I like to work in a test/develop/test/develop pattern, but when trying to let someone create some tests in advance, this made it problematic and almost impossible to get anything to work.

If you think of a Cucumber script like the example you show above, imagine running all your tests but seeing only one failure at a time. There would be no way of knowing which other tests passed or failed.

This is why (for now at least) I'd like to keep it this way. However, I really am hearing what you are saying...

A good alternative (let me think about it overnight) might be to add an optional parameter (defaulted to false), something like fail_immediate, as a design decision for any testRole.

I'll maybe make the design change on the weekend (you could help if you would like). We could change the Design Requirements to allow a fail_immediate parameter (yes/no), defaulted to NO. This way, the complete test output will continue to show from a _test playbook when it calls the testRole.

But if someone wants one specific testRole to stop immediately (or all of them), they don't have to change testRoles; they only have to specify the non-default fail_immediate: true in their playbook request. I am concerned about every testRole working differently, and I DO want to take your input in.

I'll adjust my samples on the weekend to match this new pattern, and we can see how well that works. I have a feeling we can keep it simple AND let you do quick fails if you want. I can even imagine someone choosing which things fail immediately and which don't in a very easy way if we use this approach! Thanks for that.
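
Roughly like this (hypothetical parameters, following the naming in this thread; the released name may end up different):

- hosts: all
  roles:
    - role: testForFolder
      path: /opt/app/releases
      expected: true
      fail_immediate: true    # non-default: stop at this check's first failure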

Awesome ideas. Thanks for taking this to your team. I realized also today that someone could put the _test or _governance or _accept playbooks into Tower as well :-> I can't wait until someone realizes that :->

Please thank your team for me for the feedback, and for the consideration about making it clear that the goal is not to test modules or how they work. We're trying to come back one level from that, to make sure that when a developer changes a playbook (or one of their roles), the parts of the system that rely on everything being the same way still get the expected END results...

Have a great one.

MikeCaspar commented 8 years ago

@relaxdiego

Hi there. I had a few minutes and tried something quick.

I pushed new code to GitHub (not released as a version).

The testRole called testForFolder has an extra param called immediate_exit_on_fail:

It is defaulted to false (to keep the current method working by default).

If you want to give it a try (before we change it in too many places), pull the testForFolder repo.

From that repo, you can execute the local test playbook (think of it as self-testing my own module). It mostly just checks syntax until there's a better way.

Anyway, if you pull testForFolder and execute ansible-playbook tests/role.yml, you will see it in action (and can try out a few alternatives).

The idea, of course, is that if this works out, I'll change the master design specs over the weekend. See if this works for you the way I changed it. You can find the actual code simulating what you explained as follows:

- debug:
    msg: "TEST_PASSED: with path {{ path }}, expected {{ expected }} "
  when:
    - ansibletestFolderTestPassed == true

# failed tests either debug a failed message or fail immediately (as requested by the calling _test playbook)

- debug:
    msg: "TEST_FAILED: with path {{ path }}, expected {{ expected }} "
  when:
    - ansibletestFolderTestPassed == false

- fail:
    msg: "TEST_FAILED: with path {{ path }}, expected {{ expected }} "
  when:
    - ansibletestFolderTestPassed == false
    - immediate_exit_on_fail == true

In this example, note that fail: is called (instead of debug:) only when immediate_exit_on_fail is true.

Hopefully this helps.

MikeCaspar commented 8 years ago

Just in case I misunderstood your ask, let me know. Take a look at what I changed, and let me know if you'd like me to try something specific on those last three when: conditions; as long as it doesn't change behavior, I'm easy to try anything. It may make sense to avoid a server error and deal with that. Perhaps each testRole will work differently. I'm going to stick with the abstraction (interface) and suggest debug: failed or passed as the requirements, plus an immediate stop if required. It feels OK to me that however you build that internally should work. Perhaps you might try a testModule of your own and we can see if we can just integrate it? Thoughts?

relaxdiego commented 8 years ago

@MikeCaspar I think you're on the right track here! Also, I was thinking of writing a test role but still haven't decided what to test on. Watch this space!