cucumber / godog

Cucumber for golang
MIT License
2.22k stars 250 forks source link

Why was BeforeFeature hook removed? #335

Closed jsliacan closed 3 years ago

jsliacan commented 3 years ago

Is your feature request related to a problem? Please describe. Every Feature file can be a separate topic. So it makes sense to start each Feature with a clean slate. A BeforeFeature function (analogous to BeforeSuite, BeforeScenario, and BeforeStep) would be useful. Given that it's not implemented (was removed) while the before/after hooks exist for the other logical units (suite, scenario, step), I assume there is a good reason for this. Would someone enlighten me please? Thanks!

Describe the solution you'd like I'd be very happy even if this was simply reverted :) https://github.com/cucumber/godog/commit/1d724e210776108703db1401f6ba9ff7fa55e4b3#diff-9097c1e5609e1c3fd2542fc447e3b3f7

Describe alternatives you've considered

  1. I considered elevating Scenario to the level of Feature, but what's the purpose of Feature then? Similarly, creating a separate Suite for each Feature doesn't make sense either (why have Features at all then?).
  2. Another solution is to create a Scenario that will be run at the start of each Feature and that will do the clean-up/prep job. But again, it's so inconsistent with the way other units are treated that it doesn't look right. Most importantly though, the person who writes the Feature file needs to have the knowledge about what's required to set up and prepare -- often irrelevant to the user story being tested.

Additional context I saw PR #312 but it doesn't reference any issue where deprecation of (what I think is a very useful hook) was discussed. Just to reiterate why I'm so surprised: there exist a few "abstraction" levels (Suite, Feature, Scenario, Step) and for all of them except for Feature, before/after hooks exist. While the hooks would be plenty useful for me, wouldn't it be better to keep them there just for consistency even if not everyone finds them useful?

gbraad commented 3 years ago

I went over some of the issues and noticed the following:

the new maintainers prefer this to be closer to the other implementations of cucumber, and focus on concurrency of scenarios. The idea that this has to be like the other implementations makes some sense, but there is no need to remove functionality.

The question is how we want to handle BeforeFeature and AfterFeature.

Why is concurrency essential? In most scenarios, where clear user steps are being handled, which are against a prepared environment, concurrency is not wanted.

Alignment is good, but if the idea is that you can take a feature file from another implementation of cucumber and run it is not possible anyway, as you will always need glue-code. This BeforeFeature introduces very helpful hooks to ensure environment preparation.

All our tests for minishift and crc run with https://github.com/code-ready/clicumber/ and rely on the fact that the environment is prepared in the BeforeFeature steps. Sure, it is time consuming, but this is essential as we are spinning op new VMs to ensure the cluster is created properly.

mpkorstanje commented 3 years ago

@jsliacan @gbraad

Cucumber is a tool to support BDD. However looking at your "End-to-end health check" I think you both have a fundamental misunderstanding about what that means.

Rather then illustrating your feature with scenarios/examples in terms of your domain (typically business or domain language) you've written a test script that describes a sequence of relatively low level operations. And each scenario in this script more or less depends on the correct execution of the previous scenario.

This is unfortunately a common misunderstanding. So to clarify; the execution unit of Cucumber is a scenario or an individual example from a scenario outline and each scenario should be independent. Features on the other hand are merely organizational units. They group topical scenarios mostly for the readers convenience and are not an execution. This avoids a whole host of typical test failure modes. That all scenarios in a feature are topical doesn't mitigate this. On the contrary, because they are topical accidental dependencies are much more likely. As such feature hooks are an anti-pattern and a big mistake, they promote and facilitate dependent scenarios.

So with this in mind your feature seems to cover but a single scenario:

Given the CRC is running
And a local image is created
When the local image is pushed to the OpenShift registry
Then the local image is deployed to OpenShift 
And the local image is running on OpenShift

If you have scenario specific resources that need to be cleaned up you could consider tagging the scenario and using conditional hooks to check if a resource should be cleaned up or reset.

So I would suggest that you redesign your test with this in mind or alternatively consider a tool that executes test scripts.

You may want to reference:

gbraad commented 3 years ago

LOL, @mpkorstanje you couldn't have picked a worst example. I totally agree that these tests would need to be rewritten. these features are very interdependent and can therefore never run out of order :-/

But this is not the point I brought up. The BDD tests I talk about are those testing the behaviour as follows:

Given a developer
 When using crc to deploy the `demo` application
  And checks status with `oc status`
 Then he/she should see `demo is running`

We would generally want to set up the environment crc start before the feature is tested, as the startup itself can take quite a while. Without a beforeFeature it seems we would need to rewrite with the when being "starting crc to set up the cluster", and and he/she deploys the demo application.

Given a developer
 When he/she uses `crc start` to set up the cluster
  And deploy the `demo` application
  And checks status with `oc status`
 Then he/she should see `demo is running`

In other words, we move all the dependent and identical code into the actual tests.

mpkorstanje commented 3 years ago

You can use a Background element to extract out the common steps. This does mean each scenario will start an empty cluster.

Background:
  Given a developer
  And he/she uses `crc start` to set up the cluster

Scenario:
  Given he/she uses the crc to deploy the `demo` application
  When checking the status with `oc status`
  Then he/she should see `demo is running`

You can use conditional hooks (not written in go; not fluent enough yet) to act before each scenario tagged with @empty-cluster. If you tag the feature the hook should run before every scenario in the feature. The hooks are written in code so we add some smarts here to ensure we reuse some resources if they're already present.

@empty-cluster
Feature:
  Scenario:
    Given he/she uses the crc to deploy the `demo` application
    When checking the status with `oc status`
    Then he/she should see `demo is running`
before scenario:
  if pickle is tagged with @empty-cluster:
    if a developer is absent:
     create a developer
    if cluster not running:
      start cluster
    remove all deployments

Though I think both these scenarios are still quite noisy. I don't the existence of a developer or an empty cluster are very important to the thing we're testing which is the oc status command. These are incidental details.

If the Given step is made more declarative it can imply that there is an empty cluster with a sole application. In the implementation of given step you can check for a running cluster, removing all prior deployments and deploying the demo app.

Scenario:
  Given a deployed `demo` application
  When checking the status with `oc status`
  Then he/she should see `demo is running`
^a deployed `demo` application$:
  if a developer is absent:
    create a developer
  if cluster not running:
    start cluster
  remove all deployments
  deploy demo application

You can find a comparable explanation here:

jsliacan commented 3 years ago

@mpkorstanje, thanks for your time!

My main question remains unanswered though: Why don't Features have before/after hooks? This question is valid whether Scenarios are independent or not (I'll illustrate on an example below).

Imagine the entire Feature file is one Scenario Outline (independent by design):

Scenario Outline:
    Given that <vegetable> is peeled and chopped
    When you add <vegetable> to the pot
    Then you cook it for <duration> minutes

    Examples:                                                                                                                                                                                                
        | vegetable  |    duration   |
        | carrots    |    5          |                                                                                                                                                          
        | beans      |    10         |

When am I supposed to run the prep code (buy veggies and turn the stove on)? Notice that this only needs to run once and is expensive in terms of time. All Scenarios (independent among themselves) need the same preparation to be done only once (for all of them, not each of them). So the Background and Scenario hooks don't work. I cannot insert a Scenario above the outline, because they wouldn't be independent (as you say they should be). And running preps before each Scenario is wasteful (like you suggest with Background). Running them before Suite impacts Scenarios in other Feature files in unwanted ways.

So what am I missing? I see an important use case being impossible (or awkward at least) without the before/after Feature hooks. I hope I made it clearer now what I'm trying to ask with the synthetic example above: Where should I put the code that needs to be run only once per Feature file (a collection of independent Scenarios)?

Cheers!

gbraad commented 3 years ago

I don't the existence of a developer or an empty cluster are very important

The difference between a developer and administrator would be, so that would be noise. But yes, the 'empty cluster' (deployment) is noise and should be a pre-existence (requirement) for the test to be able to happen, hence that the beforeFeature was considered for this. I'll have a look at your suggestions if this would work to do the deployment in a more generic fashion without adding noise to the scenario.

luke-hill commented 3 years ago

Technically your background in code is this

Turn the stove on, if it is not already on

Which would be translated to

Given I have a pre-heated stove

Then inside your step def / glue code. You would use generic coding techniques to ensure this expensive operation is only ever performed once. So inside your program thread you initialize the class or whatever is expensive, and then memoize the running of the expensive method.

This way it doesn't matter if you run scenarios 1->7, 3->6 or just scenario 5. The stove (for each execution), will be turned on and pre-heated once only.

If Memoization is not your thing, and you will only ever use the single stove (Which seems likely), then maybe Singletons are your thing.

jsliacan commented 3 years ago

@luke-hill fair point - that's a good workaround (simply insert if statement to Background). But as a design, I still don't understand why before/after Feature hooks are removed. I'm going to go through all the links @mpkorstanje listed and I'll see if I can understand your approach and necessity to remove Feature hooks. Feel free to close this issue, as it seems that before/after Feature hooks are not coming back (+ it can be re-opened if need be). Thanks

luke-hill commented 3 years ago

As mentioned previously. The arbiter in cucumber is the scenario. i.e. each scenario/example should decide what "it" does. Just because it is contained inside a feature makes 0 difference. Feature is literally just a wrapper.

If we allow features to have hooks, then where a scenario is nested, will have an impact on what it does. This breaks the TIL (Test Isolation Law - sometimes called Test Isolation Principle), which states that tests should be wholly independent of each other.

l3pp4rd commented 3 years ago

It is also always possible to tag features or individual scenarios, and they can be run by tag. That way you can build your test main the way that it runs groups of scenarios or features by tag.

Though, this would require it to be run separately for each tag filter, which needs a different VM started

jsliacan commented 3 years ago

If we allow features to have hooks, then where a scenario is nested, will have an impact on what it does. This breaks the TIL (Test Isolation Law - sometimes called Test Isolation Principle), which states that tests should be wholly independent of each other.

Yes, this makes sense. At the risk of over-discussing this, let me further refine my question: why was it decided to start applying TIL at the Scenario level and not at a higher level, e.g. Feature level. Clearly, Steps are exempt from TIL (TIL cannot apply to trivial building blocks because the whole test would need to be a single command). But why apply TIL at the very next level? It's very limiting if every independent unit (Scenario) only allows 1 level of expression within itself (Steps). As far as I understand, it's a design choice which level TIL comes in at.

Feature is literally just a wrapper.

Yes, but it wasn't just a wrapper. It used to have before/after hooks etc. It was made into "just a wrapper" recently. And I'm trying to find out why that's such a good idea -- because on the surface it's very constraining. Alternatively, why are there no subScenario or other small units (exempt from TIL)?

luke-hill commented 3 years ago

Whilst not my decision, one of the key reasons (I believe), is that the godog cucumber repo was "added" after the fact. i.e. it already had plenty of code.

Over the last 1-2 years cucumber has mainly been trying to approach a more uniform consistent set of "flavours". That is that a Java variety of cucumber should broadly be similar to Javascript, or Ruby (Where permissible).

The Test Isolation Law operates at a test case level, which in cucumber is a scenario or scenario example. So logically that is where we provide our "hook" or conditional layer.

If you are finding it too restrictive, I would wager that the original tests you have are highly coupled, which again goes against good design practices. I've mentioned above that you could use a hook or a "Given" step to ensure that the stove is switched on, the veggies are all bought e.t.c. One way you could do this is through memoization, another is singletons.

Given we appear to be going round in circles, and that we appear to be discussing something that to all intents and purposes is unlikely to make a re-appearance, is there anything we "can" help with?

jsliacan commented 3 years ago

Thanks for your time @luke-hill - some good clarifications & links in this thread. I agree with moving on:

Feel free to close this issue, as it seems that before/after Feature hooks are not coming back (+ it can be re-opened if need be). Thanks