Scenario (outline) title and placeholders

uuf6429 commented 6 days ago

In this PR, we toyed with the idea of having placeholders in scenario outline titles.

Why did this come up?

Currently, we distinguish between scenarios (aka "examples") of a scenario outline by appending a number to the title. This causes a problem when specific examples are rerun - if you rerun all examples except the first, the first example that is ran will be suffixed with " #1" even though it is actually the 2nd example in the feature file. Conceptually it is actually the first, but in practice, that's not useful to users. See: https://github.com/Behat/Behat/issues/1448

Options

keeping it as is :)

Passing number results in another number #1
Passing number results in another number #2
Passing number results in another number #3

appending the example values instead (was the main scope of that PR, btw):

Passing number results in another number (input: 2, output: 2)
Passing number results in another number (input: 2.4, output: 2)
Passing number results in another number (input: 2.5, output: 3)

replacing placeholders (assume the original title was Passing <input> results in <output>:
```
Passing 2 results in 2
Passing 2.4 results in 2
Passing 2.5 results in 3
```
Concise, readable and simple. :)
like 3, but adding numbers to duplicate titles (see also question 3 below), with title Results in <result>:
```
Results in 2 #1
Results in 2 #2
Results in 3
```
This would/may provide a backward-compatible upgrade path.

Apparently, option 3 is already implemented in cucumber, and has been requested in the past.

Open Questions (for Option 3)

Should we expose placeholder replacement in ExampleNode? As a behat user, I would have found that useful in the past, as a behat contributor, it would have been useful in https://github.com/Behat/Behat/pull/1459.
We have getTitle and getOutlineTitle. What would be the best way to implement placeholder replacement? My current opinion:
- getTitle - this feels like it should return the definitive title, e.g. to be shown in reports and so on. In the ideal world, I think this should be the one replacing placeholders and adding numbers for deduplication.
- getOutlineTitle - Internally (e.g. even in gherkin), it seems that there shouldn't be a concept of a scenario outline - instead they should be flattened into regular scenarios. With that in mind, I think the "outline title" should be the original (parent?) title - since that parent is the "scenario outline".
- getOutlineTitleWithPlaceholders (additional method) - I would go for such a method if we want to be absolutely BC safe. That said, I do feel that the majority of end users would rather have the BC break than the current approach, but that's my hypothesis.
It's important to note that getTitle is already used extensively and changing it might break things.
Not a question per se, but we're considering numbering repeated titles, this would preserve the past behaviour and avoid duplication given titles with missing placeholders.
1. That does beg the question of how an ExampleNode would find that its title has been duplicated elsewhere.

@acoulton On point 2, this approach could work, but I'm not too fond of it:

public function getTitle(?bool $raw = null): string {
    if ($raw === null) {
        @trigger_error('You must specify if you want the raw title or not, in the future this will change to FALSE', E_USER_DEPRECATED);
        $raw = true;
    }

    return $raw
        ? $this->title
        : $this->normaliseTitle($this->title); // normalise replaces placeholders and adds numbering
}

acoulton commented 4 days ago

Just to add a bit more context / background:

Suffixing with index numbers

The numbering of scenarios that @uuf6429 is actually only true for the JUnit formatter in Behat (and perhaps third-party ones).

The progress printer does not display any naming etc for the Scenario Outline / Examples unless it fails. For failing Examples, it produces output like:

.F.F

--- Failed steps:

001 Example: | 1 | 2 |    # features/scenario_outline.feature:10
      Then it should be 3 # features/scenario_outline.feature:6
        It is not 3 (Exception)

002 Example: | 4 | 5 |    # features/scenario_outline.feature:11
      Then it should be 3 # features/scenario_outline.feature:6
        It is not 3 (Exception)

Note that the title / text of the scenario outline is not displayed at all.

The pretty printer always prints the full Scenario Outline and Examples table. If everything passes, this is all rendered green. If examples fail then the failure is rendered inline below the failed row in the table, like so:

Flattening Scenario and Scenario Outline

Although Scenario and Scenario Outline are now synonyms in Gherkin, AFAICS there is still a difference in that the cucumber/gherkin parser still has a concept of a Scenario with examples attached. See this test feature and the AST and pickles it parses to. The AST contains the raw content from the file (e.g. without replacing any placeholders) but the pickles have inlined everything and actually don't seem to even know that they were originally from a Scenario Outline.

In #253 the cucumber/gherkin AST is still being parsed back to our ScenarioOutline and ExampleNode objects - and I assume we'd need to keep doing that if we want to keep having the pretty formatter in Behat render a more concise output for scenarios that have multiple examples with the same steps as it does now.

Behaviour of `ExampleNode::getTitle`

Currently, ExampleNode::getTitle returns the original string content of the table row (e.g. | 1 | 2 |). That feels odd at first, but makes a bit more sense when you consider it also reports its line number as being the line that contains that table row. The cucumber/gherkin AST names that property (for all nodes) text rather than title, which is more logical.

In Behat core:

The progress formatter arguably would be clearer if getTitle returned Passing 2 results in 2 rather than Example: | 2 | 2 |.
The junit formatter could use getTitle direct rather than trying to calculate a title of its own.
The pretty formatter needs to be able to get the original text of the example table row somehow, because it needs to render each set of example values with their original padding so that the columns line up across rows. However, we could provide that as a separate text property.

but we don't know about what any third-party formatters need, or indeed other users of Behat/Gherkin.

Theoretically, we could change the behaviour of getTitle in a new major of Behat/Gherkin. However I suspect there will be extensions / users out there that have only pinned Behat/Behat and haven't considered that their formatter etc is using classes from Behat/Gherkin, who may then have problems if we update Behat 3.x to require Gherkin 5.x. Considering behat has been quiet for a long time, there are likely extensions that would take a while to update - and we'd have to consider whether Behat should require the new Gherkin, or whether we'd update the formatters to support 4.x and 5.x.

acoulton commented 4 days ago

Given all that, I think my proposal would be:

Add ExampleNode::getScenarioTitle() - this would return the parent Scenario Outline title with placeholders replaced. I actually think this naming is quite logical for what it is even if we weren't trying to avoid clashing with the existing names.
Add ExampleNode::getText() - this would return the current title e.g. the text of the table row
Deprecate ExampleNode::getTitle() with a note that users should migrate to getText() or getScenarioTitle() depending on their usecase
Release that as a minor version of Behat/Gherkin - since it's fully BC
Bump the minimum requirement in Behat/Behat
Update the progress and junit formatters in Behat to use getScenarioTitle()
Update the pretty formatter in Behat to use getText()

On numbering, I'd suggest:

Each ExampleNode is created with a new int rowIndex property representing which row of the table it is.
The new ExampleNode::getScenarioTitle() always appends the index even if there are placeholders.

This means a tiny bit more verbosity in cases where the scenario titles are already unique, but saves any complexity of trying to work out whether there are duplicates and how to number them if so.

So:

Scenario Outline: It adds <a> and <b>
  Given I enter "<a> + <b>"
  Then I should get <result>

  Examples:
    | a | b | result |
    | 1 | 2 |      3 |   # getScenarioTitle() => 'It adds 1 and 2 (row #1)', getTitle() = getText() => '| 1 | 2 |      3 |'
    | 4 | 5 |      9 |   # getScenarioTitle() => 'It adds 4 and 5 (row #2)', getTitle() = getText() => '| 4 | 5 |      6 |'

What do others think - @uuf6429 @carlos-granados @ttomdewit?

carlos-granados commented 4 days ago

In general, this looks really good. My only question would be about the getText() function name which is maybe too generic? What about getRowText()?

uuf6429 commented 4 days ago

I'm also unsure about getText(), but also row number or getRowText() in the sense that "row" makes sense within the context of a table, which is just an implementation detail, IMO.

Appending the number in all cases feels a bit not ideal, but it's perhaps the only practical solution.

Keep in mind that the scenario titles will never be unique anyway - a different scenario can always define the same title (in which case the "row number" wouldn't help anyway).

On the whole the proposal looks fine though.

acoulton commented 4 days ago

I'd only suggested getText() to stay close to cucumber/gherkin, although looking at the JSON again I see they actually use name at the scenario/example level, text is for steps. So actually we should pick a name that makes sense to us (as clearly name doesn't work for the | 1 | 2 | value).

Agree that row feels specific to tables, although the concept that Examples are expressed as a table with rows and cells is baked in to the cucumber AST:

So getRowText() could work and has the advantage it's fairly clearly "the text representation of the row in the example table"?

I don't have fixed ideas on naming though.

If we wanted to avoid being table-specific, getExampleText() and exampleIndex might be sufficiently explicit?

carlos-granados commented 4 days ago

Yeah, getExampleText() sounds fine

uuf6429 commented 4 days ago

I'm fine either way, your explanation for row makes sense too.

acoulton commented 4 days ago

Well just because we should make a decision, let's go getExampleText() and exampleIndex - it's not over-long and I think it's quite clear what it is.

stof commented 2 days ago

If the pretty formatter of Behat is a blocker for moving to a pickle-like API (removing the difference between scenarios and outlines at execution time, which also makes it easier to properly handles Rules), maybe we should start a discussion about that formatter in Behat (and look at what cucumber does for its formatting)

acoulton commented 2 days ago

If the pretty formatter of Behat is a blocker for moving to a pickle-like API (removing the difference between scenarios and outlines at execution time, which also makes it easier to properly handles Rules), maybe we should start a discussion about that formatter in Behat (and look at what cucumber does for its formatting)

Not opposed to that, but that feels like it would have to be for a (potentially quite big) Behat 4.x and meanwhile IMO there's a case for a smaller backward-compatible enhancement to allow us to fix Behat/Behat#1448 and potentially other cases where it's currently hard to get a "scenario name" for an example in 3.x.

Behat / Gherkin