cucumber / common

A home for issues that are common to multiple cucumber repositories
https://cucumber.io/docs
MIT License

Preprocessor standard #773

Open aslakhellesoy opened 4 years ago

aslakhellesoy commented 4 years ago

While we usually recommend putting all the data for a scenario inside of Gherkin documents, there are valid use cases for pulling data in from an external source, such as an Excel file (or other source).

SpecFlow already has support for this.

I would like to come up with a specification for a preprocessor API that supports Gherkin, but also Markdown, which we might add support for at some time.

What I have in mind is this:

Document -> Preprocessor -> Preprocessed Document -> Parser -> AST -> Compiler -> Pickles -> Cucumber -> Results

The flow is currently:

Document -> Parser -> AST -> Compiler -> Pickles -> Cucumber -> Results
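To make the difference between the two flows concrete, here is a minimal sketch that models each stage as a plain Ruby lambda. The names `preprocessor`, `parser` and `run_pipeline` are hypothetical stand-ins, not part of any Cucumber API, and the "parser" is only a placeholder that splits text into lines.

```ruby
# Hypothetical sketch: every stage is a lambda; none of this is real Cucumber API.

# Preprocessor: text in, text out. Here it simply drops lines starting with "#!".
preprocessor = ->(text) { text.lines.reject { |line| line.start_with?("#!") }.join }

# Placeholder "parser": splits the document into non-empty, stripped lines.
parser = ->(text) { text.lines.map(&:strip).reject(&:empty?) }

# Runs a document through the given stages in order.
run_pipeline = ->(document, stages) { stages.reduce(document) { |value, stage| stage.call(value) } }

document = "#!directive\nFeature: Calculator\n"

current  = run_pipeline.call(document, [parser])               # current flow
proposed = run_pipeline.call(document, [preprocessor, parser]) # proposed flow
```

The point of the sketch is that the preprocessor is text-to-text, so the parser and everything downstream would not need to change.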

We could agree on a set of preprocessor directives for each supported input format (Gherkin, Markdown). Open questions:

Syntax

SpecFlow currently uses the format @source:excel-file-path[:sheet-name].

That works, but it wouldn't allow e.g. fetching an Excel document from a URL.
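One plausible reason, sketched below, is that the colon separator collides with the colon in a URL scheme. The parser is an illustrative guess at how such a tag could be read, not SpecFlow's actual implementation, and the `example.com` URL is made up.

```ruby
# Naive parser for the tag format quoted above: @source:excel-file-path[:sheet-name]
# (an illustrative guess, not SpecFlow's actual implementation).
parse_source_tag = lambda do |tag|
  body = tag.delete_prefix("@source:")
  path, sheet = body.split(":", 2)
  { path: path, sheet: sheet }
end

local  = parse_source_tag.call("@source:CalculatorExamples.xlsx:Sheet1")
remote = parse_source_tag.call("@source:https://example.com/data.xlsx:Sheet1")

puts local.inspect
puts remote.inspect
```

With a local path the split works as intended, but for the URL the "path" comes out as just `https` and the rest is mistaken for the sheet name.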

It also overloads the tag syntax, which seems a little confusing to me. I would prefer a dedicated preprocessor syntax (like the C preprocessor), ideally one that would work for both Gherkin and Markdown.

Format

What if we want to pull data in from a CSV or JSON source? It would be nice if users could easily plug in their own preprocessor plugins for parsing the data at the external source. We'd provide a default one for CSV and/or Excel. I think most languages have decent Excel parsers.
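A plugin mechanism along these lines could be as small as a lookup table from file extension to parser. The sketch below is an assumption about how that might look: `data_parsers` and `load_rows` are hypothetical names, and the CSV parser is deliberately naive (no quoting support).

```ruby
require "json"

# Hypothetical plugin registry: file extension => callable returning rows (arrays of strings).
data_parsers = {
  # Naive CSV: no quoting or escaping support; a real plugin would use a CSV library.
  ".csv"  => ->(text) { text.lines.map { |line| line.chomp.split(",") } },
  # Expects a JSON array of arrays.
  ".json" => ->(text) { JSON.parse(text).map { |row| row.map(&:to_s) } }
}

# Picks a parser by extension and fails loudly for unknown formats.
load_rows = lambda do |filename, text|
  parser = data_parsers.fetch(File.extname(filename)) do
    raise ArgumentError, "no preprocessor plugin for #{filename}"
  end
  parser.call(text)
end

csv_rows  = load_rows.call("examples.csv", "a,b,result\n2,3,5\n")
json_rows = load_rows.call("examples.json", '[["a","b","result"],[2,3,5]]')
```

Users would plug in their own format by adding another entry to the table, for example `data_parsers[".tsv"] = ...`.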

Semantics

SpecFlow will merge contents from the Gherkin document and the external Excel document. That seems useful, but it would be great to be specific about what happens if the columns differ in various circumstances.
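As a starting point for that discussion, here is one possible rule, stated purely as an assumption: both tables must have the same column names, external columns are reordered to match the inline header, and anything else is an error. `merge_tables` is a hypothetical helper and does not describe SpecFlow's behaviour.

```ruby
# One possible (hypothetical) merge rule: the same column names are required,
# and external columns are reordered to match the inline header.
merge_tables = lambda do |inline, external|
  header, *inline_rows = inline
  ext_header, *ext_rows = external
  unless header.sort == ext_header.sort
    raise ArgumentError, "column mismatch: #{header.inspect} vs #{ext_header.inspect}"
  end
  order = header.map { |name| ext_header.index(name) }
  [header] + inline_rows + ext_rows.map { |row| row.values_at(*order) }
end

inline   = [%w[a b result], %w[1 1 2]]   # rows written in the Gherkin document
external = [%w[result a b], %w[5 2 3]]   # rows from the external source, columns in another order
merged = merge_tables.call(inline, external)
```

Other rules are equally defensible, such as filling missing columns with empty cells, which is exactly why the specification would need to pick one.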

Let's discuss!

mpkorstanje commented 4 years ago

What would this look like in terms of excel and the feature file? I mean actual pictures. :D

aslakhellesoy commented 4 years ago

Pictures?

mpkorstanje commented 4 years ago

Images. Like a screen shot from excel with the data.

aslakhellesoy commented 4 years ago

[Image: Screenshot 2019-10-28 at 16 36 11]

mpkorstanje commented 4 years ago

So this is an example table! That makes sense. I thought the whole feature would be written in Excel.

Syntax-wise, I think something dedicated would be needed: `Examples from: file://path/to.xls select sheet1$A1:D4`.

This allows selecting the file and the data range within it.

aslakhellesoy commented 4 years ago

So this is an example table!

Yes, I realise I may have created some confusion because I have mentioned elsewhere I want to create an Excel->Pickle compiler as well. I still want to do that, but that's something else. It would stand on its own feet and wouldn't have to be "included" in a Gherkin document.

Regarding include directive syntax - I like what you have suggested regarding hyperlinking to a range of cells. I'm not sure if there is a standard for this, but this feels more "Url-like" and "Excel-like" to me:

file://path/to.xls#Sheet1!A1:D4

We could use the conventional C preprocessor include directive syntax:

#include file://path/to.xls#Sheet1!A1:D4

Or simpler:

#include to.xls

(This would look up the file relative to the including document, pick the first sheet and all of the cells with values).
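A minimal sketch of how such a target could be taken apart, assuming the `#Sheet!Range` anchor form above. `parse_target` and `resolve` are hypothetical helpers, and the `/features/calculator.feature` path is made up for illustration.

```ruby
# Splits an include target such as "file://path/to.xls#Sheet1!A1:D4" into its parts.
# Missing parts stay nil, meaning "first sheet" / "all cells with values".
parse_target = lambda do |target|
  location, anchor = target.split("#", 2)
  sheet, range = anchor.to_s.split("!", 2)
  { location: location, sheet: sheet, range: range }
end

# Resolves a bare file name relative to the document that includes it.
resolve = lambda do |location, including_file|
  File.expand_path(location, File.dirname(including_file))
end

full     = parse_target.call("file://path/to.xls#Sheet1!A1:D4")
simple   = parse_target.call("to.xls")
resolved = resolve.call(simple[:location], "/features/calculator.feature")
```

Leaving `sheet` and `range` as `nil` keeps the defaults explicit, so the code that reads the spreadsheet decides what "first sheet" and "all cells with values" mean.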

aslakhellesoy commented 4 years ago

Reworked the SpecFlow example linked above:

Feature: Calculator

  Scenario Outline: Add two numbers
    Given I have entered <a> into the calculator 
    And I have entered <b> into the calculator 
    When I press add
    Then the result should be <result> on the screen 

    Examples:
#include CalculatorExamples.xlsx

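To illustrate what the preprocessor would hand to the parser, the sketch below expands the `#include` line into an ordinary Gherkin table. It is only a sketch: the `sheets` hash is a hypothetical stand-in for actually reading the Excel file, and its rows are invented sample data.

```ruby
# Hypothetical stand-in for reading a spreadsheet: file name => rows.
# A real preprocessor would read CalculatorExamples.xlsx with an Excel library.
sheets = {
  "CalculatorExamples.xlsx" => [%w[a b result], %w[2 3 5]]
}

# Replaces each "#include <file>" line with a Gherkin table, indented for an Examples section.
expand_includes = lambda do |text|
  text.lines.flat_map do |line|
    match = line.match(/\A#include\s+(\S+)\s*\z/)
    next [line] unless match
    sheets.fetch(match[1]).map { |row| "      | #{row.join(' | ')} |\n" }
  end.join
end

feature  = "    Examples:\n#include CalculatorExamples.xlsx\n"
expanded = expand_includes.call(feature)
puts expanded
```

The output is plain Gherkin, so the existing parser would never see the directive.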
tooky commented 4 years ago

there are valid use cases for pulling data in from an external source, such as an Excel file (or other source)

I'm interested in hearing what those use cases are, and whether they still hold if you had Excel -> Pickles?

aslakhellesoy commented 4 years ago

Not sure @tooky - I've started a new ticket to discuss bringing back FIT: https://github.com/cucumber/cucumber/issues/775

mpkorstanje commented 4 years ago

#include file://path/to.xls#Sheet1!A1:D4

The fragment may require some encoding to form a valid URL, which doesn't help readability. That's why I'd suggest separating the file from the range.
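A small sketch of the encoding concern, using a hand-rolled percent-encoder rather than any particular library call. The sheet name `Q1 Results` is a made-up example; `!` and `:` are legal in a fragment under RFC 3986, but spaces and non-ASCII characters are not.

```ruby
# Percent-encodes every byte outside the RFC 3986 "unreserved" set.
# This is stricter than a fragment requires ("!" and ":" are legal there),
# but spaces and non-ASCII characters do have to be encoded.
encode_fragment = lambda do |text|
  text.gsub(/[^A-Za-z0-9\-._~]/) do |ch|
    ch.bytes.map { |byte| format("%%%02X", byte) }.join
  end
end

combined = "file://path/to.xls#" + encode_fragment.call("Q1 Results!A1:D4")
puts combined
```

Keeping the sheet and range outside the URL avoids this step entirely, at the cost of a second piece of syntax.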

Feature: Calculator

  Scenario Outline: Add two numbers
    Given I have entered <a> into the calculator
    And I have entered <b> into the calculator
    When I press add
    Then the result should be <result> on the screen

    Examples:
#include CalculatorExamples.xlsx

Are you settled on the preprocessor syntax? I can understand that you don't want to add this to the Gherkin syntax, but this isn't valid Gherkin on its own either, so tooling would have to include the preprocessor syntax too to provide proper support.

tooky commented 4 years ago

Rather than some very programmer-looking syntax, could we include a new Gherkin "fixtures" directive that left it up to the programmer to actually pull the data from the source?

Maybe something like:

Feature: Calculator

  Scenario Outline: Add two numbers
    Given I have entered <a> into the calculator
    And I have entered <b> into the calculator
    When I press add
    Then the result should be <result> on the screen

  Examples From: CalculatorExamples.xlsx

Fixture("CalculatorExamples.xlsx") do
  # write code to open and pull data from the Excel file here;
  # the first row is the header, the remaining rows are examples
  [["a", "b", "result"], ["2", "3", "5"]]
end

We could provide some support for loading from some predefined fixtures, but it could easily be extended to support almost anything else.
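For what it's worth, the plumbing behind such a directive could be very small. The sketch below assumes a global registry keyed by the name after `Examples From:`; `Fixture`, `examples_from` and `$fixtures` are hypothetical and not an existing Cucumber API.

```ruby
# Hypothetical sketch of the proposed Fixture hook; not an existing Cucumber API.
$fixtures = {}

# Registers a block that produces example rows for the given source name.
def Fixture(name, &block)
  $fixtures[name] = block
end

# Looks up the block registered for the name after "Examples From:" and runs it.
def examples_from(name)
  $fixtures.fetch(name).call
end

Fixture("CalculatorExamples.xlsx") do
  # A real fixture would open the Excel file here.
  [%w[a b result], %w[2 3 5]]
end

rows = examples_from("CalculatorExamples.xlsx")
```

Predefined fixtures for CSV or Excel would then just be blocks that ship with the library and are registered the same way.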

mpkorstanje commented 4 years ago

Tooky you made it look really nice!

But the more I think about it, the more this feels like a step in the wrong direction. Why use Cucumber for this? What is the value of using Gherkin when this entire test could be rewritten with half the fuss using a parameterized test source in JUnit (e.g. see: https://blog.codefx.org/libraries/junit-5-parameterized-tests/#CSV-Sources)?

The feature itself is already unreadable so I don't see the benefit of Gherkin here.

aslakhellesoy commented 4 years ago

I think I agree with @mpkorstanje here, and @tooky also questioned the need for this if Excel becomes a first-class citizen as proposed in #775.

Perhaps we should bin this preprocessor proposal and focus on #775 instead?

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs.

stale[bot] commented 4 years ago

This issue has been automatically closed because of inactivity. You can support the Cucumber core team on opencollective.

aslakhellesoy commented 4 years ago

Possible include syntaxes:

#include URL
@source:URL[:sheet-name]
Examples: [numbers](URL)
Examples: URL
Examples From: URL
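
To compare the candidates side by side, here is a rough sketch of recognizers for each form. The regular expressions are my own assumptions about what each syntax would accept, and `patterns` and `classify` are hypothetical names.

```ruby
# Hypothetical recognizers for the candidate syntaxes listed above.
# Order matters: the Markdown-link form must be tried before the bare "Examples: URL" form.
patterns = {
  include_directive: /\A#include\s+(?<url>\S+)\z/,
  source_tag:        /\A@source:(?<url>[^:\s]+)(?::(?<sheet>\S+))?\z/,
  markdown_link:     /\AExamples:\s+\[[^\]]*\]\((?<url>[^)\s]+)\)\z/,
  examples_from:     /\AExamples From:\s+(?<url>\S+)\z/,
  examples_url:      /\AExamples:\s+(?<url>\S+)\z/
}

# Returns [syntax_name, url] for the first matching form, or nil if none matches.
classify = lambda do |line|
  text = line.strip
  name, pattern = patterns.find { |_, candidate| candidate.match?(text) }
  name && [name, pattern.match(text)[:url]]
end

results = [
  "#include to.xls",
  "@source:to.xls:Sheet1",
  "Examples: [numbers](to.xls)",
  "Examples: to.xls",
  "Examples From: to.xls"
].map { |line| classify.call(line) }
```

Note that the last three forms all start with the `Examples` keyword, so they would need changes to the Gherkin grammar itself, whereas the first two could be handled before parsing.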

Excel URL syntaxes:

path/to.xls
file://path/to.xls
https://path/to.xls

# anchors
path/to.xls#Sheet1!A1:D4