cucumber / cucumber-ruby

Cucumber for Ruby. It's amazing!
https://cucumber.io
MIT License

First-class parallelism #1760

Open mitchgrout opened 3 months ago

mitchgrout commented 3 months ago

🤔 What's the problem you're trying to solve?

I have a suite of tests that takes a long time to execute, but the tests share no resources that would prevent them from running in parallel. While I can use parallel_tests to speed up execution, one irritating limitation is that each process produces its own formatted output. This means I cannot easily see information such as the total number of executed/failed steps or the re-run list of failed scenarios, nor can I generate a single HTML report. Further, since it randomly assigns features/scenarios to spawned processes, it's possible for one or more processes to be stuck with a large number of slow-to-run features, which can lead to sub-optimal test run times.

✨ What's your proposed solution?

A work-stealing parallelism mode that schedules at the scenario level and produces cohesive output from a single process. This should address both the multiple independent outputs and the sub-optimal splits.
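To make the idea concrete, here is a minimal sketch of scenario-level dynamic scheduling with a single aggregated result. It uses one shared queue that idle workers pull from, which is a simpler relative of true work stealing (per-worker queues with stealing). `Scenario`, `Result`, `run_scenarios` and `summarize` are hypothetical names for illustration and are not part of cucumber-ruby. Threads are used only to keep the sketch short; a real implementation would more likely need separate processes.

```ruby
# Hypothetical stand-ins for illustration; these are not cucumber-ruby classes.
Scenario = Struct.new(:location, :body)
Result   = Struct.new(:location, :worker, :passed)

# Run every scenario on a pool of worker threads that pull from one shared
# queue, then hand all results back to the caller for a single report.
def run_scenarios(scenarios, workers: 4)
  pending = Queue.new
  scenarios.each { |scenario| pending << scenario }
  pending.close # once closed and drained, #pop returns nil and workers exit

  results = Queue.new
  pool = Array.new(workers) do |worker_id|
    Thread.new do
      # A worker that finishes early simply takes the next pending scenario,
      # so no worker is stuck with a fixed, unlucky share of slow scenarios.
      while (scenario = pending.pop)
        passed = begin
          scenario.body.call ? true : false
        rescue StandardError
          false # an exception in the scenario body counts as a failure
        end
        results << Result.new(scenario.location, worker_id, passed)
      end
    end
  end
  pool.each(&:join)

  Array.new(results.size) { results.pop }
end

# Build one summary for the whole run: totals plus a re-run list.
def summarize(results)
  failed = results.reject(&:passed)
  { total: results.size, failed: failed.size, rerun: failed.map(&:location).sort }
end

scenarios = [
  Scenario.new("features/a.feature:3", -> { true }),
  Scenario.new("features/b.feature:7", -> { false }),
  Scenario.new("features/c.feature:2", -> { raise "boom" })
]
p summarize(run_scenarios(scenarios, workers: 2))
```

Because every result flows back to the parent, the totals and the re-run list come from one place regardless of which worker ran which scenario.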

⛏ Have you considered any alternatives or workarounds?

I have had success with parallel_tests and a custom tool I wrote to merge .ndjson data files, which allowed me to continue using the standard HTML reporter. However, this requires some conditional config in my cucumber.yml to ensure .ndjson is emitted when running in parallel, and to suppress all other outputs. I have not yet found a way to resolve the issue of sub-optimal splits.
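For reference, a merge of this kind might look roughly like the sketch below. This is not the tool I actually use, only an illustration under some assumptions: each line is a JSON envelope with a single top-level key, the run-level envelopes are named `meta`, `testRunStarted` and `testRunFinished`, and IDs are unique across processes. `merge_ndjson` is a hypothetical name.

```ruby
require "json"

# Merge several per-process NDJSON message streams into one stream.
# Assumption: run-level envelopes are keyed "meta", "testRunStarted" and
# "testRunFinished"; every other envelope is passed through unchanged.
def merge_ndjson(streams)
  seen_once = {}       # run-level keys that have already been emitted
  last_finished = nil  # the final "testRunFinished" line seen so far
  merged = []

  streams.each do |lines|
    lines.each do |raw|
      line = raw.strip
      next if line.empty?

      envelope = JSON.parse(line)
      if envelope.key?("testRunFinished")
        # Hold it back so exactly one is emitted, at the very end.
        last_finished = line
      elsif (key = %w[meta testRunStarted].find { |k| envelope.key?(k) })
        next if seen_once[key] # keep only the first occurrence
        seen_once[key] = true
        merged << line
      else
        merged << line
      end
    end
  end

  merged << last_finished if last_finished
  merged
end
```

Each stream can be any enumerable of lines, for example `File.foreach(path)` for each per-process file, and the result can be written out with `merged.join("\n")`. Note that this naive version keeps the last `testRunFinished` as-is, so any overall success flag or timestamps in it reflect only one process.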

📚 Any additional context?

There are a few outstanding questions I'm unsure about, which could limit the feasibility of this feature:

  1. How would this interact with the wire protocol plugin? Typically this assumes a single connection through which all tests are streamed. Should parallelism be disabled if a wire connection is configured?
  2. How much rework would be required to implement this? Tests are orchestrated around the event bus, which as I understand it assumes a sequential workflow, so how could this be achieved?
  3. If parallelism is implemented, what would this mean for steps which require interactivity, such as those provided by aruba? Could these steps assert something like "not parallel?", or would this not be acceptable?

While I may be misreading the documentation, it appears that the Java implementation may also support some rudimentary parallel test execution.