Open tgsmith61591 opened 4 months ago
@tgsmith61591
- How is concurrency actually affecting the runtime? I was under the impression concurrency was at the `scenario` level, but now I'm wondering whether it's actually at the feature level
`concurrency` is the maximum number of scenarios running concurrently (in an async manner).
- Is there any guidance you can give on tuning `concurrency` to the number of feature files?
- Is there any further guidance on how tests should be broken up to get the best performance out of the `cucumber-rs` engine?
Actually, you shouldn't. The behavior you observe, that splitting source files leads to a significant performance gain, seems to be buggy, weird and unexpected. There shouldn't be any significant difference. Should be investigated and fixed.
The idea we have in mind for now is that a `Parser` returns a `Stream` of features being consumed by a `Runner`, executing them concurrently on scenario level. Seems like the current `Runner` implementation breaks up those features into scenarios in some weird manner, affecting the performance.
@ilslv do you have any suggestions on this?
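To make the expectation above concrete, here is a toy model. It is not cucumber-rs code and says nothing about where the actual slowdown comes from. It assumes all scenarios are pooled regardless of which feature file they came from, that at most `max_concurrent` of them run at once, and that each takes the same time. The function name and all numbers are made up for illustration.

```rust
/// Idealized wall time when scenarios from all features share one pool and
/// at most `max_concurrent` of them run at the same time.
/// `features` holds the scenario count of each feature file.
fn ideal_wall_time(features: &[usize], max_concurrent: usize, scenario_secs: f64) -> f64 {
    assert!(max_concurrent > 0, "concurrency limit must be positive");
    let total: usize = features.iter().sum();
    // Number of "waves" needed to drain the pool: ceil(total / max_concurrent).
    let waves = (total + max_concurrent - 1) / max_concurrent;
    waves as f64 * scenario_secs
}

fn main() {
    // Hypothetical suite: 200 scenarios of 1s each, concurrency limit of 32.
    let one_file = ideal_wall_time(&[200], 32, 1.0);
    let four_files = ideal_wall_time(&[50, 50, 50, 50], 32, 1.0);
    println!("one file: {one_file}s, four files: {four_files}s");
}
```

Under this model the way scenarios are distributed over feature files does not change the result, which is why a large difference between one file and four files looks like a bug rather than a tuning question.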
@tgsmith61591 can you share a little bit more about the characteristics of the testing suite? Is it async heavy, sync heavy, or a mix of both? Can you share the `World` setup: the `concurrency` value and other options?
Hey @ilslv and @tyranron, the test suite is very `async` heavy. While I cannot share the world setup (highly complex, and belongs to the company, not me), I can share a small repo I set up to reproduce this issue.
-c 32
-c 32
(!!!)
Thank you for the reproduction repo! I'll definitely take a look, hopefully this weekend.
Hey @ilslv, have you had a chance to look at this?
not yet, unfortunately 😢
Any update on this issue @ilslv ?
We might be able to take a look into this if you could provide any pointers @ilslv ? We have 1000 tests or more that currently eat up around 40 minutes instead of around 5 minutes!
We've come up with a simple workaround using a wrapper around the basic parser that splits each scenario into its own gherkin feature:
    // Imports inferred from the identifiers used below; they were not part
    // of the original snippet.
    use std::{iter, mem, path::Path};

    use cucumber::gherkin::Feature;
    use futures::{future::Either, stream, StreamExt as _};
    use itertools::Itertools as _;

    /// Wraps the basic parser and re-emits every scenario as its own feature.
    #[derive(Debug, Default)]
    struct SingletonParser {
        basic: cucumber::parser::Basic,
    }

    impl<I: AsRef<Path>> cucumber::Parser<I> for SingletonParser {
        type Cli = <cucumber::parser::Basic as cucumber::Parser<I>>::Cli;
        type Output = stream::FlatMap<
            stream::Iter<std::vec::IntoIter<Result<Feature, cucumber::parser::Error>>>,
            Either<
                stream::Iter<std::vec::IntoIter<Result<Feature, cucumber::parser::Error>>>,
                stream::Iter<iter::Once<Result<Feature, cucumber::parser::Error>>>,
            >,
            fn(
                Result<Feature, cucumber::parser::Error>,
            ) -> Either<
                stream::Iter<std::vec::IntoIter<Result<Feature, cucumber::parser::Error>>>,
                stream::Iter<iter::Once<Result<Feature, cucumber::parser::Error>>>,
            >,
        >;

        fn parse(self, input: I, cli: Self::Cli) -> Self::Output {
            self.basic.parse(input, cli).flat_map(|res| match res {
                Ok(mut feature) => {
                    // Move the scenarios out, then clone the remaining feature
                    // data once per scenario.
                    let scenarios = mem::take(&mut feature.scenarios);
                    let singleton_features = scenarios
                        .into_iter()
                        .map(|scenario| {
                            Ok(Feature {
                                name: feature.name.clone() + " :: " + &scenario.name,
                                scenarios: vec![scenario],
                                ..feature.clone()
                            })
                        })
                        .collect_vec();
                    Either::Left(stream::iter(singleton_features))
                }
                // Parse errors are passed through unchanged.
                Err(err) => Either::Right(stream::iter(iter::once(Err(err)))),
            })
        }
    }
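The splitting step at the heart of this workaround can be looked at in isolation. The sketch below uses simplified, hypothetical `Feature` and `Scenario` structs rather than the real gherkin types, and the helper name `split_into_singletons` is made up, so treat it as an illustration of the transformation only, not as a drop-in replacement.

```rust
// Hypothetical, simplified stand-ins for the gherkin types.
#[derive(Clone, Debug, PartialEq)]
struct Scenario {
    name: String,
}

#[derive(Clone, Debug, PartialEq)]
struct Feature {
    name: String,
    scenarios: Vec<Scenario>,
}

/// Turns one feature with N scenarios into N features with one scenario each.
/// Each new feature is named "<feature name> :: <scenario name>".
fn split_into_singletons(mut feature: Feature) -> Vec<Feature> {
    // Move the scenarios out so the rest of the feature can be reused.
    let scenarios = std::mem::take(&mut feature.scenarios);
    scenarios
        .into_iter()
        .map(|scenario| Feature {
            name: format!("{} :: {}", feature.name, scenario.name),
            scenarios: vec![scenario],
        })
        .collect()
}

fn main() {
    let feature = Feature {
        name: "Login".to_string(),
        scenarios: vec![
            Scenario { name: "valid password".to_string() },
            Scenario { name: "wrong password".to_string() },
        ],
    };
    for single in split_into_singletons(feature) {
        println!("{} ({} scenario)", single.name, single.scenarios.len());
    }
}
```

One thing to keep in mind: a feature without any scenarios produces no output features at all in this sketch. In the real wrapper everything outside `scenarios` is cloned into each singleton feature, so it is probably worth checking how features that rely on other fields behave after the split.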
Before:
[Summary]
1 feature
702 scenarios (702 passed)
41957 steps (41957 passed)
test test ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2013.41s
After:
[Summary]
702 features
702 scenarios (702 passed)
41957 steps (41957 passed)
test test ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 165.86s
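For a sense of scale, the two runs above can be compared with a few lines of arithmetic. The numbers are taken from the summaries (2013.41s and 165.86s for the same 702 scenarios); the "per scenario" figure is just wall time divided by scenario count, not a measured scenario duration.

```rust
/// Ratio of the old wall time to the new one.
fn speedup(before_secs: f64, after_secs: f64) -> f64 {
    before_secs / after_secs
}

fn main() {
    // Wall times and scenario count from the two summaries above.
    let (before, after, scenarios) = (2013.41_f64, 165.86_f64, 702.0_f64);

    // Roughly a 12x improvement.
    println!("speedup: {:.1}x", speedup(before, after));

    // Wall time amortized over all scenarios, before and after.
    println!(
        "wall time per scenario: {:.2}s -> {:.2}s",
        before / scenarios,
        after / scenarios
    );
}
```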
We maintain a large collection of feature files that feed nightly regression tests, the runtime of which has grown significantly recently. These are generally maintained in logically separated feature files, and leverage `Scenario Outline` tables, sometimes with 100-200 scenarios per feature file.
In experimenting with optimizations on a single tag, I tried splitting a feature file into 4 and observed a massive performance gain. Here is my baseline:
Here are the same exact tests divided over 4 feature files:
For reference, here is how we're running (note that `local_test_jobs` keeps Bazel from trying to do its own parallelism, instead delegating the concurrency to the cucumber engine):
Several questions I have after observing this major performance difference:
- How is concurrency actually affecting the runtime? I was under the impression concurrency was at the `scenario` level, but now I'm wondering whether it's actually at the feature level
- Is there any guidance you can give on tuning `concurrency` to the number of feature files?
- Is there any further guidance on how tests should be broken up to get the best performance out of the `cucumber-rs` engine?