Closed: ajvincent closed this issue 2 years ago.
I'd welcome a PR that adds parallelization support.
That said: There have already been a couple of failed attempts at adding parallelization. I've learned two main lessons from those attempts. The first is that this area is fertile ground for bugs. The second is that a number of Jasmine's existing features interact with parallel execution in awkward ways. So I'm going to give any new PRs in this area some pretty close scrutiny. I don't get solid blocks of time to do focused work on Jasmine as often as I'd like, so unfortunately "close scrutiny" is another way of saying that review will probably be slow.
Once I get a bit more time to spend on this, I'll follow up with some design questions. You might also want to take a look at Mocha's documentation, which mentions a number of things that don't work in parallel mode. We don't have to make all the same decisions as Mocha, of course. But that's at least a good starting list of trouble spots.
Slow reviews will be fine, as I expect to design this slowly and carefully. Honestly, I want to do it right rather than rush this, and I have plenty to keep me busy. Writing the Jasmine specs for this will itself be very tricky.
I think the next step - critically, before writing any more code - is to determine a design for spec/support/jasmine.json and for the Configuration interface of jasmine-npm. Thus, everything below will be pseudo-code or API.
My thinking is, at least initially, that we define an optional "parallel" property. In TypeScript:
```typescript
interface Configuration {
  spec_files: string[];
  // other properties as they already exist
}

interface TopConfiguration extends Configuration {
  parallel?: {
    sharedConfiguration?: Configuration;
    oneSpecPerRunner: boolean;
    // Exactly one of spec_configurations, spec_batches, or spec_files
    spec_configurations?: Configuration[];
    spec_batches?: string[][];
    spec_files?: string[];
  };
}
```
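To make the shape concrete, here's what a `spec/support/jasmine.json` using this proposed schema might look like. This is purely illustrative: the `parallel` block follows the proposed `TopConfiguration` sketch, and the glob patterns are invented for the example; `spec_dir`, `helpers`, and `spec_files` are existing jasmine-npm options.

```json
{
  "spec_dir": "spec",
  "helpers": ["helpers/**/*.js"],
  "spec_files": ["**/*[sS]pec.js"],
  "parallel": {
    "oneSpecPerRunner": false,
    "spec_batches": [
      ["fast/**/*[sS]pec.js"],
      ["slow/**/*[sS]pec.js"]
    ]
  }
}
```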
Here's how I envision this working:
1. Run `spec_files` through `globFiles()` to get an initial list of specifications.
2. Run `parallel.spec_batches` through `globFiles()` to get a list of specification files the batches want to take away from `spec_files`.
3. This means `globFiles()` returning a `Set` instead of an array. To get a stable ordering: `const elements = Array.from(iterable); elements.sort();`
4. The parent process collects its top-level `beforeAll()`, `beforeEach()`, `afterEach()`, and `afterAll()` callbacks.
5. Create `ChildJasmine` objects, each of which takes a configuration which is:
   a. one of the `spec_configurations`, OR
   b. `spec_files` with overrides from `spec_batches`.
   c. Each `ChildJasmine` object also has a reference to an `AggregateReporter`, which will forward to the `ConsoleReporter` later.
6. Parent `beforeAll()` functions run.
7. In parallel (via `Promise.all`):
   a. Parent `beforeEach()` and `afterEach()` callbacks are not supported.
   b. Each `ChildJasmine` calls `fork("./child_process.js")` and sends the child process a configuration to run.
   c. The child process runs `await jasmine.execute();`
   d. The child calls `reporter.replay()`, which sends the ordered set of events (`jasmineStarted`, `suiteStarted`, `specStarted`, `specDone`, `suiteDone`, `jasmineDone`) to the parent process.
   e. The child then does `await processExit.promise`, which is waiting for a command from the parent process to exit.
8. The parent awaits `Promise.all(childJasmines)`. As each child reports:
   a. A one-character message to the console reporter indicates the reporting (successful or otherwise) from a child process.
   b. The parent process sends the exit command to the child process, and the child process exits with code 0.
9. The parent collects `totalSpecsDefined` from each child, then adds in its serial `totalSpecsDefined` value, to call `reporter.jasmineStarted()` with. Likewise for the `jasmineDone()` data.
10. Parent `beforeEach()`, `afterEach()`, and `afterAll()` callbacks run at the parent level.
11. The parent combines the `JasmineDoneInfo` data from each child Jasmine process, plus its own serialized Jasmine, into a combined `JasmineDoneInfo`, which it then feeds to the reporter.

Here are some open design questions. I'm working with the following assumptions when I think about these questions:
Opt-in or opt-out?

Parallelization is opt-out in Jest and opt-in in Mocha, previous Jasmine attempts, and your PoC. I think opt-in is the right choice. Not all suites are faster in parallel, and it might be very hard to ship good on-by-default parallel support without some significant breaking changes.
Which spec files does each worker process load?

The obvious choice is to assign each spec file to a particular worker, and have each worker load only the assigned spec files. It looks like that's what your PoC does. I'm pretty sure it's what Mocha and Jest do as well. I'm assuming that this is the right choice.
Another option is to have all worker processes load all files, and assign top-level runables (specs or suites) to each worker. That's more complicated and has at least one ugly failure mode (multiple top-level runables with the same name). However, a number of the options discussed below are only workable if every process loads all files.
How is the number of worker processes determined?

Jest and Mocha both allow this to be specified on the command line. It looks like the optimal number of workers varies considerably based on the hardware, the details of the test suite, and what other load the machine is under, but is generally not more than the number of CPU cores. It's pretty obvious to me that we need to give users the ability to use different numbers of workers for the same test suite in different situations.
I think we should allow the number of workers to be set in the config file and overridden on the command line. The default is 1, which disables parallelization. This is more or less what Mocha does. Jest defaults to choosing a number of workers based on the number of CPU cores but allows it to be overridden. I think this isn't the best choice: too many people report that their Jest suites run faster if they reduce the number of workers from the default. The growing popularity of CPUs that have a mix of performance and efficiency cores is going to make that kind of guesswork even more questionable in the future.
How are files/runables allocated to worker processes?

There are at least three options:

1. If I understand your PoC correctly, the user manually groups files into buckets and each bucket is allocated to a worker process. I'm not aware of any existing systems that do that. The major drawbacks I see are that users would have to manually rebalance the buckets to maintain good performance, and it wouldn't be easy for users to change the number of workers.
2. Split files/runables into N buckets at startup time and give each bucket to a worker. This would be easy to implement, and it would allow us to reproduce the same split every time the same random seed is specified (more on that below). But it doesn't give optimal run time unless files/runables have roughly uniform run time.
3. Each worker process works on one file/runable at a time. When it finishes, the parent gives it another one. Jest, Mocha, and previous Jasmine attempts all do this. It's a bit more complicated to implement but it should generally be self-balancing, giving a near-optimal distribution of files/runables to workers even when some suites are much slower than others.
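The core of option 3 fits in a few lines. Nothing below is Jasmine API: `createWorkQueue` and `simulate` are hypothetical names, and the "workers" here are simulated in-process rather than real child processes.

```javascript
// Sketch of option 3: a shared queue that idle workers pull from.
// In a real implementation, next() would be called by the parent whenever
// a child process reports that it has finished its current file.
function createWorkQueue(files) {
  const queue = [...files];
  return {
    next() {
      // null means "no more work; tell the worker to exit"
      return queue.length > 0 ? queue.shift() : null;
    }
  };
}

// Simulate workers draining the queue round-robin; each "worker" just
// records which files it was handed.
function simulate(files, workerCount) {
  const q = createWorkQueue(files);
  const assignments = Array.from({ length: workerCount }, () => []);
  let idle = 0;
  while (idle < workerCount) {
    idle = 0;
    for (let i = 0; i < workerCount; i++) {
      const file = q.next();
      if (file === null) {
        idle++;
        continue;
      }
      assignments[i].push(file);
    }
  }
  return assignments;
}
```

The self-balancing property comes for free: a worker stuck on a slow file simply asks for its next file later, so fast workers absorb more of the queue.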
To what extent will randomization still be reproducible?

Currently, users can reproduce an execution order by specifying a random seed. This is really useful for debugging intermittent failures that depend on order. Obviously there's no practical way to guarantee that two specs in different workers run in a particular order. But ideally we'd ensure that Jasmine distributes files/runables to workers the same way, and runs them in the same order within each worker, every time a random seed is specified.
That said, as mentioned above we probably have to choose between reproducible ordering and optimal distribution of files/runables to workers.
Jasmine is the odd one out here. Mocha doesn't allow users to reproduce a random seed, and Jest doesn't provide randomization at all. (I was very surprised to learn this. I thought both those features were table stakes for any serious testing tool.)
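The reproducibility half of the tradeoff hinges on a deterministic, seed-driven shuffle. A sketch, using mulberry32 (a well-known tiny PRNG) purely as a stand-in for whatever RNG Jasmine actually uses: if the parent shuffles the file list with the user's seed before distributing anything, the hand-out order is reproducible even though completion order isn't.

```javascript
// mulberry32: a tiny seeded PRNG. Same seed => same sequence of values.
function mulberry32(a) {
  return function () {
    a |= 0; a = (a + 0x6D2B79F5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Fisher-Yates shuffle driven by the seeded PRNG. Running it twice with
// the same seed produces the same permutation.
function seededShuffle(items, seed) {
  const rand = mulberry32(seed);
  const result = [...items];
  for (let i = result.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [result[i], result[j]] = [result[j], result[i]];
  }
  return result;
}
```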
What about `fit` and `fdescribe`?

If each worker loads all spec files, it should be easy to keep `fit` and `fdescribe` working. If each worker loads only its assigned files, it's harder. (Probably impossible unless the parent process loads all files before running the workers.)

Ideally we'd make `fit` and `fdescribe` work properly in parallel mode. But I think it would be fine for an initial release to make them throw errors in parallel mode, like Mocha does.
Top-level (i.e. not in a `describe`) `beforeEach`/`afterEach`

Top-level `beforeEach`/`afterEach` in helper files are extremely common and they need to still work. I'm less clear on what should happen in spec files. My first thought was that top-level `beforeEach`/`afterEach` in a spec file should be an error when parallel mode is enabled. But jasmine-core doesn't know about the spec/helper file distinction or even the existence of files.

I'm not sure what Mocha does. Its documentation implies that top-level `beforeEach`/`afterEach` works the same in parallel mode as in sequential mode, but some third-party docs imply that it doesn't work. Jest avoids the problem by not supporting top-level `beforeEach`/`afterEach` at all.
Top-level `beforeAll`/`afterAll`

These currently run once per run of the entire test suite. If we keep doing that in parallel mode, we'll break suites that use `beforeAll`/`afterAll` to manage in-memory state. If we run them once per worker process, we'll break suites that use them to work with external resources like databases.

Jest avoids this problem by not supporting top-level `beforeAll`/`afterAll`. Mocha runs those functions once total in sequential mode and once per file in parallel mode. I think that's a mistake. It probably works a lot of the time, but I'd expect it to be an unwelcome surprise for some users.

Maybe the best thing is to throw an error if there's a top-level `beforeAll` or `afterAll` in parallel mode.
What about reporters?

Note: We need to think about users' custom reporters and third party reporters, not just the ones that come with Jasmine. CC: @UziTech, @bcaudan, and @dfederm.
Existing reporters might assume that reporter events for unrelated suites don't interleave, for example that this order doesn't happen:
That behavior isn't explicitly documented, but it's so obvious and long-established that deviating from it would have to be a semver-major change. I also think that it might be difficult or impossible to update some reporters to deal with interleaved specs/suites. Some of Mocha's built-in reporters don't work in parallel mode. That shouldn't be a problem for Jasmine's built-in reporters, but we would risk leaving third party reporters behind.
If I understand your PoC correctly, it solves this by batching up reporter events until the end and then emitting them in an order that's compatible with existing reporters. That solves the compatibility problem but introduces a couple of new ones. It breaks in environments like CI runners that expect to see output within a certain amount of time. And it makes it a lot harder to identify long-running specs with a reporter like this:
```javascript
jasmine.addReporter({
  specStarted: event => {
    console.log(`${event.fullName} started`);
  },
  specDone: event => {
    console.log(`${event.fullName} finished in ${event.duration}ms`);
  }
});
```
That will still work, but you'd have to wait for the entire suite to finish instead of learning the slow spec's identity as soon as it runs.
I don't think batching everything up is workable, but I also don't think we can expect all existing reporters to work with interleaving. What about the following compromise? We allow reporters to declare that they support parallel mode. Jasmine would throw an error if run in parallel mode with a reporter that doesn't declare that support, and then reporter events could interleave.
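A sketch of that check, assuming a hypothetical `supportsParallel` flag on reporters (the real property name and mechanism are open questions in this thread):

```javascript
// Before starting a parallel run, reject any reporter that hasn't opted in.
// Reporters that declare support accept that events may interleave.
function assertReportersSupportParallel(reporters) {
  for (const reporter of reporters) {
    if (!reporter.supportsParallel) {
      throw new Error(
        "Can't run in parallel mode: a configured reporter does not " +
        "declare parallel support. Remove it or run without parallelization."
      );
    }
  }
}
```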
Besides event ordering, the other thing we need to think about is `Env#topSuite`. This method returns metadata for the entire test suite. Will it still work in parallel mode? That's easy to do if every process loads all files but difficult or impossible if each worker only loads certain files. This also potentially affects non-reporter code. I wouldn't personally write a spec that looks at `topSuite`, but somebody else might. One option would be to make `topSuite` an error in parallel mode and delay releasing parallel support until the next major release. We might also consider having the parent process load all spec files so that `topSuite` can at least work there.
The stopOnSpecFailure config option

Is there a way to make this work in parallel mode? Is it worth the effort? Mocha tries to make it work, but doesn't guarantee anything. We might start by making that option an error in parallel mode, and add something like Mocha's behavior later if there's enough user interest.
I've probably forgotten some things, but that should be plenty to think about for now.
That is awesome feedback. I think you care about this more than I do, which is a very good sign, because I care a lot.
Snippets from the above, with my responses.
One key point I should make is that up until now, I hadn't planned on altering jasmine-core at all. Your points about top-level `beforeAll()`, `beforeEach()`, `afterEach()` and `afterAll()` may force me to change my mind, but I don't want to do that if I can avoid it.
My proof of concept imports Jasmine as an ES6 module, so we might be able to signal to Jasmine that it should act as a child (disallowing parallel operation, maybe disabling these setup/teardown functions) via a Symbol:
```javascript
Reflect.defineProperty(Jasmine, "DISABLE_PARALLEL", {
  value: Symbol("Disable parallel operations"),
  writable: false,
  enumerable: true,
  configurable: false
});

// ...

config[Jasmine.DISABLE_PARALLEL] = true;
jasmine.loadConfig(config);
```
Opt-in or opt-out?
Definitely opt-in. I consider opt-out to be a breaking change, especially at the point people upgrade from a forced-serial to forced-parallel Jasmine. The impact from that is absolutely unacceptable.
How is the number of worker processes determined?
I think we should allow the number of workers to be set in the config file and overridden on the command line. The default is 1, which disables parallelization. This is more or less what Mocha does. Jest defaults to choosing a number of workers based on the number of CPU cores but allows it to be overridden. I think this isn't the best choice: too many people report that their Jest suites run faster if they reduce the number of workers from the default. The growing popularity of CPUs that have a mix of performance and efficiency cores is going to make that kind of guesswork even more questionable in the future.
I honestly had not thought about that before, but now that you raise it, it's a very good point.
I'm strong enough with Promises to implement a limited `Promise.all()` capability: one that generates a queue of `ChildJasmine` objects, then starts x number of them at once and only fires off another when a presently running child completes. I've designed that sort of batching before, for a macOS problem with too many entries into `Promise.all()`, so I just need to factor that into my design.
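That "limited `Promise.all()`" can be quite small. A sketch, with plain task functions standing in for `ChildJasmine` runs; `runWithLimit` is an invented name, not an API from this thread:

```javascript
// Run at most `limit` tasks concurrently, starting the next queued task
// whenever a running one settles. Each task is a function returning a
// promise. Results come back in the same order as the input, like
// Promise.all.
async function runWithLimit(taskFns, limit) {
  const results = new Array(taskFns.length);
  let nextIndex = 0;

  // Each loop repeatedly pulls the next unclaimed task until none remain.
  async function workerLoop() {
    while (nextIndex < taskFns.length) {
      const i = nextIndex++;
      results[i] = await taskFns[i]();
    }
  }

  const loops = [];
  for (let n = 0; n < Math.min(limit, taskFns.length); n++) {
    loops.push(workerLoop());
  }
  await Promise.all(loops);
  return results;
}
```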
I would make this a command-line `-j` option, similar to Make's implementation, if that flag's not already reserved. Adding a "process_count" to the "parallel" dictionary in spec/support/jasmine.json shouldn't be too hard, and would probably solve a few other points you raise here.
How are files/runables allocated to worker processes?
1. If I understand your PoC correctly, the user manually groups files into buckets and each bucket is allocated to a worker process.
Not exactly, although my design proposal allows this capability. I was planning on letting `glob` do the work for us, and then making some reasonable decisions from there.

If you want to reject that, that would be the `spec_configurations` and `spec_batches` options in my design proposal, and `oneSpecPerRunner` becomes redundant.
it wouldn't be easy for users to change the number of workers.
Unless I'm missing something, I've answered that in this response.
3. Each worker process works on one file/runable at a time. When it finishes, the parent gives it another one. Jest, Mocha, and previous Jasmine attempts all do this. It's a bit more complicated to implement but it should generally be self-balancing, giving a near-optimal distribution of files/runables to workers even when some suites are much slower than others.
This is what my proof-of-concept does, except that the subprocesses (what you're referring to as workers, but they mean different things in Node) exit automatically upon completion - not accepting new specs.
To what extent will randomization still be reproducible?
My current proposal doesn't factor in randomization across spec files. Since I talk about replacing `Promise.all()`, which guarantees nothing about execution order, just the ordering of results, this is something to think about carefully in the design of subprocess batching.
What about `fit` and `fdescribe`?
I'm going to need help handling these cases.
Top-level (i.e. not in a `describe`) `beforeEach`/`afterEach`

Top-level `beforeAll`/`afterAll`
I really need to look at the implementation of these before I try to write tests or implementation.
Maybe the best thing is to throw an error if there's a top-level `beforeAll` or `afterAll` in parallel mode.
I suggest the Jasmine.DISABLE_PARALLEL symbol as a flag to check this if we need it. Personally, I think this is the safest option.
What about reporters?
We need to think about users' custom reporters and third party reporters, not just the ones that come with Jasmine.
That's a good point. This means we need to explicitly create - and document - some new reporter classes in jasmine-npm, instead of directly feeding ConsoleReporter as my PoC did. We'll need:

- `ParentProcessReporter()`, with an `.addChildReporter()` method and probably a `.clearReporters()` method
- `jasmine.batchedParallelReporter` as an instance of `ParentProcessReporter` for batched reporting
- `jasmine.unbatchedParallelReporter` as an instance of `ParentProcessReporter` for unbatched reporting
- `ConsoleReporter` needs to add itself to `jasmine.batchedParallelReporter`.

That behavior isn't explicitly documented
It's time to document this. :smile: We're talking about a few paragraphs at most, I presume.
If I understand your PoC correctly, it solves this by batching up reporter events until the end and then emitting them in an order that's compatible with existing reporters.
Yes.
It breaks in environments like CI runners that expect to see output within a certain amount of time.
Can you elaborate on this more, please?
And it makes it a lot harder to identify long-running specs with a reporter like this:
```javascript
jasmine.addReporter({
  specStarted: event => {
    console.log(`${event.fullName} started`);
  },
  specDone: event => {
    console.log(`${event.fullName} finished in ${event.duration}ms`);
  }
});
```
I don't see how. My PoC only alters the `jasmineStarted` and `jasmineDone` events. Others replay exactly as they come in.
That will still work, but you'd have to wait for the entire suite to finish instead of learning the slow spec's identity as soon as it runs.
I think we can design around this, if we support two messaging reporters from the child process to the parent process: one for the batched report, and one for non-batched reporting. Sort of like `stdout` versus `stderr`. That non-batched reporter will not be reliable around `jasmineStarted` and `jasmineDone`, because again there are the spec counts passed through, so this would have to be documented carefully.
The types of messages between child and parent in my PoC are a small fixed set, and adding more types of messages should be pretty easy.
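For illustration, a parent-side dispatcher over a small fixed message vocabulary might look like this. The message names (`liveEvent`, `batchReplay`, `done`) are invented for the sketch, not the PoC's actual protocol:

```javascript
// Parent-side handler for messages arriving from a child process,
// with one "live" channel (forwarded immediately) and one "batched"
// channel (an ordered event list replayed once the child finishes),
// loosely analogous to stdout versus stderr.
function createParentDispatcher({ onLiveEvent, onBatchReplay, onChildDone }) {
  return function handleMessage(msg) {
    switch (msg.type) {
      case "liveEvent":   // e.g. specStarted/specDone, as they happen
        onLiveEvent(msg.event);
        break;
      case "batchReplay": // the ordered event list from reporter.replay()
        onBatchReplay(msg.events);
        break;
      case "done":        // child asks permission to exit
        onChildDone(msg.exitCode);
        break;
      default:
        throw new Error(`Unknown message type: ${msg.type}`);
    }
  };
}
```

In a real implementation, `handleMessage` would be wired to `child.on("message", ...)` on a forked Node subprocess.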
I don't think batching everything up is workable, but I also don't think we can expect all existing reporters to work with interleaving. What about the following compromise? We allow reporters to declare that they support parallel mode. Jasmine would throw an error if run in parallel mode with a reporter that doesn't declare that support, and then reporter events could interleave.
Conceptually, I'm okay with that, but it does mean a change to the `Reporter` interface of jasmine-core (argh, I really want to avoid changing core Jasmine in any way).
Besides event ordering, the other thing we need to think about is `Env#topSuite`. This method returns metadata for the entire test suite. Will it still work in parallel mode? That's easy to do if every process loads all files but difficult or impossible if each worker only loads certain files. This also potentially affects non-reporter code. I wouldn't personally write a spec that looks at `topSuite`, but somebody else might. One option would be to make `topSuite` an error in parallel mode and delay releasing parallel support until the next major release. We might also consider having the parent process load all spec files so that `topSuite` can at least work there.
I don't suppose we can just deprecate `topSuite`... I know absolutely nothing about this, so I need help to figure this angle out. I think this is something we can delay figuring out until later in the implementation, but please call out if you disagree.
Can we clean up the generated page for Suite? It has `children` listed four times, `id` four times, etc.
The stopOnSpecFailure config option
Is there a way to make this work in parallel mode?
I think so. Again, we could implement this as a priority message from child process to parent, but without ordering of spec files, I can't guarantee which spec it will stop for. More documentation.
Is it worth the effort? Mocha tries to make it work, but doesn't guarantee anything. We might start by making that option an error in parallel mode, and add something like Mocha's behavior later if there's enough user interest.
Following my proposal, that would mean moving `stopOnSpecFailure` from the existing `Configuration` interface of jasmine-npm to my suggested `TopConfiguration` interface. That should be backwards-compatible, I think.
I don't suppose we can just deprecate topSuite... I know absolutely nothing about this, so I need help to figure this angle out. I think this is something we can delay figuring out until later in the implementation, but please call out if you disagree.
I disagree. I think there is a lot of value in reporters knowing which tests exist before they run.
I like the approach of having some parallelizable reporters and some not. There will have to be tradeoffs in the reporter and I don't think every reporter should be forced to support both ways.
Okay, based on @UziTech's objection, I need to rethink my approach a bit.
I was initially thinking of a straight implementation and testing, but there are perhaps six different phases to implement, among them:

- `beforeAll()` in the non-parallelized spec files
- `afterAll()` in the non-parallelized spec files

To ensure compatibility, I think this means I'll have to write A/B tests, where one Jasmine under test is serialized and one is parallelized, and we're explicitly checking a few properties to make sure they match:

- `JasmineStartedInfo`
- `JasmineDoneInfo`
- `jasmine.env.topSuite`
The documentation for `Env.prototype.topSuite` reveals a potential problem. Unlike `JasmineStartedInfo` or `JasmineDoneInfo`, which are event-driven, `topSuite()` as a function returns a `Suite` instance directly. If it returned a `Promise<Suite>`, we'd have no problem: I could populate the suite asynchronously.
To fully populate `topSuite()` before any specs run, we have to load all the specs. I see several options, not mutually exclusive:

1. Have a process load all the spec files to populate `topSuite()`, and then exit.
2. ...
3. Redefine `topSuite` as a `Promise<Suite>`. (This would be an API-breaking change, and thus requires a major release.)

I need to hear what people think on this. I personally prefer the first option. The third option complicates the timing of releases, but might be the right thing to do.
Over the weekend, I was thinking on this quite a bit, and I realized I don't understand exactly what we mean by "parallelizable reporter". More precisely, what a reporter that sets that flag expects. "Batched" and "unbatched" are pretty clear names, I think, or maybe "buffered" / "unbuffered".
So I propose, for core, we add a boolean property (defaulting to false) called `batched`. When `jasmine.addReporter(reporter)` executes, it will look at the `batched` setting to decide whether to add the reporter to a replay reporter (which by definition is not batched for the environment) or directly to the environment.

Though I don't foresee any other batching modes than "at end of job", if there is a reasonable possibility of that, we should define `batched` to be a string instead of a boolean. I mean, I can imagine a "batched by suite" option, but it's pretty easy to write a forwarding reporter following that model.
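A minimal sketch of that routing, assuming the proposed `batched` flag; `createReporterRegistry` and its two internal lists are invented names for illustration, not jasmine-core API:

```javascript
// addReporter() routes on a `batched` property (defaulting to false).
// Batched reporters attach to a replay/aggregation path that delivers an
// ordered event stream at the end; unbatched reporters receive events
// directly as they happen.
function createReporterRegistry() {
  const direct = [];   // unbatched: live, possibly interleaved events
  const replayed = []; // batched: ordered replay at the end of the run
  return {
    addReporter(reporter) {
      (reporter.batched ? replayed : direct).push(reporter);
    },
    direct,
    replayed
  };
}
```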
Each reporter has six methods to consider. In unbatched parallel mode, I would say we can't trust `jasmineDone()` or `jasmineStarted()` to be accurate, since they'll be called several times. A batching reporter won't be better, but the parent aggregation reporter in jasmine-npm (which is batched) will clean that up as I've already demonstrated, so we can have batched reporters attach to the aggregation reporter instead.
One of the facets which caught me by surprise in my proof-of-concept implementation was that some reporters didn't implement all the methods of the Reporter interface. I think Jasmine core should provide a stub reporter base class where the methods are no-ops. I'm not locked in on this position, though.
Alternatively, I could provide a safe reporter class which checks for the existence of each method of a target reporter before we call it. I'm going to need that anyway, but the real question is should I expose that class publicly.
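The "safe reporter" idea can be sketched directly. `makeSafeReporter` and `REPORTER_EVENTS` are illustrative names; jasmine-core's `ReportDispatcher` already handles missing methods internally, so this only shows the concept:

```javascript
// The six reporter event methods in Jasmine's reporter interface.
const REPORTER_EVENTS = [
  "jasmineStarted", "suiteStarted", "specStarted",
  "specDone", "suiteDone", "jasmineDone"
];

// Wrap a target reporter so every event method exists; calls are forwarded
// only if the target actually implements that method, and missing methods
// are silently skipped instead of crashing the dispatcher.
function makeSafeReporter(target) {
  const safe = {};
  for (const event of REPORTER_EVENTS) {
    safe[event] = (payload) => {
      if (typeof target[event] === "function") {
        target[event](payload);
      }
    };
  }
  return safe;
}
```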
id properties

Digging in a little further, I now see another stumbling block: `Suite.prototype.id` and `Spec.prototype.id` might not be unique in a child-process situation as currently written. `getNextSpecId()` and `getNextSuiteId()` in Env.js (or their accompanying private variables) need to be altered in the parallelizing world. Jasmine's documentation makes no mention of the specific shape of the id beyond "it's a string", so I must ask how much freedom I have to change that too.
(This is definitely a bug in my proof-of-concept code.)
Testing uniqueness of ids should be easy: add each id to a Set and compare the size of the Set to the number of specs.
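That check, plus one illustrative fix for the cross-process collision: prefix each child-generated id with its worker's index, so "spec1" from worker 0 can't collide with "spec1" from worker 2. The prefix scheme and the function names here are just one option for discussion, not a settled design:

```javascript
// Qualify raw per-process ids with the worker's index to make them
// globally unique across child processes.
function qualifyIds(workerIndex, ids) {
  return ids.map(id => `w${workerIndex}-${id}`);
}

// The Set-based uniqueness check described above.
function allIdsUnique(ids) {
  return new Set(ids).size === ids.length;
}
```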
See the above.
By themselves, batched reporting and the new id strings are additive, and they're prerequisites for parallel reporting. I think we can add them as minor changes to the core, particularly since they don't themselves require other features.
That said, I'd very much like to get them baking in a minor version release before the parallel running code of jasmine-npm.
As I mention above, changing the API for topSuite
would be a breaking change, requiring a major release anyway.
To keep momentum going, I'll need some decisions on the jasmine-core strategy soon. I expect not to make comments or start writing code until next weekend, to give others time to respond to what everyone's said so far, but I want to get jasmine-core designs approved and changes into the pipeline first.
I'll try to find time to respond to the various comments in the next couple of days.
FWIW, I don't think we've come to a consensus on a number of key high level design questions. It's probably going to take a bit more back-and-forth to get there. Of course you're free to start coding, but that's likely to create a situation where you have to choose between doing significant rework and having your PR declined. Either of those outcomes would make me sad. But so would shipping the wrong behavior and being stuck with it for years, or even forever.
I appreciate your enthusiasm and your concern about momentum. But we're talking about a major change to a piece of software that's 13 years old, gets downloaded millions of times a week, and is maintained by volunteers in their free time. Things are just not going to happen as fast as you want them to.
Good catch on the ID collision. I hadn't thought of that.
FWIW, I don't think we've come to a consensus on a number of key high level design questions. It's probably going to take a bit more back-and-forth to get there. Of course you're free to start coding, but that's likely to create a situation where you have to choose between doing significant rework and having your PR declined. Either of those outcomes would make me sad. But so would shipping the wrong behavior and being stuck with it for years, or even forever.
I appreciate your enthusiasm and your concern about momentum. But we're talking about a major change to a piece of software that's 13 years old, gets downloaded millions of times a week, and is maintained by volunteers in their free time. Things are just not going to happen as fast as you want them to.
You're right. I guess I got a sense that this was moving faster than it really was. For what it's worth, I haven't written one line of code and I don't plan on doing so until I get some actual design approvals.
Here's my 2¢.
Opt-in or opt-out?

Opt-in, at least initially until we know everything works well. Maybe in a future release it could be changed to opt-out.
Which spec files does each worker process load?

Load assigned ones.
How is the number of worker processes determined?

Default to num CPUs / 2, but configurable.
How are files/runables allocated to worker processes?

Workers grab files when they are idle.
To what extent will randomization still be reproducible?

Randomization for tests in a single file; the seed determines order. Possibly randomize file order as well. File order won't be reproducible if different workers can grab them.
What about `fit` and `fdescribe`?

All tests loaded initially, then determine which files to load based on `fit` and `fdescribe`.

or

All tests loaded initially, then run without parallelization if `fit` or `fdescribe` exist.
Top-level `beforeEach`/`afterEach`

Throw error. Some things just don't make sense in parallelized tests. There shouldn't be any reason users can't create a `beforeEach`/`afterEach` in each `describe` and import a common function.
Top-level `beforeAll`/`afterAll`

Throw error in spec files. Allow in helpers, to be run on the main thread before/after all tests.
What about reporters?

Batch calls after each file is complete.

I would rather the workers report back at the start and end of every test, but I feel like that would probably be a lot slower.
The stopOnSpecFailure config option

Send stop signal to all workers. Workers finish the current test (including `afterEach`/`afterAll`) then stop.
Pseudocode:

```javascript
// only evaluate files with focused tests if any exist
topSuite = loadTests()
files = filesList()
if (hasFocusedTests()) {
  files = onlyFilesThatHaveFocusedTests()
}
if (config.numWorkers <= 1) {
  return runTests()
}
workers = startWorkers()
for (const worker of workers) {
  // stop all workers on failure if stopOnSpecFailure is set
  worker.onMessage('failure', () => {
    if (config.stopOnSpecFailure) {
      workers.forEach(w => w.sendMessage('stop'))
    }
  });
  // when a worker is done, the main thread sends the batched calls to
  // the reporter and sends the worker the next file to evaluate
  worker.onMessage('done', (metadata) => {
    sendNextFileTo(worker)
    sendCallsToReporter(metadata)
  });
}
```
Rearranging things a bit:
These are all interrelated, and what we choose here could have major implications for the rest of the system.
Spec/suite IDs allow reporters to link reporter events with spec/suite objects accessed through topSuite. For that to work, the IDs don't just have to be unique. They also have to be consistent. In other words, a reporter event originating in a child process must have the same ID as the corresponding spec/suite object in the parent process. I don't see a reasonable way to do that without making every process load all files, which I don't think is a good idea.
On the other hand, if topSuite isn't supported in parallel mode, reporters would only need the IDs to be unique. topSuite is useful but pretty niche. So while we can't remove it, I think it would be OK for it to work only in non-parallel mode.
I don't want to do anything that would break existing reporters in non-parallel mode. That includes making reporter methods required, requiring reporters to inherit from a base class, or assuming that reporters can handle looser event ordering guarantees. But I also don't think we need to go out of our way to make existing reporters work in parallel mode without changes.
I think it'll be OK to emit reporter events as they happen rather than batching them, with the exception of jasmineStarted and jasmineDone. I hear @UziTech's concern about performance but I suspect it won't matter. I'd rather not take on the extra complexity unless we can measure a noticeable speed difference. Batching also has the potential to introduce hard-to-reproduce bugs in some cases, like if the last file to run takes longer to run than the CI system's no-output timeout. Avoiding hard-to-reproduce crash bugs is an extremely high priority for me as a maintainer.
So here's what I'm thinking:
`reporterCapabilities: { parallel: true }`. (@ajvincent I hear your objection to this kind of naming, but I disagree. The flag should be named after the feature it enables, not how we chose to implement one aspect of that feature. We can use documentation to tell reporter authors how to support parallel mode.)

A note on implementation: We shouldn't write new code to deal with reporters that don't implement all methods. `ReportDispatcher` already solves that problem. It also encapsulates a decade's worth of hard-learned lessons on how to safely call through buggy user-provided async code. Yes, it's a -core internal, but I'd rather use it from -npm than reinvent that particular wheel.
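A rough sketch of what honoring such a flag could look like on the -npm side (the `reporterCapabilities` property matches the proposal above, but the filtering function and its use are hypothetical, not existing Jasmine code):

```javascript
// Hypothetical sketch: in parallel mode, only dispatch to reporters that
// declare support via a `reporterCapabilities` property.
function reportersForParallelMode(reporters) {
  return reporters.filter(function(reporter) {
    const caps = reporter.reporterCapabilities || {};
    return caps.parallel === true;
  });
}

// Usage: reporters without the flag would be skipped (or rejected with an
// error) when parallel mode is enabled.
const all = [
  { name: 'legacy' },
  { name: 'parallelAware', reporterCapabilities: { parallel: true } }
];
const active = reportersForParallelMode(all);
```

Whether unsupported reporters are silently skipped or cause an error is a separate design decision; the sketch only shows the capability check itself.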
These should throw exceptions in parallel mode. I don't see any reasonable alternative. If we run them in each process, suites that use them to set up external state (e.g. database) will break. If we only run them in the parent process, suites that use them to set up in-memory state will break.
Top-level beforeEach/afterEach in helper files has to work in parallel mode. It's an extremely common and useful pattern, and omitting it would prevent too many people from using parallel mode. Jasmine's own test suite relies on it. Not being able to run Jasmine's tests in parallel mode would make it much harder for me to test the feature.
On the other hand, I'm fine with breaking top-level beforeEach/afterEach in spec files when run in parallel mode. And I don't see an alternative. It would be nice to report an error in that case, but I'd also be ok with just documenting that it doesn't work reliably.
These can throw in parallel mode. They're mainly for debugging, and I don't think anyone really needs to debug and run in parallel mode at the same time. I'm not opposed to making them work in parallel mode but we shouldn't add any significant complexity to make that happen.
@UziTech:
Randomization for tests in a single file. seed determines order. Possibly randomize file order as well. File order won't be reproducible if different workers can grab them.
Agreed. That's probably the best we can do.
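To illustrate what seed-determined order within a file could look like, here's a sketch using a small seeded PRNG (mulberry32) plus a Fisher-Yates shuffle. This is not Jasmine's actual ordering code, just the general technique: given the same seed, every process reorders a file's specs identically.

```javascript
// Minimal seeded PRNG (mulberry32): deterministic floats in [0, 1).
function mulberry32(seed) {
  let a = seed >>> 0;
  return function() {
    a = (a + 0x6D2B79F5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Deterministic Fisher-Yates shuffle driven by the seeded PRNG.
function shuffleWithSeed(items, seed) {
  const rand = mulberry32(seed);
  const result = items.slice();
  for (let i = result.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [result[i], result[j]] = [result[j], result[i]];
  }
  return result;
}

// Same seed, same order -- reproducible within a file even across workers.
const a = shuffleWithSeed(['spec1', 'spec2', 'spec3', 'spec4'], 42);
const b = shuffleWithSeed(['spec1', 'spec2', 'spec3', 'spec4'], 42);
```

File-level order across workers stays unreproducible, as noted above, because it depends on which worker grabs which file first.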
It sounds like we're converging on best-effort support for that option, like Mocha does. I'm fine with that. I'd also be fine with making it an error in parallel mode, and adding support later if there's enough user interest.
Mocha and Jest have no parallel-specific configuration other than the number of processes. I think they got this right and we should follow their lead. I think that's going to be better for most users than the PoC's more complex configuration.
Mocha and Jest create a fixed number of processes and allocate files to them. @ajvincent's PoC creates one process per file. I think Mocha and Jest have this right and we should follow their lead.
There are two problems with the process-per-file model. The first is memory consumption. If many spec files load big module graphs (e.g. Material UI) there may not be enough memory to have that many copies. The second is time overhead. On my Windows machine, the median top-level suite in jasmine-core takes 47ms. But it costs ~70ms just to fork and exit a Node process that doesn't load any code. These aren't just edge cases. In my experience, many real-world codebases have both a bunch of heavyweight suites and a long tail of fast suites. We need to handle both well.
It sounds like we're all agreed on the idea of distributing files, not suites/specs, to child processes.
@UziTech:
workers grab files when they are idle.
That's what I'm leaning toward as well. It seems like it would provide the best performance. It probably adds some state management complexity to `Env`, though. When each new file arrives, the child process's `Env` would need to make sure that it runs everything defined in that file, along with `beforeEach`/`afterEach` fns defined in helpers, but not run any specs or suites defined in previous spec files. I don't think that would be too hard but I haven't thought about it very deeply yet.
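A minimal sketch of the parent-side "grab a file when idle" bookkeeping (all names here are illustrative, not jasmine-npm internals; real workers would receive files over IPC rather than a synchronous loop):

```javascript
// Parent-side scheduler sketch: a shared queue of spec files; each worker
// is handed the next file as soon as it reports being done.
function createScheduler(files) {
  const queue = files.slice();
  let running = 0;
  return {
    // Called when a worker is idle; returns the next file, or null when drained.
    next() {
      if (queue.length === 0) return null;
      running++;
      return queue.shift();
    },
    // Called when a worker finishes a file.
    done() { running--; },
    // True once every file has been handed out and completed.
    idle() { return queue.length === 0 && running === 0; }
  };
}

// Usage: drain a four-file queue. In reality `next()` feeds
// worker.send(...) and `done()` runs from the worker's 'done' message.
const scheduler = createScheduler(['a.spec.js', 'b.spec.js', 'c.spec.js', 'd.spec.js']);
const ran = [];
let file;
while ((file = scheduler.next()) !== null) {
  ran.push(file);
  scheduler.done();
}
```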
I'm neutral-ish on whether the parent process should load spec files. Does that buy us any capabilities that don't also require us to have consistent IDs across processes?
As for handling core changes: I'm really reluctant to merge parallel support code into core before the feature as a whole is ready. In my experience code that isn't used yet tends to be hard to review, hard to maintain, buggy, and often needs a redesign anyway when its eventual caller shows up. If you have specific refactorings that will make things better even if we never ship parallel support, I'm happy to consider those on a case by case basis. If you want to have core changes reviewed piecemeal and landed on a branch, I can also do that. But I'll also give the whole thing a review when it's ready to go and working.
A lot to digest there, and I'll need to read through it a few times before I compose a response.
Regarding the consistent ID problem, I think we just need a hash function, say: `:suite(${suiteCounter++})` or `:spec(${specCounter++})` (where `suiteCounter` and `specCounter` are stored in a `Map<filePath, counters>`). The colon character is not valid base64 encoding, so it'll work as a delimiter if we need to reverse it.
The part of Jasmine that generates IDs doesn't know anything about files, and can't even assume that suites are defined in files. A lot of Node-centric assumptions break down as soon as you cross into -core.
... okay, what if we passed in an id prefix from the loader into the core?
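Something like this sketch, where the function names and the exact ID format are strawman choices (the base64-prefix idea and colon delimiter come from the comment above):

```javascript
// Sketch: the Node-side loader derives a prefix per spec file and passes it
// into core, which only concatenates -- core never needs to know about files.
function makeIdGenerator(prefix) {
  let suiteCounter = 0;
  let specCounter = 0;
  return {
    nextSuiteId() { return `${prefix}:suite(${suiteCounter++})`; },
    nextSpecId() { return `${prefix}:spec(${specCounter++})`; }
  };
}

// The prefix could be e.g. the base64-encoded file path; ':' is safe as a
// delimiter because it never appears in base64 output.
const prefix = Buffer.from('spec/aSpec.js').toString('base64');
const gen = makeIdGenerator(prefix);
const id1 = gen.nextSuiteId();
const id2 = gen.nextSpecId();
```

Because each process derives the prefix from the same file path, the same suite in the same file gets the same ID in parent and child, without every process having to load every file.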
Top level beforeAll/afterAll
These should throw exceptions in parallel mode. I don't see any reasonable alternative. If we run them in each process, suites that use them to set up external state (e.g. database) will break. If we only run them in the parent process, suites that use them to set up in-memory state will break.
I think we need a way to run something before and after all tests (e.g. to set up/tear down a database). The once-per-process case can be handled inside a `describe`.
I think we need a way to run something before and after all tests (e.g. to setup/teardown a database).
That's probably the motivation behind Mocha's global fixtures. I'm not sure how closely I'd want to copy that design, but I think the idea of having a clearly separate construct for exactly-once setup and teardown has merit.
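If we did go the separate-construct route, the user-facing shape might be something like this (purely hypothetical names, loosely analogous to Mocha's global fixtures; nothing here is an existing Jasmine API):

```javascript
// Hypothetical exactly-once fixtures, run only in the parent process,
// before any worker starts and after all workers finish. The names
// globalSetup/globalTeardown are invented for illustration.
const globalFixtures = {
  async globalSetup() {
    // e.g. start a database server that the child processes will connect to
  },
  async globalTeardown() {
    // e.g. shut that database server down after all workers finish
  }
};
```

Keeping these in a dedicated file, separate from helpers, would make it unambiguous that they run exactly once rather than once per process.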
A lot of Node-centric assumptions break down as soon as you cross into core.
If we need to keep core environment agnostic, will we be able to launch the workers from core? Or will we need to run core in workers? It seems like jest and mocha in parallel mode assume a nodejs environment.
If we need to keep core environment agnostic, will we be able to launch the workers from core? Or will we need to run core in workers? It seems like jest and mocha in parallel mode assume a nodejs environment.
Core definitely needs to run in each worker, since everything that spec files expect is implemented in core. I could see the coordinator code in the parent process being done in either of two different ways:
The first option is what the PoC does, and it's the first thing I'd try if I was doing this myself.
I've been tied up the last couple weeks, but this has been on my mind.
I plan on preserving that as a non-parallel option only. My intent was to load the non-parallel spec files first (which would trigger beforeAll()), then fire off the parallel runners, and only when they're all finished, run the non-parallel specs (which would eventually trigger afterAll()). Of course, this assumes (probably incorrectly) that I won't have to tweak the run algorithms, but the intent is the same.
Keeping a specific number of parallel subprocesses alive as runners, I think, means another important change to Jasmine core: incremental spec loading inside a subprocess. That is:
This is my first impression on how I'd do this.
I think so, but I'm not sure if we're resolving these questions. @sgravrock, your thoughts?
@ajvincent I just realized that you've been talking about non-parallel spec files in your last few messages, and I didn't pick up on it. What is a non-parallel spec file? What use case do they serve? Do we need them, given that Mocha and Jest have been successful with an all-or-nothing parallelization scheme?
My intent was to load the non-parallel spec files first (which would trigger beforeAll()), then fire off the parallel runners, and only when they're all finished, run the non-parallel specs (which would eventually trigger afterAll()).
I think allowing any form of top level beforeAll/afterAll in parallel mode is a mistake. Either choice (running them once in the parent process or running them in each child) is going to be confusingly wrong for a lot of users. We might be better off with something like Mocha's design where there is a completely separate way to specify exactly-once operations in parallel mode. But this isn't something we need to solve right now: I could work on it once we have working code in place for the rest of parallel mode. @UziTech has persuaded me that we need some sort of exactly-once setup and teardown, but we might actually be better off figuring out the details later.
The latter is probably best done via recreating the child reporter, but I don't know.
I don't think we need to do that. The report dispatcher (somewhat confusingly called `reporter` in some places) is already stateless except for the list of reporters it dispatches to. My expectation is that in parallel mode, the only configured reporter on the child will be the one that forwards events to the parent. That can probably be stateless too, and if not it should be a simple matter to reset it.

User-provided reporters are normally configured through Jasmine#addReporter, which will have the effect of attaching them only to the parent's `Env`. So we shouldn't have to worry about them in the child.
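That child-side forwarding reporter can indeed be a thin, stateless shim. A sketch (the reporter event names are Jasmine's real ones, but the message shape and the `send` wiring are assumptions):

```javascript
// Sketch of a stateless child-process reporter that relays each reporter
// event to the parent over the IPC channel.
function createForwardingReporter(send) {
  const events = ['suiteStarted', 'suiteDone', 'specStarted', 'specDone'];
  const reporter = {};
  for (const eventName of events) {
    reporter[eventName] = function(payload) {
      send({ type: 'reporterEvent', eventName, payload });
    };
  }
  return reporter;
}

// In a real child: createForwardingReporter(msg => process.send(msg)).
// Here `send` is stubbed to show the relayed message shape.
const sent = [];
const reporter = createForwardingReporter(msg => sent.push(msg));
reporter.specDone({ id: 'spec0', status: 'passed' });
```

Because the shim holds no state of its own, there is nothing to reset between files; jasmineStarted/jasmineDone would be synthesized once by the parent, per the earlier discussion.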
How hard is it for jasmine-npm to completely destroy any vestiges of jasmine-core state?
Easy (just create a new `Env`), but we can't do it because it would break top-level `beforeEach` and `afterEach`.
Are there shared services? Are we going to run into a shared state problem which may have killed the last attempt to do parallelization?
There were a number of problems that killed the last attempt. You're right that shared state was one of them. I think/hope that #1934 already takes care of most of that. I wouldn't be surprised if it's also necessary to remove children of the top suite in between batches. We might also need to do something to either preserve the random seed between batches or reset it, depending on which seems like the better behavior.
I've done a bunch of refactoring recently to split `Env` into smaller classes. I hope that makes the state management somewhat easier to follow than it was before.
Is this where worker threads come into play, where each thread can load Jasmine as an independent copy?
I don't see why we need worker threads when we're already creating processes.
Are we approaching consensus on a design?
I think so, but I'm not sure if we're resolving these questions. @sgravrock, your thoughts?
I hope so. Here's what I'm thinking about the key decisions:
Combining the next two quotes together:
What is a non-parallel spec file? What use case do they serve?
...
@UziTech has persuaded me that we need some sort of exactly-once setup and teardown
This is one of two primary use cases for non-parallel spec files. The other is for specs the user - for whatever reason - didn't want to run in parallel mode. I will refer you again to https://github.com/jasmine/jasmine-npm/issues/195#issuecomment-1142560241 , specifically the very first section listing a six-step algorithm.
I think allowing any form of top level beforeAll/afterAll in parallel mode is a mistake.
I have agreed all along. Emphasis in parallel mode. If a particular spec file in `spec/support.json` is not brought into a parallel configuration as an opt-in, it stays non-parallel, and we allow `beforeAll()`/`afterAll()`.
This is why another of my comments, https://github.com/jasmine/jasmine-npm/issues/195#issuecomment-1135374456 , referred to `TopConfiguration` as separate from (and extending) `Configuration`.
How hard is it for jasmine-npm to completely destroy any vestiges of jasmine-core state?
Easy (just create a new `Env`), but we can't do it because it would break top-level `beforeEach` and `afterEach`.
If that's what happens, and we want to have no more than -j child processes running, that might be exactly what we need. A child process, under the latest discussion, takes exactly one spec file, loads it, runs it, and then clears the environment for the next spec file to load.
You've stated previously:
On the other hand, I'm fine with breaking top-level beforeEach/afterEach in spec files when run in parallel mode.
I hope so. Here's what I'm thinking about the key decisions:
- Top-level beforeEach/afterEach: Has to work in parallel mode.
I think I've missed something critical here. There's top-level in a spec file loaded in parallel mode (meaning "files loaded as parallelized spec files") and there's top-level in a spec file loaded in non-parallel mode (meaning "files loaded the current way, at the top level"). Both of these have (as potentially illegal - I'll get to that in a moment) `beforeAll()`, `beforeEach()`, `afterEach()` and `afterAll()` as top-level functions.
We probably need to lay this out in a Markdown table to illustrate:
| Top-level Function | In non-parallel spec file | In parallel spec file |
|---|---|---|
| `beforeAll()` | Global setup | Throws exception |
| `beforeEach()`, `afterEach()` | Applies to non-parallel spec files. Unclear if they apply to parallel files; if they do, how do we apply them? | Applies to only that spec file |
| `afterAll()` | Global teardown | Throws exception |
As I've said before, I envisioned non-parallel spec files as your global setup / teardown mechanism.
I think top-level `beforeEach()` and `afterEach()` in a parallel-loaded spec file should only apply to that spec file. The only reason this wouldn't be a breaking change is that we haven't had parallel specs before.
Please feel free to copy and modify the above table to clarify your thinking.
I think, if we really want to share `beforeEach()` and `afterEach()` definitions across parallel spec files, or from non-parallel spec files to parallel ones, we would want to encourage users to leverage CommonJS's `require()` or ECMAScript's `import` statements to do so. Suggest they break out the shared `beforeEach()`/`afterEach()` functions into files they can import.
given that Mocha and Jest have been successful with an all-or-nothing parallelization scheme?
I agree we should do all or nothing. If someone wants some tests to be parallel and some not, they can just run jasmine twice: once in parallel mode and once without. Mixing them would be very difficult and result in some things working differently than they would otherwise.
if we really want to share beforeEach() and afterEach() definitions across parallel spec files
I really don't think top-level beforeEach() and afterEach() in spec files in parallel mode is feasible. We would have to load every file in every process just to check if there are any top-level functions. I think top-level functions should only be allowed in helper files in parallel mode. We can load every helper file in every process.
Overall execution model: Jasmine creates the specified number of child processes and allocates files to them. Helper files are loaded in every child process. Each spec file is only loaded in the child process that will run it.
Is this jasmine-npm? I think we really need to differentiate jasmine-npm from jasmine-core. I think there is some confusion when saying "jasmine".
This means running jasmine in the browser would not allow parallel mode?
I agree we should do all or nothing. If someone wants some tests to be parallel and some not, they can just run jasmine twice: once in parallel mode and once without. Mixing them would be very difficult and result in some things working differently than they would otherwise.
(Update: This is a really good point - running jasmine-npm twice - one I hadn't considered fully. I'll buy into that. If someone really wants it to appear like a combined session, they can easily write a bootstrapping reporter like I did in my PoC.)
How would we do global setup and teardown then? That's the part I don't have an answer for, except by some other change which states in the top configuration "these files jasmine-npm must execute before any parallel specs run".
(Note: I'm not wedded to the idea of a "top configuration" versus a "child configuration". That was an early thought experiment which I'm willing to ditch for consensus.)
I'm not opposed, exactly, but we have to support global setup / teardown somehow. Allowing some files to be treated as non-parallel spec files seemed like the simplest option.
Please, do propose a schema change for a parallel-only jasmine-npm configuration which would achieve this. I'm open to suggestions.
if we really want to share beforeEach() and afterEach() definitions across parallel spec files
I really don't think top-level beforeEach() and afterEach() in spec files in parallel mode is feasible. We would have to load every file in every process just to check if there are any top-level functions. I think top-level functions should only be allowed in helper files in parallel mode. We can load every helper file in every process.
Honestly, I'm okay with this for forward compatibility: throwing exceptions in an earlier version for behavior that a later version may choose to support is a safe way to go. We can rule it out for now and let a future version support it.
This would change my above table:
| Top-level Function | In non-parallel spec file | In parallel spec file |
|---|---|---|
| `beforeAll()` | Global setup | Throws exception |
| `beforeEach()`, `afterEach()` | Applies to non-parallel spec files; no impact on parallel spec files | Throws exception |
| `afterAll()` | Global teardown | Throws exception |
This means running jasmine in the browser would not allow parallel mode?
Right now I see jasmine-browser-runner as out of scope for this feature. Again, we can rough this out here and figure out how to add support there later.
Given @sgravrock 's answer that we are approaching consensus (and I agree - we're fiddling over top-level, which is an important detail but the only one of contention), I think it's about time to formally define the schemas we intend jasmine-npm to use, both for `spec/support.json` and for the internal configuration object, to support parallelization. Consider my initial `TopConfiguration` and `Configuration` definitions to be throw-away code.
Question number 1 for the jasmine-npm schema is "How do we tell Jasmine it should run some (or all) spec files in parallel mode?"
Again, I'm not writing code yet.
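Just to make question 1 concrete, here's one strawman shape for the configuration, following the Mocha/Jest "only a worker count" direction discussed above (every parallel-related key here is invented for discussion, not a committed design):

```javascript
// Hypothetical spec/support/jasmine.json shape, shown as a JS object:
// all-or-nothing parallelism, with only a worker count as the
// parallel-specific configuration.
const exampleConfig = {
  spec_dir: 'spec',
  spec_files: ['**/*[sS]pec.js'],
  helpers: ['helpers/**/*.js'],        // loaded in every worker process
  parallel: {                          // hypothetical keys
    enabled: true,
    numWorkers: 4
  }
};
```

An equally plausible alternative is no config key at all, just a CLI flag that names the worker count; the schema question is exactly which of these we pick.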
Side note for discussion: we're going to have at least three different reporters in `jasmine-npm/lib/reporters`: the existing console reporter, an aggregator reporter and a child process reporter.

I'm wondering if there are other reporter classes "in the wild" we might want to add to `jasmine-npm` as part of the package? (Since we're targeting 5.0 anyway.) I'd like to see a separate issue for nominations.
Given @sgravrock 's answer that we are approaching consensus (and I agree - we're fiddling over top-level, which is an important detail but the only one of contention) [...]
I said no such thing. What I said was "I hope so". I then outlined my current thinking. You responded by disagreeing on at least two key points, including pushing for a breaking change that I'd previously said was out of the question. I don't understand how you get from there to claiming, with italics even, that I'd said we were approaching consensus.
We've been talking past each other quite a bit in this thread. We have different goals, priorities, and ways of approaching software design. You've been slow to accept feedback that doesn't fit with your plans and too quick to propose breaking existing user code. I'm willing to own my share of the communication gap, and I was willing to work through all of that to help you make your contribution.
But I draw the line at obvious false claims about what I said. I don't know if you're trying to manipulate me or if you truly heard "yes" when I said "I hope so". And I don't really care. Either of those is fatal to our collaboration.
I'm not going to work with you any more. I'm sorry. I really hoped this would work out, but it didn't.
I thank you for the earlier part of this discussion. It's done a lot to firm up my understanding of how parallel support should work from the user's perspective. It's also taught me that adding parallel support will take even deeper knowledge of even more of Jasmine than I originally thought. I don't think this is a good feature for most contributors to take on. I also don't think I have the time or energy to guide someone through it at a reasonable pace. My plan is to try to build it myself sometime between now and 5.0.
Parallel support is now available in 5.0.0-alpha.0. See https://jasmine.github.io/tutorials/running_specs_in_parallel for details.
Proof of concept: https://github.com/ajvincent/composite-collection/tree/main/build/jasmine-parallel
Each jasmine-npm runner runs as a child process, so they don't have shared memory. Using this code, I found my Jasmine specs ran 80% faster, cutting about 4 minutes from my `npm run bootstrap` process. I'm currently running over 1000 specs in two passes each, so this is a significant win.

I'm willing to do the work to add this. I know some work would have to go into `spec/support.json` to define specs that can run parallel to each other. I also know that sometimes there are resources that must be set up before all parallel specs and torn down after all parallel specs, and that some specs may still want to run sequentially after the parallel run.