golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
121.46k stars 17.4k forks source link

proposal: cmd/go: add an option to sequentially run tests #61318

Open howardjohn opened 1 year ago

howardjohn commented 1 year ago

It is a common pattern to want to run Go tests in sequence rather than in parallel. Within a package, this is controlled by t.Parallel() and -test.parallel. However, across packages there is a different control: -p N (typically -p 1 for this use case).

This is commonly used for integration tests which are depending on exclusive access to some resource outside of the test.

However, this comes at a huge cost: -p 1 does not control just test parallelism, but all compiler actions. This means all compiling and linking is done sequentially. In our project, this has caused a 10x increase in test execution times - with -p 1, it takes 8m25s; without, 50s.

I would like a way to run test packages sequentially, while still benefiting from the parallelism in building.

Ecosystem usage

I did a rough analysis of usage of go test -p 1 across Github, and found a number of large projects doing the same. I use stars just as a rough measure of the project scope.

Code Link Stars
https://github.com/ory/hydra/blob/425c977a3aa4e519762582e0be4b1adf5043f383/.docker/Dockerfile-hsm#L33 14k
https://github.com/letsencrypt/boulder/blob/b090ffbd2ea01407ad6e580a9cea78db6548d942/README.md?plain=1#L187 4.7k
https://github.com/mongodb/mongo-go-driver/blob/f8b88fc241a91c32455f61c57557921ba7d17d2d/Makefile#L106 7.5k
https://github.com/netlify/gotrue/blob/3fabd3fc5a6fb51eb54cd88831a380a4c18a2eff/Makefile#L30 3.5k
https://github.com/netlify/gotrue/blob/3fabd3fc5a6fb51eb54cd88831a380a4c18a2eff/Makefile#L30 4.2k
https://github.com/atomix/atomix/blob/6decbb2531c39254979ff1f10e65e00c6e1dca99/controller/Makefile#L26 2.3k
https://github.com/Shopify/ghostferry/blob/3dc7ace25fad9c0e9c5c5f68c8de70af2bd08d46/Makefile#L66 600
https://github.com/vulcand/vulcand/blob/16e7d32893f7629e1e9b8fae8a0d9023b7cd338f/Makefile#L11 3k
https://github.com/gliderlabs/logspout/blob/818dd8260e52d2c148280d86170bdf5267b5c637/Makefile#L65 4.6k
https://github.com/istio/istio/blob/1cc44eed185626832487f56d74b0f18284d52b63/tests/integration/tests.mk#L84 33.3k
https://github.com/nuclio/nuclio/blob/ef5de182e9a22ebd48b68ad043a423bdf63cac90/Makefile#L689 4.9k
https://github.com/ory/kratos/blob/41b7c51c1c6b3bdff9e9ea8bb5e455e3c15c5256/Makefile#L79 8.9k

If you search "go run tests sequentually", basically all sources will tell you to use -p 1. This includes AI chatbots, popular stack overflow answers, blogs, and even some Go core maintainer recommendations.

Alternatives

ianlancetaylor commented 1 year ago

CC @bcmills @matloob

rsc commented 1 year ago

To clarify, you are looking for a way to run each package test one at a time? What about the tests inside a given package that are using t.Parallel? Do you want to disable that parallelism too?

howardjohn commented 1 year ago

To clarify, you are looking for a way to run each package test one at a time?

Yes, only one package should execute at a time during go test -somenewflag ./.... Similar to go test -p 1 ./..., but without the side effect of compilation also being serialized.

What about the tests inside a given package that are using t.Parallel? Do you want to disable that parallelism too?

I think keeping inner-package parallelism is fine and could be controlled with the existing -test.parallel

bcmills commented 1 year ago

Put synchronization into the test binary itself. This is challenging and ineffective. Things like mutex cannot be used as it is cross-binary, so we need an external locking system.

Compare #33974, which would add a lockedfile.Mutex that could be used by such tests.

JeremyLoy commented 2 months ago

At work, we have a mix of NodeJS and Golang services. Those services communicated with a database, and as such have tests that exercise queries against a Dockerized database. As that is a shared resource, tests that hit the database must be serialized, while tests that operate purely in memory are allowed to be parallelized.

The NodeJS services have a pretty easy way of solving this using a custom Jest runner, which I'll link below. Our way of doing this in Golang is by putting all of the tests that use an external resource (like a database) in a singular package so that they run sequentially. This causes the test files to be located not alongside the source files though, which is a shame.

In the NodeJS way, the stream of tests that must be sequential is defined by filename. That isn't very robust and is error prone.

I think there should be a way to do this in Go using a testing.T method, something like t.Mutex("aKeyToIdentifyResource"). That way each test is self contained, and the package/file/folder organization isn't impacted by which tests need exclusive access to a particular resource. Multiple distinct resources can exist as well.

import { Config } from "@jest/types";
import TestRunner, { Test, TestEvents, TestRunnerContext, TestWatcher, UnsubscribeFn } from "jest-runner";

export default class CustomTestRunner {
    public supportsEventEmitters = true;

    private parallelRunner: ParallelTestRunner;
    private sequentialRunner: SequentialTestRunner;

    constructor(globalConfig: Config.GlobalConfig, context: TestRunnerContext) {
        this.parallelRunner = new ParallelTestRunner(globalConfig, context);
        this.sequentialRunner = new SequentialTestRunner({ ...globalConfig, maxWorkers: 1 }, context);
    }

    on<Name extends keyof TestEvents>(
        eventName: Name,
        listener: (eventData: TestEvents[Name]) => void | Promise<void>,
    ): UnsubscribeFn {
        const parallelUnsubscribe = this.parallelRunner.on(eventName, listener);
        const sequentialUnsubscribe = this.sequentialRunner.on(eventName, listener);

        return () => {
            parallelUnsubscribe();
            sequentialUnsubscribe();
        };
    }

    async runTests(tests: Test[], watcher: TestWatcher): Promise<void> {
        await Promise.all([
            this.sequentialRunner.runTests(tests, watcher),
            this.parallelRunner.runTests(tests, watcher),
        ]);
    }
}

class ParallelTestRunner extends TestRunner {
    async runTests(tests: Test[], watcher: TestWatcher): Promise<void> {
        const isParallelizeable = (test: Test) => !test.path.includes(".sequential.");
        const parallelizeableTests = tests.filter(isParallelizeable);

        await super.runTests(parallelizeableTests, watcher, { serial: false });
    }
}

class SequentialTestRunner extends TestRunner {
    async runTests(tests: Test[], watcher: TestWatcher): Promise<void> {
        const isSequentialTest = (test: Test) => test.path.includes(".sequential.");
        const sequential = tests.filter(isSequentialTest);

        // { serial: false } triggers the parallel test execution leveraging workers,
        // but along with { maxWorkers: 1, workerIdleMemoryLimit: ... } configuration
        // this sole worker would be killed and restarted if it exceeds the memory threshold.
        await super.runTests(sequential, watcher, { serial: false });
    }
}
matloob commented 2 months ago

cc @samthanawalla

I think we should think harder about if there's a way to better support having the individual tests synchronize access to the external resource. I wonder if there's a better way to act on tests being blocked.