Refactor tests to be runnable on specific backends

shazow commented 8 years ago

It'd be really useful to run the test suite on specific backends, rather than requiring the suite to run on all-or-nothing, especially since many of the backends require specific system-level package requirements.

Problem

Right now, the test suite hardcodes all the backends in each test scenario. It looks like this:

    tests := map[string]struct {
        repo interface {
            ResolveBranch(string) (vcs.CommitID, error)
        }, ...
    }{
        "git libgit2": {
            repo:         makeGitRepositoryLibGit2(t, gitCommands...), ...
        },
        "git cmd": {
            repo:         makeGitRepositoryCmd(t, gitCommands...), ...
        },
        "hg native": {
            repo:         makeHgRepositoryNative(t, hgCommands...), ...
        },
        "hg cmd": {
            repo:         makeHgRepositoryCmd(t, hgCommands...), ...
        },
    }

These test sets are hardcoded in every test suite scenario. It's impossible to tell the test suite to only run it on the libgit2 implementation, for example.

Proposal

I propose refactoring the tests so that each suite scenario is only aware of a single vcs interface implementation, and it's called N times for each implementation we want to test. Additionally, we could add the ability to select which backend should be tested using go -tags.

For example, go test -tags gitcmd should only run the test suite on the gitcmd backend.

Steps

Abstract the setup functions (make*Repository*) behind a common interface that can be provided.
Replace the test backend matrix with either a registry (probably simpler) or a parameterized test runner. (Thoughts?)

shazow commented 8 years ago

I'm open to suggestions about the implementation, but I'm currently imagining something like backends_test_{git,gitlib2,hg,hgcdm,etc}.go where each one registers the appropriate interface implementation with the suite runner (thinking of using something like testify's suite). Selecting which backend is available during the test could be done using something like build constraint tags.

If we have a // +build gitlib2 header on backends_test_gitlib2.go, then call go test -tags gitlib2, then it would run it just for that backend. Could also do go test -tags gitlib2,hg,hgcmd for example.

By default, I'd like for all the backends to run, unfortunately I'm not sure that's possible with build constraint tags. We could have // +build gitlib2 all and then run go test -tags all, but it would still only run iff gitlib2 or all tag is specified. Need to investigate this more.

dmitshur commented 8 years ago

something like backends_test_{git,gitlib2,hg,hgcdm,etc}.go

Minor comment, test files need to end with _test.go, so perhaps backend_{git,gitlib2,hg,hgcdm,etc}_test.go.

By default, I'd like for all the backends to run, unfortunately I'm not sure that's possible with build constraint tags.

Having all backend tests run by default, without needing extra options (and therefore people to read something in README) is definitely very preferable.

I think it might be possible to use build tags in another way to make that happen. Use them as "negative build tags" to speak. For example, if there are 2 backends git and gitcmd, you can have // +build !gitcmd inside backend_git_test.go and // +build !git inside backend_gitcmd_test.go. If there are 3 backends, you'll need each one to have N-1 negated build tags.

But that way you can still run an individual backend, and default behavior is to run all.

shazow commented 8 years ago

Having all backend tests run by default, without needing extra options (and therefore people to read something in README) is definitely very preferable.

Could you expand on this motivation?

For CI, it's not hard to add a go test -tags all. For somebody checking out the project for the first time, it's likely they won't have or need all the dependencies to run all the backends. As long as they run the tests on the backend they're changing, that's all that matters right?

(Unrelatedly, I noticed that the tests make tons of temporary directories which don't get cleaned up.)

shazow commented 8 years ago

Looking at this more, we'll need somekind of setup/teardown per baseline repo type (hg and git right, where we use gitCommands and hgCommands). Are there plans to add more repo types, or this it for now?

Right now I'm imaging a flow like this for each test:

func TestRepository_Foo(t *testing.T) {
    gitTests := GitTests{
        SetUpCommands: []string{
            ...
        },
        TearDownCommands: []string{
            ...
        },
    }
    hgTests := HgTests{
        ... // (Same as GitTests but with hg-specific setup/teardown)
    }

    // Each test suite type comes with a package-global registry for the implementaitons available.
    wantCommitId := "..."
    gitTests.Run(func(repo vcs.Repository) {
        commitID, err := repo.ResolveBranch(test.branch)
        if err != nil {
            t.Errorf("%s: ResolveBranch: %s", label, err)
            continue
        }

        if commitID != wantCommitID {
            t.Errorf("%s: got commitID == %v, want %v", repo.String(), commitID, wantCommitID)
        }
    })

    hgTests.Run(...) // Same deal here, but with hg-specific want params.
}

In fact, the differing parameters could be abstracted away in a closure as such:

func TestRepository_Foo(t *testing.T) {
    ...

    func FooTester(wantCommitId vcs.CommitId) func(vcs.Repository) {
        return func(repo vcs.Repository) {
            commitID, err := repo.ResolveBranch(test.branch)
            if err != nil {
                t.Errorf("%s: ResolveBranch: %s", label, err)
                continue
            }

            if commitID != wantCommitID {
                t.Errorf("%s: got commitID == %v, want %v", repo.String(), commitID, wantCommitID)
            }
        }
    }

    gitTests.Run(FooTester("gitcommitidhere"))
    hgTests.Run(FooTester("hgcommitidhere"))
}

It's not ideal and a fairly big change, so it may be worth punting on it for now. The only other method I can think of where we can avoid having at least N definitions per test (N = number of baseline repo types) is if each repo type has its own fork of the tests (which might not be a terrible idea).

dmitshur commented 8 years ago

Sorry, I haven't had much time to look at this.

Could you expand on this motivation?

For CI, it's not hard to add a go test -tags all. For somebody checking out the project for the first time, it's likely they won't have or need all the dependencies to run all the backends. As long as they run the tests on the backend they're changing, that's all that matters right?

It's just simpler, both for existing developers/maintainers of this library, and new people, in my opinion. I think it's easier to just follow the directions to get all backends setup and run go test -v ./... and be sure everything is passing, than to try to figure out how to run all specific endpoints and hope you haven't missed any. When I encounter a new repo and realize that running its tests is something other than go test ./..., I see that as a flaw, not an improvement.

Usually when working on a feature, I would limit it to just a single test with go test -run=JustThisOneTest ./vcs or so. I have one repository that uses 3 build tags to separate functionality/tests (compare to this much simpler one), and IMO the extra complexity is not worth it (I'm working on removing those build tags).

Are there plans to add more repo types, or this it for now?

There are no short-term plans that I'm aware of. But the go-vcs interface is quite general, and it's possible either we or someone else would want to implement it for some future VCS. I don't think anything new will be added anytime soon though, since git and Mercurial are pretty popular.

shazow commented 8 years ago

Fair enough. In that case, I'd propose making an interface-specific/implementation-agnostic test suite helper in the vcs package, then moving each implementation-specific test calls into the respective sub-packages. This way if we run go test ./... it will run the entire subtree of tests. If we run go test ./git/... it will only run the git tests, etc.

Downside is the instrumentation for each vcs type will live separately, but I think that's a reasonable compromise. Thoughts?

sourcegraph / go-vcs