gonum / gonum

Gonum is a set of numeric libraries for the Go programming language. It contains libraries for matrices, statistics, optimization, and more.
https://www.gonum.org/
BSD 3-Clause "New" or "Revised" License

all: recent (?) increase in time to run tests on Travis #799

Closed by sbinet 5 years ago

sbinet commented 5 years ago

not sure how critical this is (but having an "agile" CI infrastructure is really important to me): it seems each entry in the Travis build matrix takes about 20 min to build+test, so the total for each PR is around 5h.

I think it's a bit much:

$> go get -t -v ./...        ##  15s
$> go test -a $TAGS -v ./... ## 454s

here is the list of top consumers:

ok      gonum.org/v1/gonum/lapack/gonum 159.983s
ok      gonum.org/v1/gonum/stat/distuv  127.333s
ok      gonum.org/v1/gonum/mat  90.025s
ok      gonum.org/v1/gonum/graph/community  88.905s
ok      gonum.org/v1/gonum/graph/path   46.357s
ok      gonum.org/v1/gonum/optimize 40.068s
ok      gonum.org/v1/gonum/stat/distmv  34.287s
ok      gonum.org/v1/gonum/optimize/convex/lp   33.817s
ok      gonum.org/v1/gonum/fourier  22.269s
ok      gonum.org/v1/gonum/graph/graphs/gen 17.038s
ok      gonum.org/v1/gonum/mathext  6.792s
ok      gonum.org/v1/gonum/stat/samplemv    5.381s
ok      gonum.org/v1/gonum/mathext/internal/amos    5.119s
ok      gonum.org/v1/gonum/stat/sampleuv    3.686s
ok      gonum.org/v1/gonum/blas/gonum   2.841s
ok      gonum.org/v1/gonum/integrate    2.339s
ok      gonum.org/v1/gonum/graph/topo   1.148s

extracted manually (so some errors are not unexpected) from https://travis-ci.org/gonum/gonum/jobs/479273645

perhaps something could be done to reduce the time it takes to run all these tests?

WDYT?

BTW, shouldn't these lines:

script:
 - go get -d -t -v ./...
 - go build -v ./...
 - go test -a $TAGS -v ./...

read instead:

script:
 - go get -d -t $TAGS -v ./...
 - go build $TAGS -v ./...
 - go test -a $TAGS -v ./...

?

btracey commented 5 years ago

I spent some time a while ago speeding up the distuv tests, but they keep creeping up as we add more distributions. I'm not sure that much can be done except to loosen tolerances (which are already less than ideal). I also don't think much can be done about the Lapack test time. The implementations have a significant switch in behavior between blocked and unblocked code, which requires having large matrices to test the outputs. I think the bulk of the time in mat is just that we have a lot of tests and a lot of combinations (though I can check), so again, I'm not sure much can be done there.

The optimize tests can definitely be sped up. The bulk of the time is testing that gradient descent eventually converges, and we can probably set a better limit on that, especially given how simple the algorithm is.
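A minimal sketch of the kind of cap being suggested, assuming a test that drives optimize.Minimize directly; the objective function and the iteration limit below are illustrative, not the actual Gonum test fixtures:

```go
package main

import (
	"fmt"
	"log"

	"gonum.org/v1/gonum/optimize"
)

func main() {
	// Simple convex objective with a known minimum at (1, 2);
	// illustrative only, not one of the Gonum test functions.
	p := optimize.Problem{
		Func: func(x []float64) float64 {
			return (x[0]-1)*(x[0]-1) + (x[1]-2)*(x[1]-2)
		},
		Grad: func(grad, x []float64) {
			grad[0] = 2 * (x[0] - 1)
			grad[1] = 2 * (x[1] - 2)
		},
	}

	// Bound the work the optimizer may do so a convergence test
	// cannot run for an unbounded number of iterations.
	settings := &optimize.Settings{
		MajorIterations: 200, // illustrative cap, not a Gonum default
	}

	result, err := optimize.Minimize(p, []float64{0, 0}, settings, &optimize.GradientDescent{})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(result.Status, result.X, result.Stats.MajorIterations)
}
```

In a test, hitting such a cap would fail the run instead of letting a slow-converging case spin, which is roughly the "better limit" mentioned above.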

I don't know about the graph tests.

kortschak commented 5 years ago

While I would like to see shorter test times, there is little room to change the code to achieve that while retaining reasonable assurance from the tests. I'm also not particularly upset about the Travis build turnaround for a PR (the rate-limiting step in getting a PR merged is not their machine time); local testing should always be done before pushing, and that is where a tight turnaround matters. Having said that, improvements from caching are definitely worth aiming for (from both a time and an energy-efficiency perspective).

I had a look at the two most expensive graph packages and they are both tightly focused; both have a set of graphs that they work on with reasonable tests (in the case of community, a reasonable amount of unit testing as well, because of the complexity of the algorithm). There are a few algorithms in path, and community detection is NP-hard, so I reckon we're doing quite well.

The change you show probably should be made, but is not that important.

sbinet commented 5 years ago

addressed by #808.

vladimir-ch commented 5 years ago

The Lapack tests take a long time because the inputs allow a lot of combinations that trigger different code paths and states, so the test cases explode combinatorially if we want (and we do) to cover them all. One of the most important variables is the matrix size, which affects whether the blocked or unblocked code is used. The threshold is controlled by what Ilaenv returns, which forces us to use large matrices in the tests, and this is slow. What we could do is have a mock implementation of Ilaenv for tests that returns a smaller block size, so we could then use smaller matrix sizes in the tests. This would be quite some work and I'm not sure it would be warranted, although I admit that the prospect of having even local testing in lapack run considerably faster than it does now is quite attractive.
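A hypothetical sketch of that idea, not current lapack/gonum code: it assumes the block-size query were routed through a swappable function variable so a test could force the blocked code path with small matrices (all names below are invented for illustration):

```go
// Hypothetical sketch only: lapack/gonum has no such hook today, and the
// names here (ilaenvHook, useBlockedPath) are made up for illustration.
package blockhooksketch

// ilaenvHook stands in for the Ilaenv block-size query that blocked
// routines would consult; making it a variable lets tests swap it out.
var ilaenvHook = func(ispec int, name string, n int) int {
	return 64 // production-like block size forces large test matrices
}

// useBlockedPath models a routine deciding between its blocked and
// unblocked implementations based on the problem size.
func useBlockedPath(name string, n int) bool {
	return n >= ilaenvHook(1, name, n)
}
```

A test could then temporarily replace ilaenvHook with a function returning, say, 4, exercise the blocked path on tiny matrices, and restore the original with defer; the cost is keeping the hook consistent with the real Ilaenv in production builds.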

kortschak commented 5 years ago

@vladimir-ch Once these are in the cache, that should not be a problem. Are you still seeing this overall lapack time when only a single routine there is changed?

vladimir-ch commented 5 years ago

I didn't check, but when I touch Dgesvd or other heavy functions, having the tests finish quickly can make a significant difference even when only a single routine is changed. Reducing the block size somehow would be the only way to bring the testing time down.

kortschak commented 5 years ago

Yes you're right, any change to a package will invalidate the cached build/test for that package and all dependent packages.

I'm sort of reluctant to have different block sizes in testing and production. A possibility is to have the smaller block sizes for the PR and push tests and then the production block size for gonum/gonum master once the merge has happened, as we do for coverage. This would not capture block-size-dependent stability problems before merge, but it would guarantee that they would be caught at some point.
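If it came to that, one possible way to wire it up (assumed, not existing Gonum test code; the environment variable name is invented for this sketch) would be for the test harness to pick the block size from the CI environment, small for PR/push builds and production-sized on master:

```go
// Hypothetical helper, not part of the Gonum test harness; the variable
// name GONUM_TEST_SMALL_BLOCKS is invented for this sketch.
package cisketch

import "os"

// testBlockSize returns a tiny block size when the CI sets
// GONUM_TEST_SMALL_BLOCKS (e.g. on PR and push builds), and the
// production-like size otherwise (e.g. on gonum/gonum master).
func testBlockSize() int {
	if os.Getenv("GONUM_TEST_SMALL_BLOCKS") != "" {
		return 4
	}
	return 64
}
```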