UnkindPartition / tasty

Modern and extensible testing framework for Haskell

Slow start to the tests #175

Closed LeventErkok closed 6 years ago

LeventErkok commented 7 years ago

I have a test suite with > 30K test cases in it. When I run the test suite directly, it takes about 5-6 seconds to actually start running the tests, which seems reasonable for start-up etc.

But if I pass a "-p pat" argument, even for a simple concrete pattern, it takes tasty a good minute before it actually runs the matching tests.

Am I correct to assume that the extra time spent is when tasty is trying to figure out exactly what tests to run? If that's the case, perhaps the code for filtering the test cases suffers from some major slow-down, and might be improved?

This is rather unfortunate, because patterns are most useful when you only want to see the results of a few test cases for quick feedback, and that feedback is slowed down immensely by the filtering, almost defeating the purpose.

I'd be happy to lend a helping hand to get this resolved, if needed.

UnkindPartition commented 7 years ago

Ha! I wasn't planning for 30k tests when I was writing tasty. But we should certainly fix it.

Could you create a simple example that demonstrates the problem?

LeventErkok commented 7 years ago

The following seems to demonstrate:

-- a.hs
module Main(main) where

import Test.Tasty
import Test.Tasty.HUnit

main :: IO ()
main = defaultMain tests

tests :: TestTree
tests = testGroup "tests"
           [testGroup ("testGroup" ++ show i)
                      [testCase (show i ++ "_" ++ show j) (j @?= j) | j <- [1 .. 3000]]
           | i <- [1 .. 10]
           ]

I've put the above in a.hs and compiled with ghc -O2 -o a a.hs.

If I simply run the generated executable, it starts running the tests right away:

$ time ./a --hide-successes

All 30000 tests passed (1.92s)
./a --hide-successes  1.47s user 0.55s system 98% cpu 2.052 total

Note that the whole thing ran in just about 2 seconds. But look what happens when I want one individual test case:

$ time ./a -p 9_2567
tests
  testGroup9
    9_2567: OK (0.01s)

All 1 tests passed (4.95s)
./a -p 9_2567  7.50s user 0.17s system 99% cpu 7.704 total

So, if I run the entire test suite I spend about 2 seconds, but if I run just one test I spend 7.5 seconds. (I'm not quite sure what the 4.95s that tasty reports refers to.) And of the 7.5 seconds the executable ran for, the actual test itself took only 0.01 seconds! The rest was spent merely finding which test to run, I guess. This is very counter-intuitive: one could imagine the -p parameter working exactly like the no-pattern case, deciding whether to run each test as it is considered. Surely that should run in time comparable to running all the test cases, with only a minute overhead per test. Something else must be going on in between that causes this delay.

In my actual use case the numbers are more drastic, where the individual test run doesn't start for a good minute. But that one has more nesting of test cases. It seems the more nested the tests are, the more time consuming the search becomes. I guess this is to be expected.

But perhaps we can do better to speed up the test-finding time? Or simply push the "run-this-test" decision to right before each test case is considered, instead of filtering first?
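A rough sketch of that last idea, using a flat list of tests as a stand-in for tasty's actual TestTree (all names here are hypothetical, not tasty's API):

```haskell
import Control.Monad (forM_, when)

-- A flat stand-in for a test tree: each test is its group path
-- plus name, paired with the action that runs it.
type Test = ([String], IO ())

-- Pure helper: which test paths does a predicate select?
selectedNames :: (String -> Bool) -> [Test] -> [String]
selectedNames wanted tests = [unwords p | (p, _) <- tests, wanted (unwords p)]

-- Filter-first: build the list of matching tests in a separate
-- pass, then run them.
runPreFiltered :: (String -> Bool) -> [Test] -> IO ()
runPreFiltered wanted tests =
  forM_ [act | (p, act) <- tests, wanted (unwords p)] id

-- Decide-per-test: make the "run this test?" call right before
-- each test, interleaved with running, with no separate pass.
runDecidingPerTest :: (String -> Bool) -> [Test] -> IO ()
runDecidingPerTest wanted tests =
  forM_ tests $ \(p, act) -> when (wanted (unwords p)) act
```

Both do the same total work in this toy version; the point is only the shape: the decision can be made as each test case is considered, rather than in an up-front filtering phase.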

UnkindPartition commented 7 years ago

Ok, one source of slowdown is definitely this: https://github.com/feuerbach/tasty/blob/master/core/Test/Tasty/Patterns.hs#L119

testPatternMatches is called for every test, so buildTokenRegex and the subsequent compilation from String to an NFA are not shared across different tests, for no good reason.

If you need a quick fix, you can look up how to compile a regex in regex-tdfa and rewrite that function so that it precomputes the regex and shares it across all invocations.
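The sharing being suggested looks roughly like this; a minimal sketch with a cheap substring check standing in for the real regex compilation (compilePattern, anyMatchSlow, anyMatchFast are illustrative names, not tasty's):

```haskell
import Data.List (isInfixOf)

-- Stand-in for the expensive String -> NFA step; in tasty this
-- would be buildTokenRegex plus regex-tdfa's regex compilation.
compilePattern :: String -> (String -> Bool)
compilePattern pat = \testName -> pat `isInfixOf` testName

-- Shape of the current code: the pattern is (conceptually)
-- recompiled inside the predicate, once per test.
anyMatchSlow :: String -> [String] -> Bool
anyMatchSlow pat names = any (\name -> compilePattern pat name) names

-- The quick fix: give the compiled matcher a name outside `any`,
-- so every invocation shares the one compiled value.
anyMatchFast :: String -> [String] -> Bool
anyMatchFast pat names =
  let match = compilePattern pat
  in any match names
```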

However, I've been planning to rewrite the way patterns work anyway (#38), so I'd rather treat this as another reason to start working on that. I can't commit to any deadlines, but with some luck it'll be ready in 1-2 months — and I'll make sure it performs well on 30k tests.

Would that work for you?

PS I'm a big fan of your thesis — it's been only a week since I last used MonadFix :)

LeventErkok commented 7 years ago

Thanks for looking into this! I haven't studied the code, but would simply giving a name to that regexp match solve the problem when you call any? If so, this could be a one-line temporary fix, which I'd very much appreciate if it gets pushed out sooner rather than later.

If it's more complicated than that, then yes: by all means, this can be addressed when you overhaul the patterns.

(That's a cunning use of mfix there, one I hadn't thought of before, though it is reminiscent of other "future label"-like tricks. I quite like it! Thanks for sharing.)

LeventErkok commented 7 years ago

@feuerbach I redid the tests after your patch, and here's the new timing data:

$ time ./a --hide-successes

All 30000 tests passed (1.86s)
./a --hide-successes  1.45s user 0.53s system 99% cpu 1.989 total

$ time ./a -p 9_2567 --hide-successes

All 1 tests passed (1.71s)
./a -p 9_2567 --hide-successes  2.60s user 0.04s system 99% cpu 2.650 total

A most definite improvement in this case, down from 7.7s to 2.6s.

UnkindPartition commented 6 years ago

Closing: the overhead of pattern matching seems to be acceptably low now.