mrkkrp / zip

Efficient library for manipulating zip archives

Speed up the test suite #55

Closed nh2 closed 4 years ago

nh2 commented 5 years ago

Hey,

the test suite is commendable, but it also takes very long (44 minutes) on my Xeon E31245 server when building it as one of many packages in static-haskell-nix.

Is it possible that some QuickCheck Gens generate unfavourably large input?

Running it manually, I noticed that for every test section, the QuickCheck counter climbs to around 50/100 almost immediately, but beyond that it advances very slowly.

For example, at

forEntries
75/100

it got stuck at 100% CPU for half a minute.

No syscalls appear to be made when that happens; it seems to be pure CPU time. I am not sure whether that is just time spent in the actual compression algorithms, or something else in Haskell.

Do the tests also take extremely long for you?

forEntries, undoAll and packDirRecur seem to be some of the extra brutal ones.

Is there something we could do to improve the duration of the test suite?

Thanks!

mrkkrp commented 5 years ago

OK, let's see here.

It seems to take 5:21 for me on my Dell XPS 13 9370:

zip-1.2.0: Test suite tests passed
Completed 2 action(s). 
317.12user 4.08system 5:21.19elapsed 100%CPU (0avgtext+0avgdata 539132maxresident)k
20280inputs+120024outputs (0major+991726minor)pagefaults 0swaps

What about CI? Well it looks like it takes 12-13 minutes total (i.e. including setting up the environment and building stuff) there:

https://travis-ci.org/mrkkrp/zip

One feature of the test suite is that it manipulates files a lot. So I guess if those operations are slow, the whole test suite will be slow. But this is just a guess; you seem to say that it isn't the case here.

> Is there something we could do to improve the duration of the test suite?

The easiest: we could decrease the size parameter of QuickCheck tests, but it looks like the test suite is super slow only for you. Both locally and on CI I find the execution times reasonable.
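
For reference, a minimal sketch of what decreasing the size parameter could look like. It assumes plain QuickCheck; `prop_reverseTwice` is a hypothetical stand-in property, not one from this test suite, and the cap of 30 is an arbitrary choice:

```haskell
import Test.QuickCheck

-- Hypothetical property standing in for one of the slow tests.
prop_reverseTwice :: [Int] -> Bool
prop_reverseTwice xs = reverse (reverse xs) == xs

main :: IO ()
main = do
  -- Default arguments: the size parameter may grow up to maxSize = 100.
  quickCheck prop_reverseTwice
  -- Same property, but with the size parameter capped at 30 (arbitrary value).
  quickCheckWith stdArgs { maxSize = 30 } prop_reverseTwice
```

By default QuickCheck grows the size as more tests pass (up to `maxSize`), which would be consistent with the first half of each run being quick and the rest slow. In an hspec-based suite, `modifyMaxSize` from `Test.Hspec.QuickCheck` should allow the same cap per spec.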

mrkkrp commented 5 years ago

@nh2 What is the status of this issue? Are my arguments compelling? Do you still want a change? Should we close it?

nh2 commented 5 years ago

@mrkkrp Hmm, odd that it's so much faster for you. The server I tried it on doesn't have an SSD, but I would imagine most of this stuff should happen in the buffer cache (thus, in RAM). So it is quite odd to me that my dedicated Xeon is ~4x slower on the test suite than the TravisCI VM.

My main question is: It seems most of the test suite's time is spent on a few pathologically-large inputs from QuickCheck. Do those add a lot of value, or would it make sense to adjust size (or write some Gens that generate inputs in reasonable ranges that should be tested) to avoid those, given that they seem to mainly "exercise more CPU" on the same code paths?

Eyeballing the output as it scrolls by, it seems like the test suite can be made >20x faster if the pathologically-large inputs are avoided, which may be convenient even for your much faster 5 minutes test runs.
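
To illustrate the "Gens that generate inputs in reasonable ranges" idea, here is a rough sketch. `Payload` and `maxPayloadLen` are hypothetical names made up for this example (they are not types from the zip test suite), and the bound of 256 is an arbitrary assumption:

```haskell
import Data.Word (Word8)
import Test.QuickCheck

-- Hypothetical stand-in for the contents of one archive entry.
newtype Payload = Payload [Word8]
  deriving (Show, Eq)

-- Assumed upper bound on generated payload length (arbitrary choice).
maxPayloadLen :: Int
maxPayloadLen = 256

instance Arbitrary Payload where
  -- Pick the length from a fixed range instead of following the size parameter.
  arbitrary = do
    n <- choose (0, maxPayloadLen)
    Payload <$> vectorOf n arbitrary
  -- List shrinking never produces a longer list, so the bound is preserved.
  shrink (Payload ws) = Payload <$> shrink ws

-- Sanity check that the generator respects the bound.
prop_bounded :: Payload -> Bool
prop_bounded (Payload ws) = length ws <= maxPayloadLen

main :: IO ()
main = quickCheck prop_bounded
```

With a generator like this, the input length no longer depends on how far the test counter has advanced, so late test cases should cost about as much as early ones.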

mrkkrp commented 5 years ago

> Do those add a lot of value, or would it make sense to adjust size (or write some Gens that generate inputs in reasonable ranges that should be tested) to avoid those, given that they seem to mainly "exercise more CPU" on the same code paths?

I do not know :)

> Eyeballing the output as it scrolls by, it seems like the test suite can be made >20x faster if the pathologically-large inputs are avoided, which may be convenient even for your much faster 5 minutes test runs.

This sounds good. Right now digging into this is not the most efficient way to spend my free time; however, I may get to it eventually.

I'd be glad to accept a PR that makes the test suite faster.

mrkkrp commented 4 years ago

@nh2 The test suite should pass much faster now. I released 1.3.1 on Hackage.

nh2 commented 4 years ago

@mrkkrp That's great, thanks!