Perl-Toolchain-Gang / Module-Build

Perl module to configure and build modules (what backs most Build.PLs)
https://metacpan.org/release/Module-Build
32 stars 36 forks source link

Module::Build tests take too long #13

Open miyagawa opened 11 years ago

miyagawa commented 11 years ago

When you install/upgrade Module::Build when it's a configure/build dependency for some modules, Module::Build tests usually take a lot of time. I tested it right now with the relatively new computer and it took 1 minute 40 seconds.

This is why I force notest flag in cpanm when it bootstraps Module::Build to the minimum version, so that the user's first experience feels a bit smoother.

i understand it is extremely important to test Module::Build and prevent a broken version from being installed, but when the tests take this long, it would lead people to run with --notest and that's a worse situation. I wonder if some chunk, or probably many, of the tests can be moved to release tests and automated testing (i.e. CPAN testers) only.

Leont commented 11 years ago

Skipping may be an option, but in most cases making the tests faster would be preferable IMO.

prove --timer -br t [13:10:06] t/00-compile.t ................. ok 4854 ms [13:10:11] t/PL_files.t ................... ok 1874 ms [13:10:12] t/actions/installdeps.t ........ ok 1379 ms [13:10:14] t/actions/manifest_skip.t ...... ok 1561 ms [13:10:15] t/add_property.t ............... ok 327 ms [13:10:16] t/basic.t ...................... ok 2362 ms [13:10:18] t/bundle_inc.t ................. skipped: $ENV{MB_TEST_EXPERIMENTAL} is not set [13:10:19] t/compat.t ..................... ok 33146 ms [13:10:52] t/compat/exit.t ................ ok 1154 ms [13:10:53] t/debug.t ...................... ok 1078 ms [13:10:54] t/destinations.t ............... ok 845 ms [13:10:55] t/ext.t ........................ ok 1044 ms [13:10:56] t/extend.t ..................... ok 883 ms [13:10:57] t/files.t ...................... ok 752 ms [13:10:58] t/help.t ....................... ok 460 ms [13:10:58] t/install.t .................... ok 61861 ms [13:12:00] t/install_extra_target.t ....... ok 1773 ms [13:12:02] t/manifypods.t ................. ok 11088 ms [13:12:13] t/metadata.t ................... ok 11793 ms [13:12:25] t/metadata2.t .................. ok 2012 ms [13:12:27] t/mymeta.t ..................... ok 3710 ms [13:12:31] t/new_from_context.t ........... ok 679 ms [13:12:31] t/notes.t ...................... ok 832 ms [13:12:32] t/par.t ........................ ok 8103 ms [13:12:40] t/parents.t .................... ok 404 ms [13:12:41] t/perl_mb_opt.t ................ ok 2311 ms [13:12:43] t/pod_parser.t ................. ok 140 ms [13:12:43] t/ppm.t ........................ ok 17194 ms [13:13:00] t/properties/dist_suffix.t ..... ok 764 ms [13:13:01] t/properties/license.t ......... ok 1146 ms [13:13:02] t/properties/module_name.t ..... ok 1099 ms [13:13:03] t/properties/needs_compiler.t .. ok 3713 ms [13:13:07] t/properties/release_status.t .. ok 4408 ms [13:13:12] t/properties/requires.t ........ ok 1055 ms [13:13:13] t/properties/share_dir.t ....... ok 3798 ms [13:13:17] t/resume.t ..................... ok 754 ms [13:13:17] t/runthrough.t ................. ok 2857 ms [13:13:20] t/sample.t ..................... ok 690 ms [13:13:21] t/script_dist.t ................ ok 808 ms [13:13:22] t/signature.t .................. ok 1253 ms [13:13:23] t/test_file_exts.t ............. ok 486 ms [13:13:24] t/test_type.t .................. ok 599 ms [13:13:24] t/test_types.t ................. ok 1140 ms [13:13:25] t/tilde.t ...................... ok 5317 ms [13:13:31] t/use_tap_harness.t ............ ok 2072 ms [13:13:33] t/versions.t ................... ok 269 ms [13:13:33] t/write_default_maniskip.t ..... ok 318 ms [13:13:33] t/xs.t ......................... ok 6322 ms [13:13:40] All tests successful. Files=48, Tests=1158, 215 wallclock secs ( 0.63 usr 0.15 sys + 109.91 cusr 17.15 csys = 127.84 CPU) Result: PASS

Some of those tests really do take far longer than they should. Specially install.t is rather worrysome. It seems all slow tests use Distgen (t/lib/Distgen.pm) to do Build.PL cycles, I ran NYTProf on t/tilde.t and it turns out to spend 90% of its time in do_system and _backticks.

We could take a good look at what tests really need to DistGen route and which ones don't. I suspect there's a lot to win there.

kenahoo commented 11 years ago

Leon, I completely agree. If the only way to test many things is through a full Distgen cycle, that also probably indicates a lack of a good enough API.

Leont commented 11 years ago

Turns out that in t/install.t, the slowest test we have, Module::Build spends about 75% of its time in Pod::Simple. I guess something could be improved there too.

kentfredric commented 11 years ago

You can get a sizeable speedup installing most of cpan ( mb included ) via exporting HARNESS_OPTIONS=j10

You can't really do it for all modules, as quite a few have race conditions, or order-implicit behaviour. The most common flaw I hit was two tests creating and deleting the same paths, instead of using a File::Temp, which caused confusing failures.

But it most definitely shaves a few seconds of MB install/test.

https://gist.github.com/kentfredric/5646436

Odd points though: For some reason install.t takes only 3000ms, and I can't explain why its running so fast. I know the first time I ran my tests, it took a full 50+ seconds, but now its taking substantially less, as if its caching somewhere, but I haven't worked out /where/ its getting cached, and I've used a clean scratch directory each time.

miyagawa commented 11 years ago

@kentfredric i know I can speed up using -j, or even just skip tests to make it fastest, but the reason I brought up the issue here is to make it fast for everyone, so that the first-time experience for CPAN installation gets smooth, instead of waiting for 3 minutes to start actually building any module that uses MB.

kenahoo commented 11 years ago

The most common flaw I hit was two tests creating and deleting the same paths, instead of using a File::Temp, which caused confusing failures.

Eek, we should fix that.

ewilhelm commented 11 years ago

from Ken Williams on Friday 24 May 2013:

The most common flaw I hit was two tests creating and deleting the same paths, instead of using a File::Temp, which caused confusing failures.

Eek, we should fix that.

At one point, I went through and added $$ to all of the temp dirs to prevent any parallel tests from conflicting like this. If that didn't stick, I'm not sure whether a better abstraction would help or if it just requires more frequent checking to keep it parallel-clean (travis ci with -j999 ... ?)

--Eric


http://scratchcomputing.com

kenahoo commented 11 years ago

More frequent checking would be good, but I think we should use File::Temp instead of $$-insertion.

On Fri, May 24, 2013 at 6:01 PM, Eric Wilhelm notifications@github.comwrote:

from Ken Williams on Friday 24 May 2013:

The most common flaw I hit was two tests creating and deleting the same paths, instead of using a File::Temp, which caused confusing failures.

Eek, we should fix that.

At one point, I went through and added $$ to all of the temp dirs to prevent any parallel tests from conflicting like this. If that didn't stick, I'm not sure whether a better abstraction would help or if it just requires more frequent checking to keep it parallel-clean (travis ci with -j999 ... ?)

--Eric


http://scratchcomputing.com

— Reply to this email directly or view it on GitHubhttps://github.com/Perl-Toolchain-Gang/Module-Build/issues/13#issuecomment-18434618 .

Leont commented 11 years ago

But it most definitely shaves a few seconds of MB install/test.

https://gist.github.com/kentfredric/5646436

The fact that even a -j10 only increases speed by 20% strongly suggests that the tests are IO bound (though the FS caching thing is begging for a retry).

Odd points though: For some reason install.t takes only 3000ms, and I can't explain why its running so fast. I know the first time I ran my tests, it took a full 50+ seconds, but now its taking substantially less, as if its caching somewhere, but I haven't worked out /where/ its getting cached, and I've used a clean scratch directory each time.

In fact, the entire delay seems to be in the very first substantive test of the file. The rest of the tests always run "fast". A profiling confirms it spends almost all of that time surveying all pod you've got installed on the system.

Leont commented 11 years ago

More frequent checking would be good, but I think we should use File::Temp instead of $$-insertion.

Definitely

kenahoo commented 11 years ago

In fact, the entire delay seems to be in the very first substantive test of the file. The rest of the tests always run "fast". A profiling confirms it spends almost all of that time surveying all pod you've got installed on the system.

Yeah. Apparently Pod::HTML wants to find all pod files so it can interpret hyperlinks or something. There's no reason for us to force HTML to be built though, we don't even check that the HTML files succeed. I'm just going to get rid of the 4 config lines that force HTML to be built. It avoids the big scan then, at least for people that don't have Config.pm entries for HTML building.

kenahoo commented 11 years ago

Fixed in c1bbdeb82752ba718a97d647c57a35711cb262bf. t/install.t went from around 40 seconds to around 10.

Leont commented 11 years ago

Fixed in c1bbdeb. t/install.t went from around 40 seconds to around 10.

That just moves the slowness from t/install.t to t/ppm.t (or probably t/par.t). We should give those the same treatment.

Also, I don't think that this ticket should be closed yet, the tests are still far too slow. Probably some stuff should be converted away from DistGen where applicable.

kentfredric commented 11 years ago

@kenahoo my observation about shared-pathing was relative to "all of cpan", not so much Module::Build, but I was using that argument as for why an end user can't be told to set HARNESS_OPTIONS to get a general purpose speed boost when installing arbitrary modules.

I didn't notice any race conditions myself when testing M::B, and I do report any I happen to find, but as a whole, there are still too many things that break with it all-over CPAN for it to be a recommended or default option.

kenahoo commented 11 years ago

@Leont perhaps I have poor disk caching on my laptop, but for me this doesn't just move slowness, it actually eliminates slowness. I'm hesitant to remove things from t/ppm.t because it actually does test the HTML functionality. t/install.t and t/manifypods.t were just calling it frivolously without verifying anything.

I agree that the tests in general are still too slow by an order of magnitude & need work. That doesn't translate well to a ticket though, maybe we can make some smaller goal-oriented tickets.

kentfredric commented 11 years ago

I re-did my parallel vs sequential test run, but this time with a forced system cache drop;

https://gist.github.com/kentfredric/5660897

With a clean VM cache, real wall clock time is twice as long with the sequential run.

@kenahoo so make sure you're doing those tests with a clean VM cache,

sudo sync ; echo 3 | sudo tee /proc/sys/vm/drop_caches
kenahoo commented 11 years ago

@kentfredric Can you do that again with the git master?

Leont commented 11 years ago

I re-did my parallel vs sequential test run, but this time with a forced system cache drop; https://gist.github.com/kentfredric/5660897 With a clean VM cache, real wall clock time is twice as long with the sequential run.

And judging by the numbers, that was on a dual-core machine. Looks excellent :-)

Leont commented 11 years ago

@Leont perhaps I have poor disk caching on my laptop, but for me this doesn't just move slowness, it actually eliminates slowness.

I just did a profiling, it most certainly still does the pod scan.

I'm hesitant to remove things from t/ppm.t because it actually does test the HTML functionality. t/install.t and t/manifypods.t were just calling it frivolously without verifying anything.

I see. I'm not sure it makes sense to do that on a perl that wouldn't do HTML, but then again this may be one of those tests that would be well off not to run on most end-users installs in the first place, like @miyagawa suggested in the OP.

kentfredric commented 11 years ago

Just did the test run again with a built copy of master, no appreciable speed difference for either the sequential or parallel passes.

https://gist.github.com/kentfredric/5664662

As @leont suggested, the speedup in install.t is made up for by a slowdown in ppm.t

though it seems par.t is unaffected, as you'll see in the parallel run, par.t finishes long before ppm.t does.

kenahoo commented 11 years ago

@Leont What I had in mind was to find a way to disable the full-disk pod scan in Pod::HTML, which would let us have our cake (faster tests) & eat it too (keep testing HTML doc building).

kentfredric commented 11 years ago

Indeed, I feel like Pod::HTML doing a full disk scan is a bug in itself, and is probably just waiting for a security flaw to occur as a result. I mean, if perldoc could have security issues just reading the pod for one file, by extension, loading all the files in your perl install is just asking for problems.

Leont commented 11 years ago

Indeed, I feel like Pod::HTML doing a full disk scan is a bug in itself, and is probably just waiting for a security flaw to occur as a result.

It's doing a "laborous" scan, I suspect that turning that off would already help a lot. It's not quite clear to me if anything would break on doing so though.

kenahoo commented 11 years ago

I'm sure we could at least turn it off during the tests. I don't see any stuff in the tests that checks to see whether hyperlinks actually worked.

On Tue, May 28, 2013 at 7:55 PM, Leon Timmermans notifications@github.comwrote:

Indeed, I feel like Pod::HTML doing a full disk scan is a bug in itself, and is probably just waiting for a security flaw to occur as a result.

It's doing a so called laborous scan, I suspect that turning that off would already help a lot. It's not quite clear to me if anything would break on doing so though.

— Reply to this email directly or view it on GitHubhttps://github.com/Perl-Toolchain-Gang/Module-Build/issues/13#issuecomment-18589845 .

kentfredric commented 11 years ago

I was wondering why my reading of the docs for Pod::HTML were a bit lacklustre, ... I should have been reading Pod::Html. FML

miyagawa commented 9 years ago

Sorry if this sounds a bit rude, but this is becoming a lot bigger issue now that Module::Build is deprecated. I've heard from several people at QAH that cpanm's initial installation takes so much time on 5.20 or later, because cpanm pulls the CPAN version whenever they encounter a dist that uses Build.PL.

In the past I added a bootstrap mechanism to install Module::Build with notest flag, and I consider doing this again by forcing notest if the installation target is Module::Build.

Please make the test faster on an end user environment by turning off unnecessary tests.

Leont commented 9 years ago

Two observations:

t/compat.t (which tests Module::Build::Compat) is single-handedly responsible for almost a third of the total runtime. In my opinion his is not in proportional to the importance of the functionality. Likewise a number of other slow tests have such issues on a smaller scale (e.g. ppm.t)

Secondly:

It seems all slow tests use Distgen (t/lib/Distgen.pm) to do Build.PL cycles

is still true. There is no reason to go through a full build cycle to test for each candidate tilde expansion, that is madness. A lot of tests could be rewritten to be significantly faster. We are currently writing unit tests as integration tests (slow), which is really not helping us.

davidbuzz commented 8 years ago

Files=53, Tests=1129, 3519 wallclock secs ( 8.06 usr 1.72 sys + 1485.43 cusr 235.95 csys = 1731.16 CPU) not good enough.

Leont commented 8 years ago

not good enough.

Patches most welcome!