ROCm / Tensile

Stretching GPU performance for GEMMs and tensor contractions.
MIT License

Reduce extended test time #1882

Closed AlexBrownAMD closed 8 months ago

AlexBrownAMD commented 8 months ago

This change refactors a number of extended tests, with the main purpose of reducing the overall runtime of the test suite. Some test files had their number of kernels or problem sizes reduced. The aim is to keep testing all the main features listed in each test description while cutting repeated or near-identical cases.

Some tests were also refactored to fix problems found along the way. A few test files were not testing the particular cases described in their title or comments. Some test groups generated no kernel solutions at all, or none relevant to the test scenario. Others performed extra steps whose results are not validated in this process, such as generating unused library logic files. These issues are resolved wherever they were found during this investigation.

I also tried to reduce the total number of combinations generated by test files. Combinations slow down the test suite in a few ways. First, combinations take time to permute: some tests were spending significant time permuting solutions, the majority of which were rejected in SolutionStructs. Second, a large number of solution kernels takes significant time to compile (particularly source kernels). Here I tried to remove cases that were too similar, or to spread the testing of various features across test suites. Finally, some tests generated kernels that were then skipped for the given sizes due to incompatible asserts.
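To illustrate why trimming parameter lists helps, here is a minimal sketch of how a full permutation of fork-style parameters grows and how many candidates a validity check can throw away. The parameter names only resemble Tensile-style options, the value lists are made up, and `is_valid` is a hypothetical stand-in rule, not the actual rejection logic in SolutionStructs.

```python
from itertools import product
from math import prod


def count_combinations(fork_params):
    """Size of the full Cartesian product of all parameter value lists."""
    return prod(len(values) for values in fork_params.values())


def permute(fork_params):
    """Yield one candidate solution (a dict) per combination of values."""
    names = list(fork_params)
    for values in product(*(fork_params[name] for name in names)):
        yield dict(zip(names, values))


def is_valid(solution):
    """Hypothetical rejection rule, for illustration only."""
    divisible = solution["ThreadTile"] % solution["VectorWidth"] == 0
    tile_fits = solution["ThreadTile"] * solution["WorkGroup"] <= 64
    return divisible and tile_fits


# Illustrative (assumed) parameter lists before and after trimming.
full_params = {
    "ThreadTile": [2, 4, 8],
    "WorkGroup": [8, 16],
    "DepthU": [8, 16, 32],
    "VectorWidth": [1, 2, 4],
    "PrefetchGlobalRead": [False, True],
}
trimmed_params = {
    "ThreadTile": [4],
    "WorkGroup": [8, 16],
    "DepthU": [16],
    "VectorWidth": [4],
    "PrefetchGlobalRead": [False, True],
}

for label, params in (("full", full_params), ("trimmed", trimmed_params)):
    total = count_combinations(params)
    kept = sum(1 for solution in permute(params) if is_valid(solution))
    print(f"{label}: {total} permuted, {kept} kept, {total - kept} rejected")
```

Every rejected candidate still costs permutation time, and every kept candidate costs compile time, so shortening even one list shrinks both costs multiplicatively.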

Based on local runs, this PR saves an estimated 2.25 hours of runtime on the extended test suite.