The deepcopy function is used throughout Tensile and is one of its primary bottlenecks. This PR removes the deep copies of Solutions, namely those that occur in TensileCreateLibrary. This reduces the turnaround time of TensileCreateLibrary by up to 40%, depending on the input.
Ideally, we would have made the Solution class immutable as part of these changes (which we attempted), but Solution objects are modified at runtime in ways that would require a major refactor to eliminate.
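For reviewers unfamiliar with the trade-off, here is a minimal sketch of why dropping deepcopy is faster but depends on nobody mutating the shared objects. The `make_solution` helper and the dictionary layout are illustrative stand-ins, not the actual Solution class or the code changed in this PR.

```python
import copy

# Hypothetical stand-in for a Solution: a nested dict of parameters.
def make_solution(index):
    return {
        "SolutionIndex": index,
        "ProblemType": {"TransposeA": False, "TransposeB": True},
        "MacroTile": [64, 64],
    }

def collect_with_deepcopy(solutions):
    # Recursively duplicates every nested object of every element.
    return [copy.deepcopy(s) for s in solutions]

def collect_by_reference(solutions):
    # Only the outer list is new; the elements themselves are shared.
    return list(solutions)

solutions = [make_solution(i) for i in range(3)]
copied = collect_with_deepcopy(solutions)
shared = collect_by_reference(solutions)

# Sharing is only safe if the objects are not mutated afterwards.
solutions[0]["MacroTile"][0] = 128
assert copied[0]["MacroTile"][0] == 64    # the deep copy is isolated
assert shared[0]["MacroTile"][0] == 128   # the shared reference sees the change
```

This is the reason the A/B comparisons listed below matter: without immutability, correctness rests on the remaining mutations not leaking between users of the same object.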
To ensure that the program still behaves correctly, we ran the following tests:
Standard CI tests
A/B comparison to devel of manifest generation using TensileCreateLibrary
A/B comparison to devel of manifest generation using rocBLAS install.sh
rocBLAS-tests on the feature branch, which ran without failure
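The A/B manifest comparison can be approximated with a small script like the one below. It is only a sketch: it assumes each manifest is plain text with one file path per line, and the file names in the example are made up for illustration.

```python
def read_manifest(text):
    # Assumption: one entry per line; blank lines are ignored.
    return {line.strip() for line in text.splitlines() if line.strip()}

def compare_manifests(baseline_text, candidate_text):
    # Returns (entries missing from candidate, entries only in candidate).
    baseline = read_manifest(baseline_text)
    candidate = read_manifest(candidate_text)
    return sorted(baseline - candidate), sorted(candidate - baseline)

# Hypothetical manifests from devel and from the feature branch.
devel = "library/kernels_a.co\nlibrary/kernels_b.co\n"
feature = "library/kernels_b.co\nlibrary/kernels_a.co\n"

missing, extra = compare_manifests(devel, feature)
assert missing == [] and extra == []  # same entries, order ignored
```

Comparing as sets ignores ordering differences, so only entries that are actually added or dropped between the two branches are reported.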