MarionBWeinzierl opened 4 months ago
Possible solution: Use git-worktree to handle running the profiling code on multiple commits for comparison.
Is this because we don't want the profiling code and data to change when moving between commits? We could move the profiler functionality into its own repository to isolate it? Then we would 'just' need to give it a path to the pyrealm repo and the two commit hashes to compare?
Git worktree actually works quite nicely for this, so I am happy with that.
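To make the worktree idea concrete, here is a minimal sketch of driving it from Python. The profiling invocation and the output layout are assumptions based on the command discussed later in this thread, not pyrealm's actual tooling:

```python
"""Sketch: profile one commit in a throwaway git worktree.

Assumes a local clone of the repo; the profiling command mirrors the
manual `poetry run pytest -m profiling` invocation mentioned below.
"""
import subprocess
from pathlib import Path

# Placeholder profiling command; adapt to the project's actual entry point.
PROFILE_CMD = ["poetry", "run", "pytest", "-m", "profiling", "--profile-svg"]


def worktree_cmds(repo: Path, workdir: Path, commit: str) -> tuple[list[str], list[str]]:
    """Build the git commands that create and later remove a detached worktree."""
    add = ["git", "-C", str(repo), "worktree", "add", "--detach", str(workdir), commit]
    remove = ["git", "-C", str(repo), "worktree", "remove", "--force", str(workdir)]
    return add, remove


def profile_commit(repo: Path, commit: str, out: Path) -> None:
    """Profile `commit` in its own worktree, leaving the main checkout untouched."""
    workdir = out / f"worktree-{commit[:8]}"
    add, remove = worktree_cmds(repo, workdir, commit)
    subprocess.run(add, check=True)
    try:
        subprocess.run(PROFILE_CMD, cwd=workdir, check=True)
    finally:
        # Clean up the worktree whether or not profiling succeeded.
        subprocess.run(remove, check=True)
```

Calling `profile_commit` twice, once per commit hash, would give two independent profiling runs to compare, without ever touching the primary checkout.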
The only issue I am having at the moment is that, while the signature of PModel changed a while ago, the profiling code was not adapted at the same time. Therefore, the profiling falls over for commits from that period. I am thinking about how to tackle this elegantly. Changing the profiling code in place, or copying over/using the new (corrected) version, both seem clumsy, but I struggle to think of something better at the moment.
Ack - that is ugly. I think I'm happy to accept that the new profiling only works from "now" but I don't know how we define now in the implementation.
More broadly, do we need a mechanism to check that the profiling code continues to run as the code base changes? I mean, we have one, in that the profiling tests are run by pytest, but we don't want them run on every check, and we presumably don't want them run in the GitHub Actions...
I guess what we could do is set those tests up in pytest with a minimal data input that will always run and validate that the profiling code works, but do it fast, and then have the manual profiling command somehow switch in a different data input that provides a suitably heavy test load. Maybe by setting environment variables with paths to large datasets for full tests, and having the profiling tests check for those before falling back to the light load?
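The environment-variable fallback could look something like the sketch below. The variable name `PYREALM_PROFILE_DATA` is hypothetical, purely to illustrate the idea:

```python
"""Sketch: pick a heavy profiling dataset from an environment variable,
falling back to a small default that always ships with the test suite.
`PYREALM_PROFILE_DATA` is a hypothetical variable name."""
import os
from pathlib import Path


def profiling_data_path(default: Path) -> Path:
    """Return the heavy dataset path if the env var points at an existing
    file, otherwise the light default used for fast validation runs."""
    heavy = os.environ.get("PYREALM_PROFILE_DATA")
    if heavy and Path(heavy).exists():
        return Path(heavy)
    return default
```

A pytest fixture wrapping this function would let the same profiling tests serve both roles: a quick smoke check in CI and a heavy manual run when the variable is exported.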
I might be overcomplicating this, but I wonder whether ReFrame, which is quite popular for HPC application and system tests, would be another option. But maybe that's something to consider as a potential next step. And maybe it's too powerful for this.
I created #345 to fix the signature issue, and implement a GH Action to prevent something like that from happening again for the profiling code.
After #242, it was decided to disable the automatic profiling in the GH Actions. Manual profiling can still be done by running

poetry run /usr/bin/time -v pytest -m "profiling" --profile-svg

What we want now is the ability to compare the results of two profiling runs on different commits, in order to check whether performance has deteriorated (or improved) through code changes.
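As a starting point for that comparison, the stdlib `pstats` module can read the cProfile stats files that pytest-profiling writes (under `./prof/` by default). A minimal sketch, where the file names and the 10% threshold are illustrative assumptions:

```python
"""Sketch: flag a performance regression between two cProfile stats files,
such as those written by pytest-profiling. The tolerance is arbitrary."""
import pstats


def total_time(stats_file: str) -> float:
    """Total runtime recorded in a cProfile stats file."""
    return pstats.Stats(stats_file).total_tt


def regression(before: str, after: str, tolerance: float = 0.1) -> bool:
    """True if the newer run is more than `tolerance` (default 10%) slower."""
    return total_time(after) > total_time(before) * (1 + tolerance)
```

Comparing only total runtime is crude; a fuller version might diff per-function cumulative times from `pstats.Stats.stats` to show where a slowdown comes from.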