There are some issues with the profiling system that it would be good to fix up for 1.0.0.
I think the list below is going to grow - we might have some revealed bugs to work on.
- [x] The `graphviz` requirement for profiling is currently in the main package requirements (`[tool.poetry.dependencies]`), and this is causing package installation to break. Trying to install the test release (`pip install -i https://test.pypi.org/simple/ pyrealm==1.0.1a3`) fails with: `ERROR: Could not find a version that satisfies the requirement graphviz (from pyrealm) (from versions: none)`, and we really don't want `graphviz` as a package requirement anyway.
- [x] That step ([pyrealm_ci.yaml#L60](https://github.com/ImperialCollegeLondon/pyrealm/blob/971d1a3c492df3fcaa84279ed7a2f74275e6169e/.github/workflows/pyrealm_ci.yaml#L60)) runs on every matrix combination in the `test` job. There's value in looking for performance hits in different Python versions, but we'd have to extend the profiling database to include matrix details. That seems like overkill for now.
- [x] The profiling output for a run is tagged by date and GitHub event name (typically `push`), but it would be more useful to tag by commit SHA.
- [ ] I'm still not sure when this should run - it currently runs on a lot of events (`push`, etc.), which seems like overkill. We may want to break this out into a separate job that only runs on merges into `develop` or `main`? ETA: I'm not sure this is avoidable - every push to a PR results in a commit that could be merged to `main` or `develop`, so it needs to pass this check. The alternative is probably `cron` jobs to run regular checks on the `develop` branch.
- [x] The current code filters the profiling results using the text `pyrealm`, but this runs into issues when running locally, particularly with local `poetry` virtual environments, which include `pyrealm` in the path to the site packages. The filtering needs to be more controlled.
- [x] There isn't a clear mechanism for handling benchmarking failures that we want to ignore - simply overriding the test on GitHub doesn't leave much of a trace.
- [x] The process information used for profiling (file, line number, function name) is not portable between commits for benchmarking: method names are not qualified by their class, function names can be duplicated within a source file, and the line numbers of function definitions are not stable between commits.
- [x] The workflows for manual benchmarking and CI benchmarking need to be revised and documented, with a longer explanation of the process and clear models of how to use the profiling.
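For the `graphviz` dependency item, one option (a sketch only, assuming a Poetry 1.2+ dependency-group layout; the group name `profiling` is hypothetical) is to move it out of `[tool.poetry.dependencies]` into an optional group, so it is only installed on request with `poetry install --with profiling`:

```toml
# Hypothetical pyproject.toml fragment: graphviz moves out of the main
# runtime dependencies into an optional "profiling" group, so pip users
# installing the package never see it as a requirement.
[tool.poetry.group.profiling]
optional = true

[tool.poetry.group.profiling.dependencies]
graphviz = "*"
```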
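For the commit-SHA tagging item, a hedged sketch of the workflow change (the step name and artefact path are hypothetical; the point is only the `${{ github.sha }}` context, which is available on every event type):

```yaml
# Hypothetical workflow fragment: the artefact name carries the commit
# SHA instead of the run date and event name.
- name: Upload profiling results
  uses: actions/upload-artifact@v4
  with:
    name: profiling-results-${{ github.sha }}
    path: profiling/
```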
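On the path-filtering item, a minimal sketch of a more controlled filter (the helper names are mine, and the real fix may differ): rather than matching the text `pyrealm` anywhere in a path, resolve the package's actual source directory and keep only profiled frames whose files sit inside it. This avoids false positives from `poetry` virtualenv paths that merely contain `pyrealm` as a directory name.

```python
from pathlib import Path
import importlib.util


def package_source_dir(package: str) -> Path:
    """Resolve the on-disk source directory of an installed package."""
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        raise ImportError(f"Cannot locate package {package!r}")
    return Path(spec.origin).resolve().parent


def is_package_frame(filename: str, pkg_dir: Path) -> bool:
    """True only if the profiled frame's file is inside the package
    source tree, not merely somewhere with the package name in the path."""
    try:
        Path(filename).resolve().relative_to(pkg_dir)
        return True
    except ValueError:
        return False
```

The same check works identically in CI and in a local virtual environment, because it compares resolved paths rather than substrings (here demonstrated in the test with the stdlib `json` package as a stand-in).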
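On the portable-identifier item, a sketch of one possible scheme (the class and function names are illustrative): instead of the profiler's `(file, lineno, name)` triple, key the benchmarking database on the module path plus `__qualname__`, which includes the defining class and does not change when definitions move around in a file.

```python
class Simulation:
    def run(self):
        return "method"


def run():
    return "module-level"


def stable_id(func) -> str:
    """Return a module-qualified name such as 'pkg.mod.Simulation.run'.

    Unlike a (file, lineno, name) triple, this is stable across commits
    that only reflow a source file, and it distinguishes a method from a
    same-named module-level function.
    """
    return f"{func.__module__}.{func.__qualname__}"
```

With this scheme `Simulation.run` and the module-level `run` no longer collide, and moving a definition up or down the file does not change its identity.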