Closed by davidorme 3 months ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 95.17%. Comparing base (971d1a3) to head (1a99181).
:umbrella: View full report in Codecov by Sentry.
The actions are all now passing: https://github.com/ImperialCollegeLondon/pyrealm/actions/runs/8633260571
I think the auto-commit is triggering a new push that is stalling somehow. Those profiling auto-commits probably don't need any CI - we could add `[skip ci]` to the message?
I think we’ve got issues with the benchmarking. I think it is now working as we intended, but three runs of essentially the same code (only the profiling CI workflow is changing) are leading to wildly different relative run times. The two profiling graphs being updated in this commit (https://github.com/ImperialCollegeLondon/pyrealm/pull/208/commits/d6dff3497a1b3400e09b7225ac565be012e6c9b3) show the issue.
It could be that the profiling tests have too small a load - they run really fast - to give consistent behaviour, or it could be that I’ve done some mad randomisation of the sort order. I don’t think that’s the case though - I manually triggered failures in testing by altering the database, and the correct processes failed the benchmarking. My guess is that runner architecture is going to make this process hard to use within CI. My intuition is that we need a single benchmarking machine to run the tests?
Also note that - with the call graph copy in the benchmarking job - the failed `run_benchmarking.py` has clobbered that line in the `run` section, so the call graph is not copied when benchmarking fails. It needs its own step.
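To make that failure mode concrete, here is a minimal sketch of the kind of relative-runtime comparison being described - the file names, column names and tolerance are illustrative assumptions, not the actual `run_benchmarking.py` logic:

```python
"""Illustrative only: flag profiled functions whose cumulative time has
regressed relative to a stored baseline. File layout, columns and the
tolerance are hypothetical, not the PR's real benchmarking interface."""

import sys

import pandas as pd

TOLERANCE = 1.25  # flag a function if it is more than 25% slower than before

# One row per profiled function, with its cumulative time.
baseline = pd.read_csv("profiling/baseline.csv", index_col="function")
current = pd.read_csv("profiling/current.csv", index_col="function")

# Compare only the functions present in both runs.
joined = baseline[["cumtime"]].join(
    current[["cumtime"]], how="inner", lsuffix="_base", rsuffix="_new"
)
joined["ratio"] = joined["cumtime_new"] / joined["cumtime_base"]

regressions = joined[joined["ratio"] > TOLERANCE]
if not regressions.empty:
    print("Benchmark regressions detected:")
    print(regressions.sort_values("ratio", ascending=False))
    sys.exit(1)
```

With a workload this small, ordinary timing noise on shared runners can push a ratio like this past any sensible tolerance, which would explain the wildly different relative run times described above.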
I would suggest that we try going back to a bigger problem size, to make sure we rule out random noise as a major factor in the runtimes.
I agree - I think we can simply tile the current inputs to increase the load (see the sketch below). A couple of other things:

- As discussed on Slack, we might also want to reduce when this is run, i.e., only run on merge to `develop` or `main`.
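A minimal sketch of the tiling idea, assuming the profiling inputs are NumPy arrays (the variable name and shape here are purely illustrative, not the actual pyrealm test inputs):

```python
import numpy as np

# Hypothetical profiling input; in practice this would be loaded from the
# profiling test dataset shipped with the package.
temperature = np.random.default_rng(42).uniform(0, 30, size=(12, 50, 50))

# Tile the same data along the first axis to scale the workload up (here 10x)
# without changing the values being profiled.
REPEATS = 10
temperature_big = np.tile(temperature, (REPEATS, 1, 1))

print(temperature.shape, "->", temperature_big.shape)  # (12, 50, 50) -> (120, 50, 50)
```

Because tiling just repeats the same values, the numerical results are unchanged while the per-call runtimes should grow roughly in proportion to the repeat factor.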
Description
This PR is to resolve the profiling issues described in #207. It got a bit larger than expected, but there were a few interlinked issues to resolve in order to get clean profiling and benchmarking workflows.
- [...] `splash` profiling test.
- [...] the `lfs` step and to run only on one OS/python version.
- Renames `report.py` to a more descriptive `run_benchmarking.py` and revises the command line structure.
- Adds `get_function_map`, which uses `ast` to label profiled processes by their position in the package tree. This is mostly so that repeated method names within a source file can be told apart by their class name and not their line number (which is unstable across versions). (See the sketch after this description.)
- Merges `read_prof_as_dataframe` and `process_report_columns` into `convert_and_filter_prof_file`: these functions would always need to be run together. (Also sketched below.)
- `plot_profiling` and `plot_benchmark` have now been merged into `create_benchmark_plot`, which is a single plot showing relative performance of individual calls across versions.
- Adds a `generate_test_database` function, which is mostly just to keep a handy recipe for test inputs when we need to revise this process.
- Updates `CONTRIBUTING.md`, including the manual and automated profiling workflow.
- There is a follow-up issue to tidy `CONTRIBUTING.md` up and avoid content duplication with the `docs/development` content (ETA: #209).
- Moves the call graph generation (`prof/combined.svg` --> `profiling/call-graph.svg`) out of `report.py` and into a separate CI step - keeps the benchmarking code more focussed.
- The profiling workflow now only runs on `develop` and `main`, so only when PRs get merged. This isn't perfect - a merge might turn out to break profiling - but having the profiling on every commit to a PR is too bulky.

I have a broader concern that this benchmarking is going to be continually throwing up issues that arise from different runner specs, but we'll just have to see.
But - that aside: does this all look sane? Does the new graph make sense? Actually, something is wrong there as the plot from a previous failed run hasn't been replaced by the most recent passing run.
@tztsai - it would be great if you could have a look at this, but I realise you're on another project.
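For anyone reviewing the `get_function_map` change above, here is a minimal sketch of the kind of `ast`-based labelling described - the name is reused purely for illustration, and the real signature and behaviour in this PR may differ:

```python
"""Illustrative only: map the line numbers of definitions in a source file to
qualified names (Class.method), so that methods sharing a name can be told
apart by their class rather than by an unstable line number."""

import ast
from pathlib import Path


def get_function_map(source_path: str) -> dict[int, str]:
    """Map each def/class line number to its dotted, qualified name."""
    tree = ast.parse(Path(source_path).read_text())
    mapping: dict[int, str] = {}

    def visit(node: ast.AST, prefix: str) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(
                child, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
            ):
                qualname = f"{prefix}.{child.name}" if prefix else child.name
                mapping[child.lineno] = qualname
                visit(child, qualname)
            else:
                visit(child, prefix)

    visit(tree, Path(source_path).stem)
    return mapping


# Usage (path purely illustrative):
# get_function_map("pyrealm/pmodel/pmodel.py")
```

Profiled entries from `cProfile` are keyed by `(filename, line number, function name)`, so a map like this lets the benchmarking attach stable qualified names to those keys.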
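Similarly, a hedged sketch of what combining the `.prof` conversion and filtering into a single `convert_and_filter_prof_file` step might look like - the column names and the package filter are assumptions rather than the PR's actual implementation:

```python
"""Illustrative only: load cProfile output into a pandas DataFrame and filter
it down to calls within the package. Columns and the 'pyrealm' filter are
assumptions, not the actual convert_and_filter_prof_file implementation."""

import pstats

import pandas as pd


def convert_and_filter_prof_file(prof_path: str, package: str = "pyrealm") -> pd.DataFrame:
    stats = pstats.Stats(prof_path)

    rows = []
    # stats.stats maps (filename, lineno, funcname) -> (cc, nc, tt, ct, callers)
    for (filename, lineno, funcname), (cc, nc, tt, ct, _callers) in stats.stats.items():
        rows.append(
            {
                "filename": filename,
                "lineno": lineno,
                "function": funcname,
                "ncalls": nc,
                "tottime": tt,
                "cumtime": ct,
            }
        )

    df = pd.DataFrame(rows)
    # Keep only calls from within the package source tree.
    return df[df["filename"].str.contains(package)]
```

Merging the two steps makes sense because the raw `pstats` table is rarely useful before the non-package rows are stripped out, which matches the point above that the two functions would always be run together.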
Type of change
Key checklist
- `pre-commit` checks: `$ pre-commit run -a`
- `$ poetry run pytest`
Further checks