UCL / TLOmodel

Epidemiology modelling framework for the Thanzi la Onse project
https://www.tlomodel.org/
MIT License

Tracking profiling run results #686

Closed: matt-graham closed this 5 months ago

matt-graham commented 2 years ago

We would like to be able to track how the timings measured in profiling runs of the src/scripts/profiling/scale_run.py script change as new pull requests are merged in. This would help identify when PRs lead to performance regressions and allow us to be more proactive in fixing performance bottlenecks.

Ideally this should be automated using GitHub Actions workflows. Triggering the workflow on pushes to master would give the most detail, as it directly measures the performance differences arising from each particular PR, but when lots of PRs are going in it could create a large backlog of profiling runs; an alternative would be to run on a schedule (for example nightly) using the cron event. It would probably also be worth allowing manual triggering, either via the workflow_dispatch event or via comment-triggered workflow functionality, so that PRs thought likely to have a significant effect on performance can be profiled before merging.

Key questions to be resolved are which profiling outputs we want to track (for example at what level of granularity, and using which profiling tool) and how we want to visualize them. One option would be to save the profiler output as a workflow artifact. While this would be useful for accessing the raw profiling data, the only way to access workflow artifacts appears to be downloading them as compressed zip files, so this is not in itself that useful for visualizing the output. One option for visualizing the profiling results would be to use the GitHub Actions job summary, which allows using Markdown to produce customized output shown on the job summary page. Another option would be to output the profiling results to HTML files and then deploy these to either a GitHub Pages site or potentially a static site on Azure storage.
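
As a rough illustration of the job-summary option, the sketch below appends a rendered profile to the Markdown file GitHub Actions exposes via the GITHUB_STEP_SUMMARY environment variable. The session filename is hypothetical, and it assumes pyinstrument's Session.load / ConsoleRenderer APIs:

```python
# Sketch: append a rendered pyinstrument profile to the GitHub Actions job
# summary. Assumes a session was previously saved as scale_run.pyisession
# (a hypothetical filename).
import os

from pyinstrument.renderers import ConsoleRenderer
from pyinstrument.session import Session

session = Session.load("scale_run.pyisession")
# Plain-text call tree, without ANSI colour codes, suitable for embedding in Markdown.
report = ConsoleRenderer(unicode=True, color=False).render(session)

# GITHUB_STEP_SUMMARY points at a Markdown file shown on the workflow run page.
fence = "`" * 3  # built programmatically to avoid a literal fence inside this snippet
with open(os.environ["GITHUB_STEP_SUMMARY"], "a") as summary:
    summary.write("## scale_run profiling results\n\n")
    summary.write(f"{fence}\n{report}\n{fence}\n")
```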

Potentially useful links

The airspeed velocity package allows tracking the results of benchmarks of Python packages over time and visualizing the results as plots in a web interface. While focused on suites of benchmarks, it also has support for running single benchmarks with profiling.
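
For context, asv discovers benchmarks by naming convention (methods prefixed with time_). A minimal, hypothetical benchmark module for the scale-run script might look roughly like the following; the exact command line for a shortened run is an assumption, not taken from the repository:

```python
# Hypothetical benchmarks/benchmarks.py for asv.
# asv collects methods named time_* and tracks their execution time over commits.
import subprocess
import sys


class ScaleRunSuite:
    # Long-running benchmark: run it once per measurement and relax asv's timeout.
    number = 1
    repeat = 1
    timeout = 3600

    def time_scale_run_short(self):
        # Assumption: scale_run.py can be invoked as a script; the arguments
        # needed to shorten the simulation are not shown here.
        subprocess.run(
            [sys.executable, "src/scripts/profiling/scale_run.py"],
            check=True,
        )
```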

htmlpreview allows directly previewing HTML files in a GitHub repository, which otherwise cannot be rendered in the browser because GitHub serves them with the "text/plain" content type.

willGraham01 commented 1 year ago

The developer onboarding says that we currently use pyinstrument to benchmark the scale_run script, so I thought I'd make a few quick comparisons against ASV:

ASV

pyinstrument

The maintainability (?) issue jumps out as something of a red flag to me, but asv otherwise looks to have slightly better features, at the cost of needing a dedicated machine. pyinstrument seems more flexible, however; it's fairly easy to write a pseudocode GH Actions workflow using it right away:


- Checkout repository
- Set up conda
- Set up the conda environment from the developer/user docs
- Install pyinstrument into the environment
- Run pyinstrument, producing an HTML output (and maybe a session output so we can reload it later); see the sketch after this list
- Push the HTML file somewhere? Maybe to a separate branch so we can manually view the files with htmlpreview?
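
A minimal sketch of the profiling step, assuming pyinstrument's Profiler API; run_simulation() is a hypothetical stand-in for whatever entry point scale_run.py actually exposes:

```python
# Sketch of the "run pyinstrument" step. run_simulation() is a placeholder
# for invoking the scale_run simulation; the real script's entry point and
# arguments may differ.
from pathlib import Path

from pyinstrument import Profiler


def run_simulation() -> None:
    ...  # placeholder for the scale_run simulation


profiler = Profiler()
profiler.start()
run_simulation()
profiler.stop()

output_dir = Path("profiling_results")
output_dir.mkdir(exist_ok=True)

# Interactive HTML report for viewing in a browser (e.g. via htmlpreview).
(output_dir / "scale_run.html").write_text(profiler.output_html())

# Raw session that can be reloaded later for further analysis.
profiler.last_session.save(str(output_dir / "scale_run.pyisession"))
```
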
willGraham01 commented 1 year ago

A couple of options (more details in this file)

The wgraham/asv-benchmark and wgraham/pyinstrument-profiling-ci branches have (locally working, still need to fix the broken tests!) implementations of both ASV and pyinstrument for the tasks above (on a 1-month-long simulation, so the results are produced in ~2 mins).

Opinions welcome: the github-pages branch of this repository is unused, so we can initially send the HTML outputs there for viewing.

matt-graham commented 1 year ago

Some notes from a meeting today between @tamuri, @willGraham01, and myself to discuss this issue

willGraham01 commented 1 year ago

Statistics to potentially capture (the kind of things to monitor):

- File sizes of the pyisession outputs

NOTE: Even a 1-month simulation produces a pyisession file that is ~300MB, which is well above GitHub's 100MB standard limit. We can either:
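
As a purely illustrative sketch of capturing the file-size statistic and flagging when it approaches GitHub's limit (the session path is hypothetical):

```python
# Illustrative only: record the size of the saved pyinstrument session file
# and warn if it exceeds GitHub's 100 MB standard per-file limit.
from pathlib import Path

GITHUB_FILE_LIMIT_BYTES = 100 * 1024**2

session_file = Path("profiling_results/scale_run.pyisession")  # hypothetical path
size_bytes = session_file.stat().st_size
print(f"{session_file} is {size_bytes / 1024**2:.1f} MB")

if size_bytes > GITHUB_FILE_LIMIT_BYTES:
    print("Warning: session file exceeds GitHub's 100 MB standard file limit")
```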

tamuri commented 8 months ago

At some point, we can move the profiling repo into the TLOmodel org (https://github.com/TLOmodel).

matt-graham commented 5 months ago

Closing this as the profiling workflow is now capturing statistics and working reliably.