epiforecasts / covidregionaldata

An interface to subnational and national level COVID-19 data. For all countries supported, this includes a daily time-series of cases. Wherever available we also provide data on deaths, hospitalisations, and tests. National level data is also supported using a range of data sources as well as linelist data and links to intervention data sets.
https://epiforecasts.io/covidregionaldata/
Other
37 stars 18 forks source link

Run country-specific tests when relevant files are modified #466

Closed Bisaloo closed 2 years ago

Bisaloo commented 2 years ago

(I may still be misunderstanding the complex CI system used here so please let me know if I completely wrong about this issue)

Running all country tests is computationally intensive and prone to errors because of temporary outage. For this reason, the choice has been made to run only generic checks on pull requests and to run country specific tests on a schedule.

I believe this approach is suboptimal because it can lead to erroneous changes falling through the cracks. We then have to wait until someone has time to look at the failing checks and investigate the cause of the issue to get it to work again. This means that some countries can stay in a broken state for a potentially long period of time.

A more robust approach IMO would be to run the relevant country-specific tests on pushes & PR based on which files have been modified.

github-actions[bot] commented 2 years ago

This issue has been flagged as stale due to lack of activity

seabbs commented 2 years ago

Thanks Hugo,

I agree it is nice to have country-specific changes being tested against the downloaded data, so thanks for adding this. The tagging files touched aspect of github actions was new to me. I would rather you didn't unilaterally merge PRs without review, especially when there is no detail of the work done in the PR. If easier we can talk about this via slack etc.

The original motivation for this was not to reduce computation but to deal with potential issues with downloading data. Having countries on a timer is there to check if the upstream source has changed.

The issue with the changes you have merged is now if a dataset is broken due to downloading issues (i.e maybe it can't be accessed) then all changes to shared methods etc will be flagged as failing CI. I would rather this was not the case as in the past this essentially made the whole system break and become completely useless. This was the original motivation for testing against downloaded snapshots and one of the primary motivators for having a complex CI system to begin with. Very open to discussion on this of course.