appsembler / figures

Reporting and data retrieval app for Open edX
MIT License

run separate tasks per site in daily jobs #471

Closed. OmarIthawi closed 1 year ago

OmarIthawi commented 1 year ago

Change description (re-opens #470)

This PR introduces a new version of the function populate_daily_metrics. Instead of running the whole process inside a single Celery task, it creates one task per site and runs them in a Celery chord. I know this change is not the ultimate solution, since sites can differ greatly in size, but having a separate Celery task per site will allow us, for example, to run production deployments at any time.

I'm also planning to create a new queue for Figures with good concurrency (around 20), which, on top of John's recent improvements (which lowered the processing time from 10 hours to 2), should make the pipeline really fast to complete.

The PR also adds a new version of populate_daily_metrics_for_site called populate_daily_metrics_for_site_async_sites.

I have already tested the code in Staging by pulling it manually, and it works, but I'd like to configure the new function and have it run for a few days in Staging.

Type of change

Related issues

No related issues

Checklists

Development

Security

Code review

johnbaldwin commented 1 year ago

Re-posting my comment from #470:

One thing we will need to add is test coverage for the new task functions. Tests for the current daily functions are here: https://github.com/appsembler/figures/blob/main/tests/tasks/test_daily_tasks.py

This also includes log checking with caplog (just search the test module for it), which you will probably want to use to test daily_metrics_callback.
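
As a rough illustration of the caplog approach, here is a sketch of such a test. The callback body, the logger name, and the log messages are hypothetical stand-ins, since the real daily_metrics_callback may log differently. Only caplog itself (pytest's built-in log-capturing fixture) is assumed to match the existing tests.

```python
import logging

logger = logging.getLogger("figures.tasks.sketch")  # hypothetical logger name


def daily_metrics_callback(results, date_for):
    """Hypothetical callback body: log a summary of the per-site results."""
    failed = [r["site_id"] for r in results if not r["ok"]]
    logger.info("daily metrics for %s: %d sites processed", date_for, len(results))
    if failed:
        logger.error("daily metrics for %s: failed sites %s", date_for, failed)


def test_daily_metrics_callback_logs_failed_sites(caplog):
    """pytest test; caplog captures log records emitted during the test."""
    caplog.set_level(logging.INFO, logger=logger.name)
    results = [{"site_id": 1, "ok": True}, {"site_id": 2, "ok": False}]

    daily_metrics_callback(results, "2021-06-01")

    levels = [rec.levelname for rec in caplog.records]
    assert levels == ["INFO", "ERROR"]
    assert "failed sites [2]" in caplog.text
```

Asserting on both the record levels and the message text keeps the test sensitive to a callback that silently stops reporting failed sites.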

OmarIthawi commented 1 year ago

Closing for now, but the task is still open: https://appsembler.atlassian.net/browse/RED-3411