appsembler / figures

Reporting and data retrieval app for Open edX
MIT License
44 stars 37 forks source link

run separate tasks per site in daily jobs #470

Closed melvinsoft closed 2 years ago

melvinsoft commented 2 years ago

Change description

This PR introduces a new version of the function populate_daily_metrics, this time what it does, instead of running the whole process inside a unique celery tasks, it creates a single tasks per each site and throw them in a celery chord. I know the change is no the ultimate solution, since site can have really different sizes, but still, having separate celery tasks per site, will allow us for example, to run production deployments at any time.

I'm also planning to to create a new queue for figures with a good concurrency (around 20) which on top on John's recent improvements (lower down the processing time from 10 to 2hs) should make the pipeline really fast to complete.

The PR also adds a new version of populate_daily_metrics_for_site called populate_daily_metrics_for_site_async_sites.

I already tested the code in Staging pulling it manually and it works, but I'd like to configure the new function and have it running for a few days in Staging.

Type of change

Related issues

No relates issues

Checklists

Development

Security

Code review

johnbaldwin commented 2 years ago

@melvinsoft I only skimmed over it. Looks reasonable but I want to look at it again in the morning when I'm fresh.

One thing we will need to add is test coverage for the new task functions. Tests for the current daily functions is in here: https://github.com/appsembler/figures/blob/main/tests/tasks/test_daily_tasks.py

This also include logging checking with caplog (just search the test module for it), which you will probably want to test daily_metrics_callback

Thanks!

OmarIthawi commented 2 years ago

Thanks @johnbaldwin. I'll be continuing the work on this feature in #471. Maxi is handing over this and other celery work to me.

One thing we will need to add is test coverage for the new task functions. Tests for the current daily functions is in here: https://github.com/appsembler/figures/blob/main/tests/tasks/test_daily_tasks.py

This also include logging checking with caplog (just search the test module for it), which you will probably want to test daily_metrics_callback

Got it!