Closed: eduardocorrearaujo closed this 1 year ago
I was not able to run the DAGs; the training took almost two hours until my machine froze. I'll run it again, but can you confirm whether this DAG really takes this long? If so, maybe we should split it into smaller tasks.
Another subject is logging for the methods: the DAGs provide an endpoint for the logs, but the logging calls have to be added in the code. I recommend using loguru because of the timestamps and some other features.
Yes, it takes really long to run on my machine too. Maybe instead of running all the cantons in a single function, we can create DAGs that run canton by canton.
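A minimal, Airflow-free sketch of the per-canton split being proposed (the canton list and `train_canton` helper are illustrative, not from the repo). In an actual DAG, the same loop would create one operator per canton, so one failing canton does not block, and can be retried independently of, the others:

```python
# Illustrative subset of Swiss canton codes (FL is the one that failed below).
CANTONS = ["GE", "FR", "FL"]

def train_canton(canton: str) -> str:
    """Placeholder for the per-canton training routine."""
    return f"trained model for {canton}"

# One task per canton instead of one monolithic function over all cantons.
results = {canton: train_canton(canton) for canton in CANTONS}
```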
I started running covid_dash_train_dag.py yesterday at 18h15, externally triggered after a small change I made in the ExternalTaskSensor (sorry, that was my bad):
external_task_ids=["done"]
-> external_task_id="done"
Another detail: the timeout variable is in seconds, not minutes, so it should be increased to 300 or more (the foph DAG takes approx. 3 minutes to finish).
I was able to test the external trigger by changing the "@monthly" schedule to */15 * * * * on both the covid and foph DAGs, and disabling the foph DAG after the first trigger to prevent it from triggering again.
The DAG ran for about 3 hours, consuming 100% of 12 CPU cores the whole time, with each task consuming about 3-6% of the CPU.
26 out of 27 trainings were successful. The failed one was train_new_hosp_FL, with the error IndexError: list index out of range.
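That IndexError usually means the code indexed into an empty sequence, e.g. a query for canton FL returning no rows. A defensive sketch with hypothetical names, so one data-poor canton is skipped instead of failing the task:

```python
def first_or_none(items):
    """Return the first element, or None when the sequence is empty,
    instead of raising IndexError: list index out of range."""
    return items[0] if items else None

rows = []  # e.g. a hypothetical query for canton FL that returned no data
value = first_or_none(rows)
if value is None:
    # Log and skip this canton rather than crashing the whole train task.
    pass
```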
In order to run it, I had to add a line of code in epigraphhub.analysis.forecast_models.ngboost_models.py:
if save:
    if path is not None:
        Path(path).mkdir(exist_ok=True, parents=True)
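For reference, a self-contained sketch of that guard in context: create the target directory (including parents) before writing the model file. The `save_model` helper, payload, and filename are hypothetical, used only to exercise the mkdir call:

```python
import tempfile
from pathlib import Path
from typing import Optional

def save_model(payload: bytes, path: Optional[str], filename: str) -> Optional[Path]:
    """Create the parent directory if needed, then write the model file."""
    if path is not None:
        # exist_ok=True makes the call idempotent across DAG reruns;
        # parents=True creates intermediate directories too.
        Path(path).mkdir(exist_ok=True, parents=True)
        target = Path(path) / filename
        target.write_bytes(payload)
        return target
    return None

base = Path(tempfile.mkdtemp()) / "models" / "covidch"
saved = save_model(b"model-bytes", str(base), "ngboost_GE.pkl")
```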
There is no need for the config.py file; you can replace the engine in each module with the lines below, if you have the ~/.config/epigraphhub.yaml credentials file configured (ref PR #169). The PATH variable can instead live in __init__.py or _config.py.
from epigraphhub.connection import get_engine
from epigraphhub.settings import env
engine = get_engine(env.db.default_credential)
I'll add some inline comments about missing imports as well. Thank you for your hard work here!
@luabida can you review again, so we can merge?
Merging this, finally.
In this PR I added two scripts:
covid_dash_train_dag.py: This code creates a DAG that will run every 2 months (only after the FOPH DAG runs) to train and save the ML models used to generate the forecasts shown in the dashboard.
covid_dash_for_dag.py: This code creates a DAG that will run every week after the FOPH DAG runs to generate the forecasts shown in the dashboard. This DAG will load the models saved by covid_dash_train_dag.py.
This code depends on some functions saved inside the dags folder, at the path:
Epigraphhub/containers/airflow/dags/scripts/dashboard/covidch
In Epigraphhub/containers/airflow/dags/scripts/dashboard/covidch/config.py I put a variable called PATH that is loaded by the DAGs and determines where the trained models should be saved. I don't know how I should define it to run on the server.
@luabida I would like to ask for your help verifying how I concatenated the tasks; I don't know if I did it right. I tried to follow your example in the open PR.