dataverbinders / nl-open-data

A Flexible Python ETL toolkit for datawarehousing framework based on Dask, Prefect and the pydata stack
https://dkapitan.github.io/nl-open-data
MIT License

Run full CBS core and external flows #108

Open galamit86 opened 2 years ago

galamit86 commented 2 years ago

To update our CBS database, both run_statline_core and run_statline_external need to be run.

galamit86 commented 2 years ago

Running the full CBS core on a VM does not work - flows are deployed but never start. The cluster appears unresponsive from the Prefect side.

We are possibly overloading the machine. The new Prefect plan allows for concurrency, which we did not have in the past. Back then, scheduling all 450 flows at once did not matter for execution, since they still ran 1-by-1.

Now, it's possible that trying to run a few hundred flows at once causes failures.
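
If overloading is indeed the cause, one thing to try is capping how many flows run at the same time. Below is a minimal sketch of that idea using only the standard library. It does not use the Prefect API: `run_with_limit`, `run_flow`, `max_concurrent` and the flow ids are hypothetical names for illustration, and `run_flow` stands in for whatever call actually triggers a single flow and waits for it to finish.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def run_with_limit(
    flow_ids: Iterable[str],
    run_flow: Callable[[str], T],
    max_concurrent: int = 4,
) -> List[T]:
    """Call run_flow(flow_id) for every id, at most max_concurrent at a time.

    run_flow is a placeholder (an assumption, not a Prefect function): it
    should trigger one flow and block until that flow has finished.
    Results are returned in the same order as flow_ids.
    """
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    # The pool size is the concurrency cap: a new flow only starts
    # once a worker thread is free.
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(run_flow, flow_ids))


if __name__ == "__main__":
    import time

    def fake_run_flow(flow_id: str) -> str:
        # Stand-in for a real flow run.
        time.sleep(0.1)
        return f"{flow_id}: done"

    ids = [f"flow-{i}" for i in range(10)]
    for line in run_with_limit(ids, fake_run_flow, max_concurrent=3):
        print(line)
```

With `max_concurrent=1` this reproduces the old 1-by-1 behaviour, so raising the value step by step could help find the point where the VM stops coping.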