dataverbinders / nl-open-data

A Flexible Python ETL toolkit for datawarehousing framework based on Dask, Prefect and the pydata stack
https://dkapitan.github.io/nl-open-data
MIT License

Clean-up task not working properly #79

Closed galamit86 closed 3 years ago

galamit86 commented 3 years ago

The tmp folder holding the JSON and/or Parquet files is always left on disk after upload_to_gcs finishes. Sometimes it is still full of files, which take up space and will eventually cause an error.

clean_up_task (or remove_tree in a previous attempt) is supposed to take care of this, but its current implementation does not do so reliably.
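
A minimal sketch of one way the cleanup could be made reliable, using only the standard library. It is not the project's actual code: `run_with_tmp_dir`, `write_files` and `upload` are hypothetical names standing in for the steps that write the files and call upload_to_gcs. The idea is to put the removal in a `finally` block so the folder is deleted whether the upload succeeds or fails.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, List


def run_with_tmp_dir(
    write_files: Callable[[Path], List[Path]],
    upload: Callable[[List[Path]], None],
) -> Path:
    """Write files to a fresh temp folder, upload them, then always delete the folder.

    Both callables are hypothetical placeholders for the real tasks.
    Returns the (now removed) folder path so callers can verify the cleanup.
    """
    tmp_dir = Path(tempfile.mkdtemp(prefix="tmp_upload_"))
    try:
        files = write_files(tmp_dir)  # e.g. dump JSON / Parquet files
        upload(files)                 # e.g. the upload_to_gcs step
    finally:
        # Runs on success and on failure; rmtree removes the folder
        # together with everything still inside it.
        shutil.rmtree(tmp_dir)
    return tmp_dir
```

Errors from `shutil.rmtree` are deliberately not suppressed here, so a failed cleanup shows up in the logs instead of silently leaving files behind.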