ImperialCollegeLondon / paricia

Hydroclimatic data collection and information system
https://imperialcollegelondon.github.io/paricia/
BSD 3-Clause "New" or "Revised" License

Use `huey` to schedule data ingestion #286

Closed dalonsoa closed 1 week ago

dalonsoa commented 2 weeks ago

Data ingestion into the database can be very slow, especially for large datasets. Setting up a task queue like huey to run the ingestion in the background will improve the user experience, since the user no longer has to wait for the import to finish.

This should:

  1. Add a status field and an ingestion log field to the DataImport model that communicate the status of the ingestion to the user.
  2. Move most of the contents of DataImport.clean to a separate function that is set up as a @db_task and that updates the
  3. Implement a post_save signal for the DataImport model that actually loads the data into the db asynchronously by calling this @db_task. This will update the fields indicated in 1 based on the status of the loading.
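
To make the three steps concrete, here is a rough, framework-free sketch of the bookkeeping. It is not the actual implementation: a dataclass stands in for the Django model, a `ThreadPoolExecutor` stands in for the huey consumer, and the status codes, the `log` field name, `on_post_save` and `count_rows` are all made up for illustration. In the real thing, `ingest` would be the function decorated with `@db_task()` and `on_post_save` would be the `post_save` receiver.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable

# Hypothetical status codes; the issue only asks for "a status field".
NOT_QUEUED, QUEUED, COMPLETED, FAILED = "N", "Q", "C", "F"


@dataclass
class DataImport:
    """Stand-in for the Django model, reduced to the two new fields."""

    raw_data: str
    status: str = NOT_QUEUED
    log: str = ""


def ingest(data_import: DataImport, load: Callable[[str], int]) -> None:
    """Body of the background task (would carry @db_task() under huey)."""
    try:
        n_rows = load(data_import.raw_data)
    except Exception as exc:
        # Record the failure so the user can see what went wrong.
        data_import.status = FAILED
        data_import.log = f"{type(exc).__name__}: {exc}"
    else:
        data_import.status = COMPLETED
        data_import.log = f"Ingested {n_rows} rows"


def on_post_save(data_import, load, executor):
    """Plays the role of the post_save receiver: enqueue once and return."""
    if data_import.status != NOT_QUEUED:
        # Guard against re-queuing when the task itself saves the object.
        return None
    data_import.status = QUEUED
    return executor.submit(ingest, data_import, load)


def count_rows(raw: str) -> int:
    """Toy loader: counts non-empty lines and rejects empty input."""
    rows = [line for line in raw.splitlines() if line.strip()]
    if not rows:
        raise ValueError("no rows found")
    return len(rows)


def run_demo():
    """Queue one good and one bad import, then wait for both to finish."""
    ok, bad = DataImport("1,2\n3,4\n"), DataImport("")
    with ThreadPoolExecutor(max_workers=1) as pool:
        on_post_save(ok, count_rows, pool)
        on_post_save(bad, count_rows, pool)
    # Leaving the with block waits for the queued jobs.
    return ok, bad
```

The guard in `on_post_save` is there because, with a real `post_save` signal, the task saving the updated status would fire the signal again. Checking the status (or the `created` flag) is one way to avoid queuing the same import twice.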

Obviously, huey should be properly added as a dependency and configured.
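
For the configuration part, a possible starting point using huey's Django integration (`huey.contrib.djhuey`) is sketched below. The storage backend, queue name and worker count are assumptions, not decisions made in this issue, and the two Django apps are listed only so the fragment stands on its own.

```python
# settings.py (fragment). All values are illustrative assumptions.
INSTALLED_APPS = [
    "django.contrib.contenttypes",
    "django.contrib.auth",
    "huey.contrib.djhuey",  # provides the run_huey management command
]

HUEY = {
    "huey_class": "huey.SqliteHuey",  # assumed backend; huey.RedisHuey is another option
    "name": "paricia",  # assumed queue name
    "immediate": False,  # False: tasks are queued for the consumer, not run inline
    "consumer": {"workers": 1},  # assumed worker count
}
```

With this in place, tasks are imported from `huey.contrib.djhuey` (e.g. `from huey.contrib.djhuey import db_task`) and the consumer is started with `python manage.py run_huey`.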