uace-azmet / azmet-forecast-qa

Developing QA/QC routines for AZMet

Move away from using `arrow` to save data #46

Closed Aariq closed 1 year ago

Aariq commented 1 year ago

From Will Landau:

If the eventual data is manageable enough, you could download to a temporary file and then return the in-memory data from a target so that it saves to the targets data store. Otherwise, each dataset could be its own target with `format = "file"`. It looks like your `db_init` and `db_updated` targets are trying to use a parquet file as a mutable database. This sort of thing is always hard in targets because each target is an immutable step in the DAG.
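A minimal sketch of the two options described above, as they might look in a `_targets.R` file. The helper functions, the URL, and the output path are hypothetical placeholders, not part of this repository.

```r
# _targets.R (sketch only; helper names, URL, and path are made up)
library(targets)

# Option 1: download to a temporary file, then return the data itself so
# that targets saves it in its own data store.
fetch_in_memory <- function(url) {
  tmp <- tempfile(fileext = ".csv")
  on.exit(unlink(tmp))          # clean up the temporary file afterwards
  download.file(url, tmp, quiet = TRUE)
  read.csv(tmp)
}

# Option 2: write the dataset to disk and return the path. With
# format = "file", targets tracks the file at that path.
fetch_to_file <- function(url, path) {
  dir.create(dirname(path), recursive = TRUE, showWarnings = FALSE)
  download.file(url, path, quiet = TRUE)
  path                          # a "file" target must return its path(s)
}

list(
  tar_target(
    weather,
    fetch_in_memory("https://example.com/weather.csv")
  ),
  tar_target(
    weather_file,
    fetch_to_file("https://example.com/weather.csv", "data/weather.csv"),
    format = "file"
  )
)
```

Either way, each target is written once per run rather than appended to, which avoids treating a file as a mutable database.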

Instead of using `arrow` to make a "mutable database" outside of `_targets`, I could use branching in `targets` to create a separate target for each month of data. The slow step is querying the API with `azmetr`, but querying a single month shouldn't take that long, even for hourly data. The hard part, I think, is figuring out how to invalidate the right targets when new data arrives.
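One way the per-month idea could be sketched with dynamic branching is shown below. This is an untested outline under several assumptions: `month_ranges()` and `fetch_month()` are hypothetical helpers, the start date is made up, and the body of `fetch_month()` is a stand-in for the real `azmetr` query. The idea is that the row for the current month has an end date that changes every day, so that branch should be seen as outdated, while rows for completed months stay identical and, as I understand dynamic branching, should be skipped.

```r
# _targets.R (sketch only; helper names and dates are hypothetical)
library(targets)

# One row per month, from `first` up to the current month. Completed months
# get their true last day; the current month ends at `today`.
month_ranges <- function(first = as.Date("2021-01-01"), today = Sys.Date()) {
  start <- seq(first, as.Date(format(today, "%Y-%m-01")), by = "month")
  end <- c(start[-1] - 1, today)
  data.frame(start = start, end = end)
}

# Placeholder for the slow API call. A real version would query the API
# (for example with azmetr) for the dates between range$start and range$end.
fetch_month <- function(range) {
  data.frame(date = seq(range$start, range$end, by = "day"))
}

list(
  # Recompute the month table on every run so a new day or month is noticed.
  tar_target(
    months,
    month_ranges(),
    cue = tar_cue(mode = "always")
  ),
  # One branch per row of `months`.
  tar_target(
    monthly_data,
    fetch_month(months),
    pattern = map(months)
  )
)
```

A downstream target that uses `monthly_data` without a pattern would receive all branches combined, which could replace the single parquet file. This does not handle late corrections to data in already-completed months; those would need a separate invalidation rule.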