JustFixNYC / nycdb-k8s-loader

Loading and updating of NYC-DB data via containerized batch processing.

Only download and load a dataset if it has changed #10

Closed toolness closed 5 years ago

toolness commented 5 years ago

This adds a mechanism to ensure that we only download and load a dataset if the server says it has been modified since we last downloaded it.

With this in place, we can trigger even the monthly and yearly jobs daily, so they are updated as soon as a new version of a dataset becomes available, without incurring many unnecessary hours of compute time. This is useful because many of these datasets do not appear to be updated on a fixed schedule.
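As a rough illustration of the idea (not the code in this PR), the check could look something like the sketch below. It assumes the server exposes a `Last-Modified` header and that we keep the previously seen value per URL somewhere. The names `fetch_last_modified`, `should_download`, and the `previous` mapping are hypothetical and exist only for this example.

```python
import urllib.request
from typing import Callable, Dict, Optional, Tuple


def fetch_last_modified(url: str) -> Optional[str]:
    """Return the Last-Modified header for `url` via a HEAD request, if any."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return response.headers.get("Last-Modified")


def should_download(
    url: str,
    previous: Dict[str, str],
    get_last_modified: Callable[[str], Optional[str]] = fetch_last_modified,
) -> Tuple[bool, Optional[str]]:
    """Decide whether a dataset needs to be downloaded and loaded again.

    `previous` maps each URL to the Last-Modified value recorded after the
    last successful load (hypothetical storage; a DB table would also work).
    Returns (needs_download, current_last_modified).
    """
    current = get_last_modified(url)
    if current is None:
        # The server gives no modification info, so download to be safe.
        return True, None
    # Download only if the value differs from what we saw last time.
    return previous.get(url) != current, current
```

A daily job would call `should_download` for each dataset URL, skip the ones that return `False`, and record the returned value only after a load succeeds, so that a failed load is retried on the next run.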

To do