MattTriano / analytics_data_where_house

An analytics engineering sandbox focusing on real estate prices in Cook County, IL
https://docs.analytics-data-where-house.dev/
GNU Affero General Public License v3.0

The `load_csv_data` task_group doesn't add "source_data_updated" or "ingestion_check_time" columns #12

Closed MattTriano closed 1 year ago

MattTriano commented 1 year ago

The implementation I ended up using for ingesting flat/CSV data simply COPYs the data from a file (via a streamreader), while the geospatial ingestion scheme ingests the data after reading it into GeoDataFrames, which made it very easy to add the columns named in the title (`source_data_updated` and `ingestion_check_time`).

I'd rather update the existing data, as I assume new data was published today, so I'll probably update my existing data warehouse manually and implement these fixes so the pipeline provides this behavior in stride.
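
One way the flat-file path could pick up the two columns without loading everything into a DataFrame is to append them to each row while streaming. The following is a minimal sketch of that idea, not the fix that was actually merged: the function name `add_metadata_columns` and its parameters are hypothetical, and only the two column names come from this issue.

```python
import csv
import io
from datetime import datetime, timezone
from typing import Iterable, Iterator


def add_metadata_columns(
    csv_lines: Iterable[str],
    source_data_updated: datetime,
    ingestion_check_time: datetime,
) -> Iterator[str]:
    """Yield CSV lines with two metadata columns appended to every row.

    Hypothetical helper: assumes the first line of the input is a header.
    """
    reader = csv.reader(csv_lines)
    header = next(reader, None)
    if header is None:
        return  # empty input, nothing to emit

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")

    def render(row: list) -> str:
        # Serialize one row with proper CSV quoting, then reset the buffer.
        writer.writerow(row)
        line = buffer.getvalue()
        buffer.seek(0)
        buffer.truncate(0)
        return line

    yield render(header + ["source_data_updated", "ingestion_check_time"])
    stamps = [source_data_updated.isoformat(), ingestion_check_time.isoformat()]
    for row in reader:
        yield render(row + stamps)


if __name__ == "__main__":
    raw = io.StringIO("pin,sale_price\n123,250000\n")
    updated = datetime(2023, 1, 1, tzinfo=timezone.utc)
    checked = datetime(2023, 1, 2, tzinfo=timezone.utc)
    for line in add_metadata_columns(raw, updated, checked):
        print(line, end="")
```

The rows are processed one at a time, so the file never has to fit in memory, and the resulting lines could be wrapped in a file-like object and handed to whatever COPY call the task already uses. The target table would need the two extra columns for that to work.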