MattTriano/analytics_data_where_house
An analytics engineering sandbox focusing on real estate prices in Cook County, IL
https://docs.analytics-data-where-house.dev/
GNU Affero General Public License v3.0
9 stars
0 forks
issues
Updates are available for Airflow, pgAdmin, and dbt
#51
MattTriano
closed
1 year ago
0
Add (mermaid.js) diagrams to the docs that show system architecture
#50
MattTriano
closed
1 year ago
0
Integrate OpenMetadata with the platform
#49
MattTriano
opened
1 year ago
2
Implement a strategy for dropping "temp_" tables
#48
MattTriano
closed
1 year ago
1
Updates makefile recipes and appearances of command 'docker-compose' in docs.
#47
MattTriano
closed
1 year ago
0
Update makefile to use v2.x.x style `docker compose` commands and test behavior
#46
MattTriano
closed
1 year ago
0
Set expectations for chicago homicide data
#45
MattTriano
closed
1 year ago
0
Configure the system to use the jupyter lab IDE instead of the notebook interface for expectation dev
#44
MattTriano
closed
1 year ago
0
Develop a suite of expectations for Chicago Homicide and Shooting Victimization data
#43
MattTriano
closed
1 year ago
0
Set expectations for data raw cc parcel value assessments
#42
MattTriano
closed
1 year ago
0
Make suite of expectations for data_raw cook_county_parcel_value_assessments data
#41
MattTriano
closed
1 year ago
0
Add docs site
#40
MattTriano
closed
1 year ago
0
Further simplify setup process and update documentation
#39
MattTriano
closed
1 year ago
0
Add a proper documentation site and significantly tighten the focus of the README
#38
MattTriano
closed
1 year ago
1
Explore open source self-service analytics/visualization/BI tools
#37
MattTriano
closed
1 year ago
4
Integrate great expectations
#36
MattTriano
closed
1 year ago
0
Evaluate open source data catalog options for integration into this platform
#35
MattTriano
opened
1 year ago
6
Should the system leave out the great_expectations anonymous_usage_statistics identifiers?
#34
MattTriano
opened
1 year ago
1
Explore workflows for integrating more rigorous data validation and monitoring steps
#33
MattTriano
closed
1 year ago
2
Simplify socrata table metadata init
#32
MattTriano
closed
1 year ago
0
Implement a collector for FCC broadband data to explore internet availability by location
#31
MattTriano
opened
1 year ago
0
Trimming dev code out of chicago_traffic_crashes DAG.
#30
MattTriano
closed
1 year ago
0
The README's images are out of date
#29
MattTriano
closed
1 year ago
0
Refactor socrata table
#28
MattTriano
closed
1 year ago
0
Explore the feasibility of a task that automatically generates the dbt model that identifies new or updated records in the data_raw stage
#27
MattTriano
closed
1 year ago
0
Sort out issues with permissions errors involving directories and files created by Airflow
#26
MattTriano
closed
1 year ago
0
Dev cleaning transform pattern
#25
MattTriano
closed
1 year ago
0
Refactor SocrataTable class into centralized location and import/load them
#24
MattTriano
closed
1 year ago
0
Decide on the number of data cleaning stages and conditions to meet at each stage
#23
MattTriano
closed
1 year ago
1
Develop tests for socrata table metadata
#22
MattTriano
closed
1 year ago
0
Removes experiments from active DAGs and adds Streets_Center_Lines ELT pipeline.
#21
MattTriano
closed
1 year ago
0
Simpler still startup
#20
MattTriano
closed
1 year ago
0
Implement a first_startup makefile target to handle all manual one-time initialization steps
#19
MattTriano
closed
1 year ago
0
Streamline setup
#18
MattTriano
closed
1 year ago
0
Idempotent dbt staging
#17
MattTriano
closed
1 year ago
0
Explore WHERE NOT EXISTS behavior when the subquery compares columns with null values
#16
MattTriano
closed
1 year ago
2
geopandas' `to_postgis()` throws an error when attempting to load missing geometry values
#15
MattTriano
closed
1 year ago
0
Investigate dbt's implicit logic for creating a table when the target table doesn't exist
#14
MattTriano
closed
1 year ago
1
Reimplements data_raw.<table> creation task to also create lineage columns
#13
MattTriano
closed
1 year ago
0
The `load_csv_data` task_group doesn't add "source_data_updated" or "ingestion_check_time" columns
#12
MattTriano
closed
1 year ago
0
Integrate dbt into the workflow to add T to the existing EL framework
#11
MattTriano
closed
1 year ago
1
Add makefile + recipes for creating the .env files, starting the docker-compose app, cleaning up, etc.
#10
MattTriano
closed
1 year ago
0
Implement ingestion dags
#9
MattTriano
closed
1 year ago
0
Reimplement socrata dags using the much DRYer implementation
#8
MattTriano
closed
1 year ago
0
Engineer more efficient data loading functions for larger files
#7
MattTriano
closed
1 year ago
2
Implement DAG to clean up downloaded files and XComs in Airflow metadata database
#6
MattTriano
closed
1 year ago
2
Implement basic socrata ingestion
#5
MattTriano
closed
1 year ago
0
Implement utils to handle conditionally ingesting data
#4
MattTriano
closed
1 year ago
1
Refactor project organization to produce smaller contexts for each docker image
#3
MattTriano
closed
2 months ago
0
Create a database table to track table metadata
#2
MattTriano
closed
1 year ago
1