ccao-data / data-architecture

Codebase for CCAO data infrastructure construction and management
https://ccao-data.github.io/data-architecture/

Create IDOT features #620

Open Damonamajor opened 1 month ago

Damonamajor commented 1 month ago

This PR takes the output of #617 and transforms it into proximity features. Basic steps:

Ongoing questions:

Damonamajor commented 5 days ago

@wrridgeway

Questions:

wrridgeway commented 4 days ago

> At the moment, highways are coded as highway_roads. This doesn't make sense linguistically, but it keeps the columns in the same structure and makes the road columns easy to find. For example: prox_nearest_highway_road_dist_ft. Thoughts?

I'd actually prefer a road prefix rather than suffix for all the roads, highway included.
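
A minimal sketch of the rename discussed above, assuming the affected columns follow the pattern nearest_<type>_road_<measure> with a single-word road type. The function name, the regular expression, and the resulting column names are illustrative guesses at what "road prefix" could mean, not names taken from the PR.

```python
import re

# Assumed pattern: ...nearest_<type>_road_<measure>, for example
# prox_nearest_highway_road_dist_ft. This is inferred from the two column
# names quoted in the thread, not from the PR's actual column list.
_SUFFIX_PATTERN = re.compile(
    r"^(?P<head>.*?nearest_)(?P<type>[a-z]+)_road_(?P<rest>.+)$"
)


def road_suffix_to_prefix(column: str) -> str:
    """Move `road` in front of the road type; leave other columns unchanged."""
    match = _SUFFIX_PATTERN.match(column)
    if match is None:
        # Not a road column under the assumed pattern, so do not touch it.
        return column
    return f"{match['head']}road_{match['type']}_{match['rest']}"


if __name__ == "__main__":
    for name in ("prox_nearest_highway_road_dist_ft", "nearest_collector_road_lanes"):
        print(name, "->", road_suffix_to_prefix(name))
```

Columns that do not match the assumed pattern are returned as-is, so the helper could be mapped over a full column list without affecting non-road features.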

> The features added to model.vw_pin_shared_input.sql are coded as nearest_collector_road_lanes. Do we want to keep them in proximity, since they describe proximity to the nearest road, or move them to another chunk (maybe environment, or its own chunk)?

I'm fine with it as is.

> The initial dataset was named traffic. It makes sense to rename it to roads now, but the workflow is wonky with AWS uploads. Do we want to rename it in a separate PR?

I realize it's a pain in the ass, but getting the names right before we merge things into master is preferable. I'm happy to run anything if you need me to.

> A lot of highways have 1 lane. Do we want to filter to highways with more than 1 lane, since the 1-lane segments are mostly service lanes on the side of the road?

Let's leave it for now. Once we're done here, we can open a new PR to investigate this kind of additional processing.
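
For that follow-up, a rough sketch of how the single-lane question could be checked before deciding on a filter. The field names road_type and lanes are hypothetical stand-ins for whatever the road dataset actually uses, and the rows are made up for illustration.

```python
from collections import Counter


def lane_count_summary(segments, road_type="highway"):
    """Count segments of one road type by their number of lanes."""
    return Counter(
        seg["lanes"] for seg in segments if seg["road_type"] == road_type
    )


def drop_single_lane(segments, road_type="highway"):
    """Drop segments of `road_type` with one lane or fewer; keep all others."""
    return [
        seg
        for seg in segments
        if not (seg["road_type"] == road_type and seg["lanes"] <= 1)
    ]


if __name__ == "__main__":
    # Made-up rows for illustration only; not taken from the IDOT data.
    toy = [
        {"road_type": "highway", "lanes": 1},
        {"road_type": "highway", "lanes": 3},
        {"road_type": "collector", "lanes": 1},
    ]
    print(lane_count_summary(toy))
    print(drop_single_lane(toy))
```

Looking at the lane-count summary first would show how many highway segments the filter removes, which is the kind of evidence the new PR would need.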