usdot-jpo-ode / wzdx

The Work Zone Data Exchange (WZDx) Specification aims to make harmonized work zone data provided by infrastructure owners and operators (IOOs) available for third party use, making travel on public roads safer and more efficient through ubiquitous access to data on work zone activity.
Creative Commons Zero v1.0 Universal

Metadata File #10

Closed bdelsey64 closed 4 years ago

bdelsey64 commented 5 years ago

Consider allowing the metadata to be embedded within the header as an alternative to providing the file location URL. The amount of metadata is small enough that it can be conveniently included, eliminating the need to make a separate request for the data.

j-d-b commented 4 years ago

I agree. This data isn't really different from other data on the road_event_feed_info. For example, issuing_organization in the metadata is the same content as the issuing_organization field on the road event.

I'm for adding all the fields from the metadata to the road_event_feed_info, either by:

  1. Allowing an object (in addition to the string URL as is currently allowed) on the metadata property; or
  2. Moving the fields from the metadata entity directly to the road_event_feed_info

Either way, the current method of providing a separate metadata file through a string/URL would still be supported.

If we want to break backwards compatibility and be a bit cleaner, we could go with option 1. but disallow the URL string value, or option 2. + removing the metadata field.
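As a rough illustration of what option 1 would mean for a data consumer, the sketch below accepts either form of the metadata property. It is only a sketch: `resolve_metadata` and the injected `fetch` callable are hypothetical names used for illustration and are not part of the specification.

```python
import json
from typing import Any, Callable, Dict, Union


def resolve_metadata(
    feed_info: Dict[str, Any],
    fetch: Callable[[str], str],
) -> Dict[str, Any]:
    """Return the feed metadata as a dict, whichever form the feed uses.

    `fetch` is a hypothetical callable that downloads a URL and returns the
    response body as text. It is only called for the string/URL form.
    """
    metadata: Union[str, Dict[str, Any]] = feed_info["metadata"]
    if isinstance(metadata, dict):
        # Embedded object (option 1): no second request is needed.
        return metadata
    if isinstance(metadata, str):
        # String/URL form (current behavior): fetch the separate file.
        return json.loads(fetch(metadata))
    raise TypeError("metadata must be a URL string or an object")
```

Passing `fetch` in as an argument keeps the sketch independent of any particular HTTP library.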

j-d-b commented 4 years ago

Upon review, the fields on the metadata entity need to be reexamined; I think these were missed in the other v2 refactoring. This should definitely be completed for v3, as the entity is currently unclear and messy.

wz_location_method, for example, is missing from the enumerated types and also is not congruent with the "road event" terminology introduced in v2. Note this may have a relation to #78.

Since refactoring is essential already, I'd suggest option 2 above and we can decide which fields should be kept/added to the road_event_feed_info. We can then include this information in the JSON schema which is helpful for completeness.

All members, please review the metadata fields, determine any that need to be renamed/removed, as well as any that could be clarified/have an enumerated type used rather than unstructured text.

On my initial review:

  1. datafeed_frequency_update could be required to be given in seconds and thus be an integer value rather than text (this greatly helps parsing). As it will be moved to the road_event_feed_info and backwards compatibility is already lost in several ways, I'd suggest renaming it to feed_update_frequency.
  2. timestamp_metadata_update should be removed as the metadata is no longer separate from the feed
  3. wz_location_method needs to be renamed, at least, but data producers should look at this and make sure the values make sense.
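
To illustrate why an integer number of seconds helps parsing, here is a minimal sketch of a consumer working out when to poll next. It assumes the proposed feed_update_frequency field holds integer seconds; `next_poll_time` is a hypothetical helper, not something the specification defines.

```python
from datetime import datetime, timedelta, timezone


def next_poll_time(feed_update_date: datetime, feed_update_frequency: int) -> datetime:
    """Earliest time a consumer could expect fresh data.

    Assumes feed_update_frequency is an integer number of seconds, as
    suggested in item 1 above (a proposed field, not a released one).
    """
    if feed_update_frequency <= 0:
        raise ValueError("feed_update_frequency must be a positive number of seconds")
    return feed_update_date + timedelta(seconds=feed_update_frequency)


last_update = datetime(2016, 11, 3, 19, 37, tzinfo=timezone.utc)
print(next_poll_time(last_update, 300).isoformat())  # 2016-11-03T19:42:00+00:00
```

With free text such as "every 5 minutes", the same calculation would first need a text parser.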

CraigMoore-Sea commented 4 years ago

Agree that these two tables are duplicating information. Consolidating to one location makes sense. We may need to think about where some of these verification values lie, since different records may have different resolutions or methodologies. Some of these may belong at the event/activity level.

DeraldDudley commented 4 years ago

Before advancing the metadata issue the Working Group needs to determine a business rule.

Is it acceptable to combine singular WZD feeds into a larger WZD Feed? E.g. Is it okay for States to combine county feeds into a larger feed?

Why? It affects how metadata is modeled. If yes, then the metadata needs to distinguish between feed publishers and feed sources. E.g. A State is the feed publisher and counties are the feed sources.

If no, then road_event_feed_info contains metadata about a single publisher and there is no need to distinguish between publisher and source.

Option 1: Single source feeds.

  • Keep feed_info_id in road_event_feed_info – No Change
  • Keep feed_update_date in road_event_feed_info – No Change
  • Keep version in road_event_feed_info – No Change

  • Insert issuing_organization into road_event_feed_info
  • Insert datafeed_frequency_update into road_event_feed_info
  • Insert contact_name into road_event_feed_info
  • Insert contact_email into road_event_feed_info
  • Insert location_verify_method into road_event_feed_info
  • Insert wz_location_method into road_event_feed_info
  • Insert lrs_type into road_event_feed_info
  • Insert lrs_url into road_event_feed_info

  • Drop metadata from road_event_feed_info
  • Drop issuing_organization from road_event (Captured in feed_info table)
  • Drop timestamp_metadata_update (Captured in feed_update_date)
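
The single-source option above could be sketched as a small migration step. This is an illustration only: `merge_metadata` and `MOVED_FIELDS` are hypothetical names, and both inputs are assumed to be plain dicts whose keys are the field names listed above.

```python
from typing import Any, Dict

# Metadata fields that the single-source option inserts into road_event_feed_info.
MOVED_FIELDS = (
    "issuing_organization",
    "datafeed_frequency_update",
    "contact_name",
    "contact_email",
    "location_verify_method",
    "wz_location_method",
    "lrs_type",
    "lrs_url",
)


def merge_metadata(feed_info: Dict[str, Any], metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new road_event_feed_info with the metadata fields folded in."""
    # Copy everything except the metadata URL, which is dropped.
    merged = {key: value for key, value in feed_info.items() if key != "metadata"}
    for field in MOVED_FIELDS:
        if field in metadata:
            merged[field] = metadata[field]
    # timestamp_metadata_update is deliberately not copied:
    # feed_update_date already captures it.
    return merged
```

The function returns a new dict rather than modifying its input, so the original feed info stays available for comparison.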

Option 2: Multiple source feeds. Design the model to distinguish Feed Publisher metadata from Feed Source metadata.

sknick-iastate commented 4 years ago

I think there is a need to distinguish between feed publishers and feed sources. There are examples of agencies doing this outside of the WZDx. Maricopa County has shown examples of aggregating data for cities/unincorporated areas, and I think the RTC in Las Vegas is doing something similar with their Seeing Orange.

This hasn't been discussed previously, but I think there will be a need for data aggregators/publishers at some level (whether it be the county, state, etc.). It would be good to get data consumer feedback, but I assume they would prefer not to have to go to all 99 counties in Iowa, or the even greater number of cities, to get this data, assuming widespread adoption.

DeraldDudley commented 4 years ago

Draft metadata solution accommodating multiple data sources per feed.

geojson example illustrating proposed metadata changes: https://gist.github.com/DeraldDudley/21f65d04dea437d76d1c593cc181644e

Updated Road Event Feed Info object road_event_feed_info

Proposed Field Proposed Table Original Field Original Table
feed_info_id road_event_feed_info feed_info_id road_event_feed_info
feed_publisher road_event_feed_info issuing_organization metadata
feed_contact_name road_event_feed_info contact_name metadata
feed_contact_email road_event_feed_info contact_email metadata
feed_update_frequency road_event_feed_info datafeed_frequency_update metadata
feed_update_datetime road_event_feed_info feed_update_date road_event_feed_info

New Road Event Source Info object road_event_source_info

Proposed Field Proposed Table Original Field Original Table
source_info_id road_event_source_info NA NA
feed_info_id road_event_source_info NA NA
source_organization road_event_source_info issuing_organization road_events
source_name road_event_source_info contact_name metadata
source_email road_event_source_info contact_email metadata
source_update_frequency* road_event_source_info datafeed_frequency_update metadata
source_update_datetime road_event_source_info feed_update_date road_event_feed_info
location_verify_method road_event_source_info location_verify_method metadata
wz_location_method road_event_source_info wz_location_method metadata
lrs_type road_event_source_info lrs_type metadata
lrs_url road_event_source_info lrs_url metadata
beginning_accuracy road_event_source_info beginning_accuracy road_events
ending_accuracy road_event_source_info ending_accuracy road_events
start_date_accuracy road_event_source_info start_date_accuracy road_events
end_date_accuracy road_event_source_info end_date_accuracy road_events
version road_event_source_info version road_event_feed_info

Updated Road Event object road_events

Field Name Data Type Description Conformance Notes
road_event_id ID A unique identifier issued by the data feed provider to identify the work zone project or activity Required Primary Key
source_info_id ID Identifies the source to which a road event is related. Required Foreign Key to road_event_source_info
subidentifier ID A unique identifier issued by data feed provider that provides additional references to project or activity Optional This identifier may be used in more than one feed as a reference to an agency project number or permit ID
geometry_type Enumeration: Multipoint or LineString May be represented as a linestring or a multipoint as defined in the GeoJSON specification. Required  
geometry Coordinate(s); Float A coordinate pair or an array of coordinates. In either case, the first coordinate is the beginning point and the last coordinate is the ending point of the road event Required Coordinate pairs and coordinate arrays are formatted according to the GeoJSON spec
road_name Text Publicly known name of the road on which the event occurs. Required  
road_number Text The road number designated by a jurisdiction such as a county, state or interstate Optional Examples I-5, VT 133
direction Enumeration; Text The digitization direction of the road that is impacted by the event. This value is based on the standard naming for US roadways and indicates the direction of the traffic flow regardless of the real heading angle. Required Example northbound (for I-5 North); See Direction Enumerated Type
beginning_cross_street Text Name or number of the nearest cross street along the roadway where the event begins Optional  
ending_cross_street Text Name or number of the nearest cross street along the roadway where the event ends Optional  
beginning_milepost Float The linear distance measured against a milepost marker along a roadway where the event begins Optional A milepost or mile marker is a surveyed distance posted along a roadway measuring the length (in miles or tenths of a mile) from the southwest to the northeast. These markers are typically notated on State and local government digital road networks. Provide link to description of milepost method in metadata file.
ending_milepost Float The linear distance measured against a milepost marker along a roadway where the event ends Optional A milepost or mile marker is a surveyed distance posted along a roadway measuring the length (in miles or tenths of a mile) from the southwest to the northeast. These markers are typically notated on State and local government digital road networks. Provide link to description of milepost method in metadata file.
start_date DateTime The UTC time and date when the event begins. Required All date/time formats shall use ISO 8601 Data elements and interchange formats – Information interchange. Example: 2016-11-03T19:37:00Z
end_date DateTime The UTC time and date when the event ends. Required All date/time formats shall use ISO 8601 Data elements and interchange formats – Information interchange. Example: 2016-11-03T19:37:00Z
event_status Enumeration; Text The status of the event Optional See Event Status Enumerated Type
total_num_lanes Integer The total number of lanes associated with the road segment designated by the event geometry Optional A segment is a part of a roadway in a single direction designated by the event geometry
vehicle_impact Enumeration; Text The impact to vehicular lanes along a single road in a single direction Required See Vehicle Impact Enumerated Type
workers_present Boolean A flag indicating that there are workers present in the event space Optional  
reduced_speed_limit Integer The reduced speed limit posted within the event space Optional  
restrictions Enumeration; Text Zero or more road restrictions, delimited by semicolons, that apply to the road segment associated with the work zone Optional These are included as flags rather than detailed restrictions. Detailed restrictions are coded to specific lanes in the lane_restrictions table. See Road Restriction Enumerated Type
description Text Short free text description of work zone Optional This will be populated with formal phrases in a later WZDx version
creation_date DateTime The UTC time and date when the activity or event was created Optional All date/time formats shall use ISO 8601 Data elements and interchange formats – Information interchange. Example: 2016-11-03T19:37:00Z
update_date DateTime The UTC time and date when the activity or event was updated Optional All date/time formats shall use ISO 8601 Data elements and interchange formats – Information interchange. Example: 2016-11-03T19:37:00Z
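
The foreign-key relationships in this draft (road_events to road_event_source_info to road_event_feed_info) can be sketched with a few dataclasses. This is a minimal illustration that carries only a handful of the proposed fields; the class names and `events_by_organization` are hypothetical and simply mirror the table names above.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class RoadEventSourceInfo:
    source_info_id: str
    feed_info_id: str  # foreign key to road_event_feed_info
    source_organization: str


@dataclass(frozen=True)
class RoadEvent:
    road_event_id: str
    source_info_id: str  # foreign key to road_event_source_info
    road_name: str


def events_by_organization(
    sources: Iterable[RoadEventSourceInfo],
    events: Iterable[RoadEvent],
) -> Dict[str, List[str]]:
    """Group road_event_id values by the organization that supplied them."""
    org_by_source = {s.source_info_id: s.source_organization for s in sources}
    grouped: Dict[str, List[str]] = {}
    for event in events:
        # A KeyError here means the event points at an unknown source.
        organization = org_by_source[event.source_info_id]
        grouped.setdefault(organization, []).append(event.road_event_id)
    return grouped
```

A state-level publisher aggregating county feeds would then carry one source record per county, and every road event would resolve to exactly one of them.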

sknick-iastate commented 4 years ago

A couple comments after just briefly looking through

DeraldDudley commented 4 years ago

A couple comments after just briefly looking through

  • I think version should stay in the road_event_feed_info table. I say this because would we expect that a single feed could have multiple different spec versions? From my understanding there would only be one road_event_feed_info, but there could be multiple road_event_source_info in a single feed.

A business rule is needed; Do we restrict feeds to a single version?

  • The beginning_accuracy, ending_accuracy, start_date_accuracy and end_date_accuracy all need to stay in the road_events table. These can vary from project to project. There would be one project within an agency that only has estimated coordinates while another project has been verified.

Got it. I'll fix it.

sknick-iastate commented 4 years ago

I don't think it hurts to discuss here whether a feed should only contain a single version, since that is what would cause this problem. I don't know what the appropriate answer is, so I'm open to others' thoughts.

lynnerandolph commented 4 years ago

mmm...my initial feeling is a feed should have just one version. I tried to then think, what if you're collecting feeds from various places into one, but even then, I feel the collecting piece should do a transform so all the feeds are then the same version?
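If the rule ends up being one version per feed, an aggregator would first need to know which sources to transform. A minimal sketch follows, assuming each source reports its version as a string; `sources_needing_transform` is a hypothetical helper, and the transform itself is out of scope here.

```python
from typing import Dict, List


def sources_needing_transform(source_versions: Dict[str, str], feed_version: str) -> List[str]:
    """Return the ids of sources whose spec version differs from the feed's.

    `source_versions` maps a source id to the version string it publishes.
    """
    return sorted(
        source_id
        for source_id, version in source_versions.items()
        if version != feed_version
    )
```

For example, with sources at "2.0", "1.1" and "2.0" and a feed version of "2.0", only the "1.1" source would be returned.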

DeraldDudley commented 4 years ago

Version 2 of the draft metadata solution accommodating multiple data sources per feed.

geojson example illustrating proposed metadata changes: https://gist.github.com/DeraldDudley/21f65d04dea437d76d1c593cc181644e

Updated Road Event Feed Info object road_event_feed_info

Proposed Field Proposed Table Original Field Original Table
feed_info_id road_event_feed_info feed_info_id road_event_feed_info
feed_publisher road_event_feed_info issuing_organization metadata
feed_contact_name road_event_feed_info contact_name metadata
feed_contact_email road_event_feed_info contact_email metadata
feed_update_frequency road_event_feed_info datafeed_frequency_update metadata
feed_update_date road_event_feed_info feed_update_date road_event_feed_info
version* road_event_feed_info version road_event_feed_info

New Road Event Source Info object road_event_source_info

Proposed Field Proposed Table Original Field Original Table
source_info_id road_event_source_info NA NA
feed_info_id road_event_feed_info NA NA
source_organization road_event_source_info issuing_organization road_events
source_contact_name road_event_source_info contact_name metadata
source_contact_email road_event_source_info contact_email metadata
source_update_frequency** road_event_source_info datafeed_frequency_update metadata
source_update_date road_event_source_info feed_update_date road_event_feed_info
location_verify_method road_event_source_info location_verify_method metadata
location_method road_event_source_info location_method metadata
lrs_type road_event_source_info lrs_type metadata
lrs_url road_event_source_info lrs_url metadata
version*** road_event_source_info version road_event_feed_info

* version will be placed in the road_event_feed_info table if a feed can only contain one version of the specification.
** Possible business rule: aggregate feeds must be updated as frequently as the source data is updated.
*** version will be placed in the road_event_source_info table if a feed can contain more than one version of the specification.
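
The possible business rule on update frequency could be checked along these lines. This is a sketch under two assumptions: frequencies are integer seconds between updates (so a smaller number means more frequent updates), and `sources_outpacing_feed` is a hypothetical name.

```python
from typing import Dict, List


def sources_outpacing_feed(
    feed_update_frequency: int,
    source_update_frequencies: Dict[str, int],
) -> List[str]:
    """Return the ids of sources that update more often than the aggregate feed.

    All values are assumed to be integer seconds between updates. Any source
    returned here indicates the aggregate feed would break the rule.
    """
    return sorted(
        source_id
        for source_id, seconds in source_update_frequencies.items()
        if seconds < feed_update_frequency
    )
```

An empty result means the aggregate feed is refreshed at least as often as every source it republishes.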

Updated Road Event object road_events

Field Name Data Type Description Conformance Notes
road_event_id ID A unique identifier issued by the data feed provider to identify the work zone project or activity Required Primary Key
source_info_id ID Identifies the source to which a road event is related. Required Foreign Key to road_event_source_info
subidentifier ID A unique identifier issued by data feed provider that provides additional references to project or activity Optional This identifier may be used in more than one feed as a reference to an agency project number or permit ID
geometry_type Enumeration: Multipoint or LineString May be represented as a linestring or a multipoint as defined in the GeoJSON specification. Required  
geometry Coordinate(s); Float A coordinate pair or an array of coordinates. In either case, the first coordinate is the beginning point and the last coordinate is the ending point of the road event Required Coordinate pairs and coordinate arrays are formatted according to the GeoJSON spec
road_name Text Publicly known name of the road on which the event occurs. Required  
road_number Text The road number designated by a jurisdiction such as a county, state or interstate Optional Examples I-5, VT 133
direction Enumeration; Text The digitization direction of the road that is impacted by the event. This value is based on the standard naming for US roadways and indicates the direction of the traffic flow regardless of the real heading angle. Required Example northbound (for I-5 North); See Direction Enumerated Type
beginning_cross_street Text Name or number of the nearest cross street along the roadway where the event begins Optional  
ending_cross_street Text Name or number of the nearest cross street along the roadway where the event ends Optional  
beginning_milepost Float The linear distance measured against a milepost marker along a roadway where the event begins Optional A milepost or mile marker is a surveyed distance posted along a roadway measuring the length (in miles or tenths of a mile) from the southwest to the northeast. These markers are typically notated on State and local government digital road networks. Provide link to description of milepost method in metadata file.
ending_milepost Float The linear distance measured against a milepost marker along a roadway where the event ends Optional A milepost or mile marker is a surveyed distance posted along a roadway measuring the length (in miles or tenths of a mile) from the southwest to the northeast. These markers are typically notated on State and local government digital road networks. Provide link to description of milepost method in metadata file.
beginning_accuracy Enumeration: Estimated or Verified Indicates how the beginning coordinate was defined. Required see Spatial Verification Enumerated Type
ending_accuracy Enumeration: Estimated or Verified Indicates how the ending coordinate was defined. Required see Spatial Verification Enumerated Type
start_date DateTime The UTC time and date when the event begins. Required All date/time formats shall use ISO 8601 Data elements and interchange formats – Information interchange. Example: 2016-11-03T19:37:00Z
end_date DateTime The UTC time and date when the event ends. Required All date/time formats shall use ISO 8601 Data elements and interchange formats – Information interchange. Example: 2016-11-03T19:37:00Z
start_date_accuracy Enumeration: Estimated or Verified A measure of how accurate the start Date Time is. Required see Time Verification Enumerated Type
end_date_accuracy Enumeration: Estimated or Verified A measure of how accurate the end Date Time is. Required see Time Verification Enumerated Type
event_status Enumeration; Text The status of the event Optional See Event Status Enumerated Type
total_num_lanes Integer The total number of lanes associated with the road segment designated by the event geometry Optional A segment is a part of a roadway in a single direction designated by the event geometry
vehicle_impact Enumeration; Text The impact to vehicular lanes along a single road in a single direction Required See Vehicle Impact Enumerated Type
workers_present Boolean A flag indicating that there are workers present in the event space Optional  
reduced_speed_limit Integer The reduced speed limit posted within the event space Optional  
restrictions Enumeration; Text Zero or more road restrictions, delimited by semicolons, that apply to the road segment associated with the work zone Optional These are included as flags rather than detailed restrictions. Detailed restrictions are coded to specific lanes in the lane_restrictions table. See Road Restriction Enumerated Type
description Text Short free text description of work zone Optional This will be populated with formal phrases in a later WZDx version
creation_date DateTime The UTC time and date when the activity or event was created Optional All date/time formats shall use ISO 8601 Data elements and interchange formats – Information interchange. Example: 2016-11-03T19:37:00Z
update_date DateTime The UTC time and date when the activity or event was updated Optional All date/time formats shall use ISO 8601 Data elements and interchange formats – Information interchange. Example: 2016-11-03T19:37:00Z
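
A few of the constraints in this table lend themselves to a quick consumer-side check. The sketch below makes several assumptions: a road event is a plain dict keyed by the field names above, the accuracy values are spelled exactly "Estimated" and "Verified", and timestamps use the exact form shown in the examples (ISO 8601 permits other forms that this parser would reject). The function names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List

# Assumed spelling of the enumerated values, taken from the table above.
ACCURACY_VALUES = {"Estimated", "Verified"}
ACCURACY_FIELDS = (
    "beginning_accuracy",
    "ending_accuracy",
    "start_date_accuracy",
    "end_date_accuracy",
)


def parse_utc(value: str) -> datetime:
    """Parse the UTC form used in the examples, e.g. 2016-11-03T19:37:00Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def split_restrictions(value: str) -> List[str]:
    """Split the semicolon-delimited restrictions field, dropping empty entries."""
    return [item.strip() for item in value.split(";") if item.strip()]


def check_road_event(event: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for field in ACCURACY_FIELDS:
        if event.get(field) not in ACCURACY_VALUES:
            problems.append(f"{field} must be Estimated or Verified")
    try:
        if parse_utc(event["start_date"]) > parse_utc(event["end_date"]):
            problems.append("start_date is after end_date")
    except (KeyError, ValueError):
        problems.append("start_date and end_date must be ISO 8601 UTC timestamps")
    return problems
```

Returning a list of problems instead of raising on the first one lets a producer see every issue with an event in a single pass.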

DeraldDudley commented 4 years ago

Draft Entity Relationship Diagram for Metadata https://github.com/DeraldDudley/wzd_geoJson/blob/master/road_event_erd_v3_metadata.jpg

j-d-b commented 4 years ago

Resolved in #117