usdot-jpo-ode / wzdx

The Work Zone Data Exchange (WZDx) Specification aims to make harmonized work zone data provided by infrastructure owners and operators (IOOs) available for third party use, making travel on public roads safer and more efficient through ubiquitous access to data on work zone activity.
Creative Commons Zero v1.0 Universal

Timestamps for Field Devices #253

Open sknick-iastate opened 2 years ago

sknick-iastate commented 2 years ago

Issue name: “Additional Timestamps for Field Devices”

Summary

The current version of the SwzDeviceFeed has only a single update_date, located in the FieldDeviceCoreDetails object. The description of this field is "The UTC time and date when the field device information was updated." There have been previous conversations about adding timestamps that indicate when specific attributes, such as the GPS location, were updated.

In Iowa's smart arrow board protocol there were three primary timestamps: lastContact, gps.tried, and gps.sampled. We use these timestamps for troubleshooting and identifying issues with the arrow boards, and they helped during initial testing. For example, if gps.tried is later than gps.sampled, then we know there is potentially an issue with the GPS on the device, which may result in stale data. Keeping these separate lets us still know the time of the GPS coordinate while also knowing the device is still trying to get a GPS fix. The lastContact can also be used to identify how stale any of the arrow board's data is, separately from the GPS. For example, the battery levels may update, which will change the last contact, but that doesn’t mean the GPS had to update at the same time. I’m not sure all of the timestamps are necessary; it depends on what level of detail the end users want from the data.
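As a rough illustration of the check described above, here is a minimal Python sketch. The function names and the `tolerance` parameter are hypothetical and not part of Iowa's protocol or the WZDx specification; only the idea of comparing gps.tried, gps.sampled, and lastContact comes from the description.

```python
from datetime import datetime, timedelta, timezone


def gps_looks_stale(gps_tried: datetime, gps_sampled: datetime,
                    tolerance: timedelta = timedelta(0)) -> bool:
    """Return True when the last GPS attempt is later than the last good sample.

    `tolerance` is an assumed knob for ignoring small gaps between the two.
    """
    return gps_tried - gps_sampled > tolerance


def data_age(last_contact: datetime, now: datetime) -> timedelta:
    """How long ago the device last reported anything, independent of the GPS."""
    return now - last_contact


# Hypothetical readings: the device keeps trying, but the last good sample is older.
t0 = datetime(2023, 1, 1, 12, 0, tzinfo=timezone.utc)
stale = gps_looks_stale(gps_tried=t0 + timedelta(minutes=5), gps_sampled=t0)
age = data_age(last_contact=t0 + timedelta(minutes=5), now=t0 + timedelta(minutes=6))
```

Here the device was in contact one minute ago, so its battery or status data may be current even though the GPS coordinate is flagged as potentially stale.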

At this point, it would be good to hear others' opinions on whether additional timestamps, such as the GPS timestamp, would be beneficial for the SwzDeviceFeed and whether data consumers need them.

Dunge commented 2 years ago

This is a double-edged sword. On one hand, it is relevant information that could be helpful. On the other hand, it could hurt the adoption rate if the schema becomes too complex: some producers won't be able to fill in that information, and I predict most consumers would only want a single timestamp to know whether the device data is relevant, without having to dig deep and juggle too many fields.

This is not limited to GPS. Every single data point can have its own date of last value that is independent of the last communication, whether it is VMS messages, traffic sensor speeds, camera snapshots, arrow board patterns, etc. So each of these fields could theoretically become an object containing the last update date and the value.

I see some additional value in doing this (as mentioned above, a device can communicate but have reading errors on its internal sensors, so the data is stale). But I also think this information remains mostly useful for technical diagnostics on the equipment provider's side, making sure the equipment is functional in their own system, and is not necessarily something that needs to be exposed and distributed to the public.

A simpler solution would be to use the existing device_status/status_messages fields to report a "GPS Error" or "Sensor failure" status when the data is stale, instead of the consumer having to do validation based on the datetime for every value.
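To make that alternative concrete, here is a minimal producer-side sketch in Python. The function name, its parameters, and the `max_gap` threshold are assumptions for illustration; only the "GPS Error" message text and the idea of reporting it through status_messages come from the suggestion above.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional


def derive_status_messages(gps_tried: Optional[datetime],
                           gps_sampled: Optional[datetime],
                           max_gap: timedelta = timedelta(minutes=10)) -> List[str]:
    """Turn the producer's internal GPS timestamps into status messages.

    The consumer then reads the messages and never sees the raw timestamps.
    `max_gap` is an assumed threshold, not a value from the specification.
    """
    messages: List[str] = []
    # Flag the GPS only when the device has tried and has no recent good sample.
    if gps_tried is not None and (gps_sampled is None
                                  or gps_tried - gps_sampled > max_gap):
        messages.append("GPS Error")
    return messages


t0 = datetime(2023, 1, 1, 12, 0, tzinfo=timezone.utc)
failing = derive_status_messages(t0 + timedelta(minutes=30), t0)
healthy = derive_status_messages(t0 + timedelta(minutes=1), t0)
```

The trade-off is that the producer's threshold decides what counts as stale, rather than each consumer applying its own.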

rdsheckler commented 2 years ago

While we have used the approach Skylar has described for many years, we do need to leave room for these fields to be populated with the answer 'UNKNOWN' so that more players can participate.

Having a known, recent confirmation of the status of the equipment and of its location provides a measure of the reliability of the information. So, if the 'status' was updated 30 seconds ago and the GPS was calculated two minutes ago and reported 10 seconds ago, we can determine that this is more reliable than if any of those times were 'UNKNOWN'. This allows data consumers to make their own determination of the value of the data. If 'UNKNOWN' reporting times are viewed as unreliable, there will be less value in that equipment and the market will push the data provider to upgrade their equipment.
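A consumer-side version of that determination might look like the following Python sketch. It models 'UNKNOWN' as `None`, and the dictionary keys, function names, and `max_age` threshold are all hypothetical; the 30 second, two minute, and 10 second figures are the ones from the example above.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


def timestamp_ages(now: datetime,
                   timestamps: Dict[str, Optional[datetime]]
                   ) -> Dict[str, Optional[timedelta]]:
    """Age of each timestamp; None stands in for an 'UNKNOWN' value."""
    return {name: (None if ts is None else now - ts)
            for name, ts in timestamps.items()}


def all_known_and_fresh(ages: Dict[str, Optional[timedelta]],
                        max_age: timedelta) -> bool:
    """One possible consumer policy: trust the record only if every age is known and recent."""
    return all(age is not None and age <= max_age for age in ages.values())


now = datetime(2023, 1, 1, 12, 0, tzinfo=timezone.utc)
ages = timestamp_ages(now, {
    "status_updated": now - timedelta(seconds=30),
    "gps_calculated": now - timedelta(minutes=2),
    "gps_reported": now - timedelta(seconds=10),
})
reliable = all_known_and_fresh(ages, max_age=timedelta(minutes=5))
```

A different consumer could choose a stricter or looser `max_age`, or accept unknown values, which is the point of exposing the times rather than a single verdict.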

benafischer94 commented 2 years ago

I see more issues with allowing a text string in a timestamp field than with defining in the accompanying documentation that a provider may use the same timestamp for all timestamp fields if they cannot provide the resolution.

I do agree that leaving room for the provider to do some sane fallbacks makes sense, just not at the expense of overcomplicating the consumer model. GPS processing in particular tends to be difficult because of the way the message queue dump gets handled by different manufacturers on data loss. Including, at a minimum, a feed_timestamp for when the feed was created, a gps_timestamp that corresponds to the GPS sentence, and a record_creation (or similarly named) field for when the processing system that publishes the feed received the message should cover the majority of cases.
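The fallback idea could be sketched like this in Python. The field names feed_timestamp, gps_timestamp, and record_creation are the ones proposed above; the function itself and its behavior are an assumption about how a provider might fill them, not anything defined by the specification.

```python
from datetime import datetime, timezone
from typing import Dict, Optional


def with_fallbacks(record_creation: datetime,
                   gps_timestamp: Optional[datetime] = None,
                   feed_timestamp: Optional[datetime] = None) -> Dict[str, datetime]:
    """Fill every timestamp field, reusing record_creation where finer detail is missing.

    Every field stays a real timestamp, so consumers never have to parse a
    text value such as 'UNKNOWN'.
    """
    return {
        "record_creation": record_creation,
        "gps_timestamp": gps_timestamp if gps_timestamp is not None else record_creation,
        "feed_timestamp": feed_timestamp if feed_timestamp is not None else record_creation,
    }


received = datetime(2023, 1, 1, 12, 0, 5, tzinfo=timezone.utc)
gps_fix = datetime(2023, 1, 1, 11, 58, 0, tzinfo=timezone.utc)

detailed = with_fallbacks(received, gps_timestamp=gps_fix)
coarse = with_fallbacks(received)  # provider cannot supply a separate GPS time
```

With this approach a consumer cannot tell a repeated fallback value from a genuinely identical timestamp, so the fallback rule would need to be stated in the accompanying documentation.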

j-d-b commented 1 year ago

I was wondering how a consumer of the device feed benefits from the GPS timestamp (location update date). Currently, with a single update date for each field device, a consumer can check the one date and then choose to update their displayed/stored version of the device from the feed if it is newer than the previous update date they had stored. With that process, an additional timestamp for GPS has minimal value.
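For reference, the single-date update process described above could be sketched in Python as follows. The flat dictionaries keyed by device id and the already-parsed `update_date` values are simplifying assumptions; an actual feed nests this property inside each device's core details and carries it as a date-time string.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List

Device = Dict[str, Any]


def merge_feed(stored: Dict[str, Device], incoming: Dict[str, Device]) -> List[str]:
    """Replace stored devices whose incoming update_date is newer.

    Returns the ids of the devices that were added or replaced.
    """
    updated: List[str] = []
    for device_id, device in incoming.items():
        previous = stored.get(device_id)
        if previous is None or device["update_date"] > previous["update_date"]:
            stored[device_id] = device
            updated.append(device_id)
    return updated


t1 = datetime(2023, 1, 1, 12, 0, tzinfo=timezone.utc)
t2 = datetime(2023, 1, 1, 12, 5, tzinfo=timezone.utc)

stored = {
    "arrow-1": {"update_date": t1, "pattern": "right-arrow-static"},
    "arrow-2": {"update_date": t2, "pattern": "blank"},
}
incoming = {
    "arrow-1": {"update_date": t2, "pattern": "blank"},               # newer
    "arrow-2": {"update_date": t1, "pattern": "right-arrow-static"},  # older
}
changed = merge_feed(stored, incoming)
```

Only the device with the newer date is replaced; the whole device record is swapped at once, which is why a separate GPS date adds little to this particular workflow.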

On the other hand, if the GPS unit on the device typically has its own timestamp and sends it with each data message, and device feed producers that communicate with the devices are already handling that data, it makes sense to allow including it in the feed so consumers can do what they want with it.

Personally, I would lean towards no change, meaning no additional timestamps (just the single update_date property), until there is a clear consumer use for them, as it is simpler.

rdsheckler commented 1 year ago

I think it would be worth hearing from some consumers. But, from our experience, in terms of real-time results, we can get the status of the device in seconds, but good locations with low standard deviations can take at least several minutes and may be updated through error correction over a period of 30 minutes or so. As a result, we update the location on its own schedule, separate from status changes. We use it for troubleshooting and error correction, and maybe the users won't care, but we are striving for sub-3 m accurate locations, and that can take a couple of constellations to achieve.