Clarification on geometry requirements

j-d-b commented 3 years ago

Taken from Discussion https://github.com/usdot-jpo-ode/wzdx/discussions/193

Originally posted by **Dunge** August 9, 2021

## Summary

The current geometry requirements are vague. The spec only states that the geometry needs to conform to the GeoJSON specification, and describes it as:

> The geometry of the road event. The Geometry object's type property MUST be LineString or MultiPoint. LineString allows specifying the entire road event path and should be preferred. MultiPoint should be used when only the start and end coordinates are known.

Questions:

* Can an event be composed of a single point? i.e. we know the start, but not the end. Can we push the same coordinate twice in MultiPoint mode?
* Does a LineString event need to follow the road curvature, or can it be an approximation?

## Motivation

I will expand on the second question. When generating a LineString, we have a few options, especially in the context of automating WZDx generation based on device GPS positions:

1. Draw on the map (the best case, but requires manual entry).
2. Automate and receive the path from a mapping service, for example Google Maps, based on the GPS waypoints.
3. Automate and generate lines going from one GPS position to the next without following the road curvature.

We used the **second method** for a while, but found it too unstable, since it can lead to incorrect information in multiple situations:

* The GPS is imprecise and the position appears in the other direction of travel, or in a lane that runs near a highway exit.
* The mapping service detects a road closure and returns a path that tries to avoid it by taking a detour.
* The mapping service is not aware of a temporary road placement and still uses the normal road position.

This leads to obviously incorrect results like this:

![image](https://user-images.githubusercontent.com/6548184/128784092-1f138b20-10f2-498d-9400-63ca6dd4e3cb.png)

So this led us to change to a simpler and more direct solution, the **third method**: only draw lines from one validated GPS position to the next. In this case, as soon as the road has a bit of a curve, the lines "cut" across places where there is no road:

![image](https://user-images.githubusercontent.com/6548184/128784360-9b84d983-bae0-485d-8173-dc8ed4246bce.png)

I can see a similar representation in the TxDOT feed currently being published, where the work zone is on I-635 but the LineString representation goes straight between two points:

![image](https://user-images.githubusercontent.com/6548184/128784568-392e9ac7-563b-4567-b95b-88fdcf2ce998.png)

Is this an acceptable output? Of course, it depends on the feed consumers and where they plan to display that geometry information. In some situations it could be accepted, but I believe on a public website this wouldn't look good, and for an automated vehicle it could be dangerous.

## Proposed changes

Unfortunately, we are better off not publishing a path that can't be confirmed to be correct. Since a LineString that follows the road curvature is close to impossible to generate automatically, with 100% certainty, based solely on roadside device GPS positions, I believe MultiPoint would be a better fit in this case.

This leads to my proposed change: remove the line "MultiPoint should be used when only the start and end coordinates are known." from the description above. I would also remove the rule requiring a MultiPoint to be composed of only a pair of points, and would allow a single RoadEvent to be composed of as many MultiPoint points as needed, as long as they represent the same work zone event. The reason is that if you have a work zone with hundreds of sensor points but the same RoadEvent information for the whole work zone, it would be a bit irrational to split the work zone into hundreds of small events composed of start/stop pairs while the rest of the information remains the same and gets duplicated, leading to a huge output. It would be much better to have a single RoadEvent entry with the hundreds of points representing valid positions in the work zone, without any path between them.
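To make the two options concrete, here is a minimal Python sketch of the geometry each one would produce from the same device positions. The coordinates and the `to_geometry` helper are hypothetical and only for illustration; they are not taken from any feed or from the WZDx spec.

```python
import json

# Hypothetical validated device positions as (longitude, latitude),
# which is the coordinate order GeoJSON uses.
device_positions = [
    (-96.7700, 32.9100),
    (-96.7650, 32.9120),
    (-96.7600, 32.9125),
]

def to_geometry(positions, connect):
    """Return a GeoJSON geometry dict: LineString if connect, else MultiPoint."""
    return {
        "type": "LineString" if connect else "MultiPoint",
        "coordinates": [[lon, lat] for lon, lat in positions],
    }

# Third method: straight lines from one validated position to the next.
straight_lines = to_geometry(device_positions, connect=True)

# Proposed change: the same positions with no implied path between them.
points_only = to_geometry(device_positions, connect=False)

print(json.dumps(points_only))
```

The coordinate arrays are identical in both cases; only the `type` changes what a consumer is told about the space between the points.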

j-d-b commented 3 years ago

To add to the concerns, the RoadEvent's beginning_accuracy and ending_accuracy, which seek to indicate whether the start and end coordinates are "verified", only address the start and end, not the full geometry. Thus, if a producer currently provides a LineString RoadEvent where beginning_accuracy and ending_accuracy are verified but the LineString geometry is just a straight line between the start and end (not following the roadway, like the TxDOT example), consumers cannot know the accuracy of the LineString. As @Dunge says as well, the LineString cannot be accurate enough to be used directly by autonomous vehicles. And if it's just for being shown on a map, consumers will probably render it using their own techniques anyway rather than blindly display the LineString, especially since most consumers are mapping companies.

For making progress here, I think we need input from data consumers on what geometry they prefer:

* An approximate LineString (maybe with confirmed points and straight lines between); or
* MultiPoint with whatever density the producer has of confirmed points (even if just start/end)

And we need to discuss changes we can make even if we do not remove the LineString option from the WZDx spec. For example, we could simply change the geometry description to note:

* The producer can use MultiPoint with any number of known points (not just the start and end)
* LineString is not necessarily preferred

As for the single point event option, I have also heard interest in only specifying the start point; for example, some producers just have a smart arrow board at the start. Perhaps allowing the GeoJSON Point geometry would be a valuable addition, rather than indicating to use MultiPoint with the start/end at the same point (confusing) or estimating the end point. However, an estimated end point may be better than none at all (I want to hear more from members on this), which would be a counter-argument for allowing Point.

daylesworth commented 3 years ago

@j-d-b , thanks for raising this. As a WZDx consumer, I would like to see both LineString and MultiPoint supported and have the spec clarify the correct use of each, with no preference for one over the other. A LineString should contain enough points to show the road geometry and maybe we could provide some guidance on the number of recommended samples for typical road curvatures. I also agree that MultiPoint should allow any number of samples with the assumption that they're taken from sensors/measurements and don't necessarily show complete road geometry. I would render them differently based on the geometry type. I think MultiPoint also supports a single position (the GeoJSON spec says "array" without specifying a minimum size, and JSON arrays can contain single elements). It's not a problem to also support Point though, and maybe that's clearer.
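A rough Python sketch of how a consumer might check these rules follows. The two-position minimum for LineString comes from the GeoJSON spec (RFC 7946); treating a one-position MultiPoint as acceptable is an assumption based on the reading above, and `check_geometry` is a hypothetical helper, not part of any WZDx tooling.

```python
# Minimum position counts per geometry type.
# LineString >= 2 comes from the GeoJSON spec (RFC 7946).
# MultiPoint >= 1 is an assumption based on the reading discussed above.
MIN_POSITIONS = {"LineString": 2, "MultiPoint": 1}

def check_geometry(geometry):
    """Return a list of problems found in a road event geometry dict (empty if none)."""
    problems = []
    gtype = geometry.get("type")
    if gtype not in MIN_POSITIONS:
        return [f"type must be LineString or MultiPoint, got {gtype!r}"]
    coords = geometry.get("coordinates", [])
    if len(coords) < MIN_POSITIONS[gtype]:
        problems.append(f"{gtype} needs at least {MIN_POSITIONS[gtype]} position(s)")
    for i, pos in enumerate(coords):
        # A GeoJSON position is [longitude, latitude] with an optional altitude.
        if len(pos) < 2 or not (-180 <= pos[0] <= 180 and -90 <= pos[1] <= 90):
            problems.append(f"position {i} is not a valid [lon, lat] pair")
    return problems

# A work zone where only the start is known (hypothetical coordinates).
start_only = {"type": "MultiPoint", "coordinates": [[-96.77, 32.91]]}
print(check_geometry(start_only))  # prints []
```

If Point were added to the spec later, it would need its own branch, since its `coordinates` member is a single position rather than an array of positions.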

rdsheckler commented 3 years ago

I have a question about the LineString.

We are likely to start marking the placement of multiple cones or other delineators that mark the boundary of the work zone, something like every cone dropped or every third cone. In this case the locations will create a LineString, but even with perfect precision they will likely start on the right edge of a lane and move to the left as they encompass a lane.

Is there a concern for LineStrings that cross one or multiple lanes? I doubt anybody should believe in precision below 3m CEP but if we are talking about the future I thought I would ask.

j-d-b commented 3 years ago

@daylesworth great note regarding MultiPoint allowing a single point. Thus WZDx already technically supports only providing the start of a work zone, though that should be clarified.

Dunge commented 3 years ago

Thanks for the feedback.

You are right about beginning_accuracy and ending_accuracy. I wonder if it would be better to have a simple geometry_accuracy instead that represents the whole thing. In any case, I forgot about these fields' existence before asking this question, and they are directly related.

So the next step: clarify the spec's requirements for what is a "valid" geometry. It seems like we are in agreement to make the change so that MultiPoint can be composed of anywhere from 1 to an unlimited number of points instead of just pairs. In the case of LineString, it would be great to clarify what makes one valid or not (does it need to follow road curvature or not, and what is the required precision of the segment lengths and positions).

@rdsheckler : A road event is related to a single direction of a road and contains specifications for the different lanes in that direction. That's why I believe the geometry should not follow a specific lane, but just be in the center of that direction (as Google Maps does). But in your case of cones narrowing from one side to the other of a lane/direction, I don't know. That's part of the question of "is it valid or not". I would say it is, but only more input from feed consumers will confirm.

I have to say I'm a bit sad to abandon the LineString representation for my use case. That was the main "flashy" thing to show off when presenting the end result: you would have a line following the road all along the work zone. A list of points just doesn't have the same impact. So I'm asking the community if anyone has any hint or suggestion on how I could create a valid LineString automatically just from device positions. I know it's close to impossible when a road doesn't exist on any mapping service (unless we make someone drive through the work zone and record their path, but then it would probably be simpler to just draw it), but you never know.

j-d-b commented 3 years ago

> You are right about beginning_accuracy and ending_accuracy. I wonder if it would be better to have a simple geometry_accuracy instead that represents the whole thing. In any case, I forgot about these fields' existence before asking this question, and they are directly related.

I think it is important for the begin and end to be able to be "validated" separately, as often the start and end, or just the start, is the only "verified" location. I think there is value in having the ability to specify the accuracy of the full geometry though, if it is more than just MultiPoint with start/end points.

I think the above ties into #129.

DeraldDudley commented 3 years ago

I like the idea of reporting spatial accuracy in the RoadEventDataSource object. We could work with USDOT or FGDC spatial statisticians to develop a reporting mechanism that complies with spatial data accuracy standards.

E.g.: National Standard for Spatial Data Accuracy

rdsheckler commented 3 years ago

I think it would be good for everyone to see examples of what can be expected for geometric data in the coming five years. I think that everybody is familiar with entering data interactively, the way you submit a road closure to Waze or another system. However, through years of experience we know that only 5% of the work zones will ever be reported in an interactive process by the traffic centers; the remainder will be reported by automated systems that are operated by the work crews on-site.

The majority of work zone data that is being reported today is through autonomous tracking of equipment of varying types. I don't believe that many people who are participating in this process have an appreciation for the nature of that equipment and what can and cannot be learned through these methods.

natedeshmukhtowery commented 3 years ago

@Dunge I'm fairly sure, but not 100% certain, that an FHWA-developed open source Work Zone Data Collection tool - https://github.com/TonyEnglish/Work_Zone_Data_Collection_Toolset - automatically generates a LineString when outputting WZDx feeds as the user drives through a WZ. @TonyEnglish could confirm :)

DeraldDudley commented 3 years ago

In the end, what's most important is that WZs are located and described as accurately and clearly as possible. If we need to make the spec more flexible to reach those goals I'm all for it.

jacob6838 commented 3 years ago

This is my interpretation of the spec:

* There is no lane level geometry, so all data should only be interpreted longitudinally (along the roadway), not laterally (side to side or by lane). Points do not have to be in the center of the roadway.
* LineString should be a set of coordinates which show the path of the work zone and are by definition connected by a line. From Wikipedia: "a curve specified by the sequence of points".
* MultiPoint is a collection of points not necessarily connected by lines. By some interpretations a MultiPoint does not have a defined order.

What we have done (Work Zone Data Collection Tool):

1. Collect breadcrumbs while driving through the work zone (1 Hz or 10 Hz) with markers (lane 1 closed, workers present, ...) using GPS (1-5 meter accuracy)
2. Process/compress breadcrumbs to reduce the number of points (J2945/1) based on a maximum error of 1 meter
3. Generate a WZDx message from the processed breadcrumbs as a LineString
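As an illustration of the compression step, here is a minimal Python sketch that reduces breadcrumbs under a 1 meter maximum error. Note that it uses the Ramer-Douglas-Peucker algorithm, swapped in because it is widely known; it is not the J2945/1 method referenced above and not the tool's actual code. The function names and the flat-earth projection are assumptions that are only reasonable over work-zone-sized distances.

```python
import math

EARTH_RADIUS_M = 6371000.0

def to_local_meters(points):
    """Project (lon, lat) degrees to x/y meters around the first point (equirectangular)."""
    lon0, lat0 = points[0]
    k = math.cos(math.radians(lat0))
    return [
        (math.radians(lon - lon0) * EARTH_RADIUS_M * k,
         math.radians(lat - lat0) * EARTH_RADIUS_M)
        for lon, lat in points
    ]

def point_segment_distance(p, a, b):
    """Distance in meters from p to the segment a-b (all in local x/y meters)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Clamp the projection so the closest point stays on the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def simplify(points, max_error_m=1.0):
    """Return a subset of points; each dropped point lies within max_error_m of the kept path."""
    if len(points) < 3:
        return list(points)
    xy = to_local_meters(points)
    keep = [False] * len(points)
    keep[0] = keep[-1] = True
    stack = [(0, len(points) - 1)]
    while stack:
        first, last = stack.pop()
        worst, worst_i = 0.0, None
        for i in range(first + 1, last):
            d = point_segment_distance(xy[i], xy[first], xy[last])
            if d > worst:
                worst, worst_i = d, i
        # Keep the farthest point if it exceeds the tolerance, then recurse on both halves.
        if worst_i is not None and worst > max_error_m:
            keep[worst_i] = True
            stack.append((first, worst_i))
            stack.append((worst_i, last))
    return [p for p, kept in zip(points, keep) if kept]
```

Breadcrumbs on a straight stretch collapse to their two end points, while a point on a bend survives whenever dropping it would move the path by more than the tolerance. Order is preserved, so the result can go straight into a LineString.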

sergebeaudry commented 3 years ago

I'm reading that we have multiple data sources today and each has pros and cons.

* LineString: the ultimate level, but today it is labor intensive, as it requires human resources to make it perfect: drawing on a map, a local survey, driving the job with computers/cameras, or a drone that flies over and digitizes. Some jobs will use it; I don't believe ALL jobs can afford this.
* MultiPoint: could bring some automation and could range from as simple as start only or start-end to more precise with lane closures, etc. Some consumers would want to know more than start and end.

Over time, the MultiPoint will become a LineString once enough points can be generated. Here is a potential opportunity for mapping companies to convert a few MultiPoint positions into a LineString that reflects the work zone reality.

Those two exist today and could be published. As part of baby steps, I would propose to permit MultiPoint to support more than start and end, and to clearly define LineString versus MultiPoint in the spec.

sknick-iastate commented 3 years ago

Good discussion so far. I'm not sure if it helps but for those who weren't at the early meetings of the v2.0 update, Seattle DOT and Google gave a presentation of how they are using geometry information to snap to their network that is available here. It is only around 10 minutes and at the beginning of the meeting.

I agree with @jacob6838 that at this point the WZDx is intended for longitudinal representation with the lane information contained within the feed. Based on feedback from consumers in the past, the begin and end coordinates were the critical points they were interested in which is why those fields have the accuracy indicators. Along those lines, we avoided points (as well as polygons) as a geometry type based on feedback from consumers early on about the importance of the end point as well as having multiple points to snap to a network (discussed in the recording above). Having just a single point makes it more difficult to snap to a network accurately.

I'm interested in hearing more thoughts on the MultiPoint vs. LineString discussion. To steal an image from @sergebeaudry, we plan on following a similar approach as shown here, where planned work zones that have their geometry snapped to our network are used as a base and then supplemented with field devices that show the begin and end (at a minimum).

j-d-b commented 3 years ago

@sknick-iastate thanks for linking to the recording, it is extremely applicable and indicates why LineString is preferred.

Quotes from Sandra at Google from the recording:

> Snapping is still incredible challenging

> We need the detailed polylines in each direction of travel

And a couple of screenshots from the presentation:

Noted takeaway points:

* Don't use a polyline with only start/end and a long straight line between that doesn't follow the roadway at all; it is too unclear what the closure is referring to.
* A LineString with some confirmed points is ideal and would be preferred to the option of MultiPoint with more than just the start and end, even if it doesn't exactly follow the curvature of the roadway, as the LineString indicates the direction/sequence over MultiPoint which is just a collection of points (not necessarily a sequence, as far as I could tell from the GeoJSON spec).

rdsheckler commented 3 years ago

Our experience with Google is that a very detailed LineString is needed, because if you don't match the points on a curve that they use, then your tangents don't match theirs and the Google process doesn't work. Interestingly, Waze doesn't have nearly the trouble.

As an example, we had a project on a five-mile curve through the desert with no other road for miles. We had six locations on the curve. Because the tangents at the locations that we had didn't match the Google tangents, Google couldn't associate our locations with the only road around. I imagine if we had locations every 100 ft along the road it would have worked, but that is beyond the interest of the road crew.

If we are building a long term process for a large percentage of work zones we need to establish a practice that can be accomplished with the skills of the people in the field. Technology and good coding need to compensate for realities on the ground.
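One way a producer could spot the tangent problem described above before publishing is to measure how sharply the path turns at each vertex. The following Python sketch does that with the standard initial-bearing formula. The 10 degree threshold and the function names are hypothetical; nothing here reflects how Google or Waze actually match geometry.

```python
import math

def bearing_deg(a, b):
    """Initial great-circle bearing from a to b, in degrees clockwise from north.

    Points are (lon, lat) in degrees.
    """
    lon1, lat1, lon2, lat2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    dlon = lon2 - lon1
    y = math.sin(dlon) * math.cos(lat2)
    x = math.cos(lat1) * math.sin(lat2) - math.sin(lat1) * math.cos(lat2) * math.cos(dlon)
    return math.degrees(math.atan2(y, x)) % 360.0

def heading_changes(points):
    """Absolute heading change in degrees at each interior vertex of a path."""
    changes = []
    for prev, cur, nxt in zip(points, points[1:], points[2:]):
        delta = bearing_deg(cur, nxt) - bearing_deg(prev, cur)
        # Normalize to [-180, 180) before taking the magnitude.
        changes.append(abs((delta + 180.0) % 360.0 - 180.0))
    return changes

def sparse_vertices(points, max_change_deg=10.0):
    """Indices of vertices where the path turns more than max_change_deg (hypothetical threshold)."""
    return [i + 1 for i, c in enumerate(heading_changes(points)) if c > max_change_deg]
```

A vertex flagged by `sparse_vertices` is a place where the straight segments on either side diverge noticeably from each other, which suggests the curve there is undersampled and more locations would help a consumer match it.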

rdsheckler commented 3 years ago

I would like to have a live conversation about this topic. I think there is a compromise between what Google wants for snapping and a multipoint but some of it revolves around issues associated with cross-streets and entry ramps. Can somebody tell me how we are handling the non-primary routes that have geometric data coming from them?

j-d-b commented 1 year ago

Summarized guidance on using geometry based on co-chairs discussion 2022-10-07:

* The order of coordinates is meaningful: the first coordinate is the first (furthest upstream) point a traveler encounters when traveling through the road event.
* If a data producer has three or more coordinates that are on the road event path, LineString should be used because it indicates the points are ordered.
* Use a higher density of points for sections of roadways with curves so a data consumer can more easily match to their road network.
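The first two points of this guidance could be sketched in Python roughly as follows. `build_geometry` is a hypothetical helper, and falling back to MultiPoint for fewer than three positions is an assumption, since the guidance above only covers the three-or-more case.

```python
def build_geometry(ordered_positions):
    """Pick a geometry type for positions ordered from upstream to downstream.

    ordered_positions: list of (lon, lat) tuples, first item furthest upstream.
    """
    if not ordered_positions:
        raise ValueError("at least one position is required")
    # Three or more on-path positions: LineString, per the guidance above.
    # Fewer than three: MultiPoint (an assumption; the guidance only covers 3+).
    gtype = "LineString" if len(ordered_positions) >= 3 else "MultiPoint"
    return {
        "type": gtype,
        "coordinates": [[lon, lat] for lon, lat in ordered_positions],
    }

# Hypothetical positions, listed in the order a traveler would encounter them.
geometry = build_geometry([(-96.7700, 32.9100), (-96.7650, 32.9120), (-96.7600, 32.9125)])
print(geometry["type"])  # prints LineString
```

The caller is responsible for the ordering; the function keeps whatever order it is given so that the first coordinate stays the furthest upstream point.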

j-d-b commented 1 year ago

Updates as summarized above were implemented for v4.2.

usdot-jpo-ode / wzdx

Clarification on geometry requirements #194