Store SpatialRecord in Section Polygons

zwing99 commented 2 years ago

I humbly propose we consider changing how Spatial Records are stored in ADAPT. Out of tradition, we have stored these as sets of data correlated to offsets from the GPS sensor reading. So we only keep the GPS sensor at the sampling frequency and then use the offsets to calculate where to draw the "point" and then use additional metadata like sensor width and speed to construct a "covering" polygon for that sensor reading. This way of storing things has the benefit of being very space efficient and allowing the reader to correct errors in producing that data, as others like @strhea have pointed out to me.

That said, I think the benefits do not out weight the costs. First and foremost, this is "tricky" math whose interpretation of how to do it can be both OEM and FIS-specific. This data storage technique leads to many conditions in the code that adapt processes from different OEMs. Secondly, it leaves little accountability to who is right in interpreting the data when the standard is inconsistently filled.

I propose we take a leaf (pun intended) from Deere and us at Corteva, who have internally and independently developed a very similar format to Deere, and store the Spatial Records as section polygons. These "new" Spatial Records would have the benefit of being transparent. What I mean to say is that there is no interpretation error, and the polygon drawn covers the appropriate area. If there is a gap between polygons or overlap, that should be considered "correct" or "mishandled" by the OEM, but not something responsible for the consuming party to correct. While this might feel like a "loss of control," it enables an environment of clear accountability to the ADAPT standard. If it is not right, it is not right and should be obvious when the polygons are drawn in GIS tooling. The other significant advantage of this is that it lowers the bar of entry for FIS systems to become users of machine data, which in turn creates healthy economic competition to deliver the value that your FIS system can do with good data, not whether or not it can read the data correctly.

I expect this proposal will ruffle some feathers in with folks, and I hope I can spark a lively discussion by prosing it. I am not 100% sold on it, but after careful consideration, I think it would be a monumental step forward for ADAPT and FIS systems to share data.

knelson-farmbeltnorth commented 2 years ago

Thanks @zwing99 for writing up the proposal. As we've discussed, I think it has lots of merits. Adding some notes for discussion in no particular order.

-Foremost, it removes a lot of potential for variation in data modeling. One of the largest challenges with ADAPT today (following from ISOXML) is that it provides the data provider multiple ways to model the data in the interest of staying true to the source. The net effect is that the burden of transforming the data falls on the data consumer, who still needs to anticipate and handle all the variations, often with conditional logic based on data provider. There is much less burden all around if it is the data provider who makes transformation decisions with their own data.

-Removing much of the implement modeling from ADAPT removes ambiguity about the use and purpose of ADAPT vs. ISO11783-10. We've always stated that ADAPT is an FMIS-centric model vs. ISO11783 as a machine-centric model, but we've maintained a lot of the machine modeling due to the state of machine data in the early days of ADAPT.

-As we were spinning up this standardization effort, I recall making the point that ADAPT in its current form risked obsolescence from the processed formats that OEMs were beginning to serve from their cloud APIs. One key goal of the serialization effort was to be to try to head off a dozen different formats of processed data. While my initial thinking was that we would allow modeling processed polygons by simply stubbing out a machine and providing a polygon instead of a point (SpatialRecord has a polymorphic Geometry property), removing the variability makes things cleaner.

-The impact to storage size here is significant. This is probably the biggest challenge in making this change. Ever-increasing file size, after all, is one of problems we were trying to solve.

zwing99 commented 2 years ago

GREAT summary @knelson-farmbeltnorth of concerns with this proposal! One thing to consider on the last concern of space is that we have options to "compact" the data for transfer. They WILL NEVER be as ideal as the current ISO and ADAPT model but I think they can fill the gap enough to gain the benefits that this proposal receives. I have no hard evidence for this statement but should be testable.