Open cholmes opened 7 months ago
Good point. Also, someone asked today in the demo session whether we validate the geometries. Some more details on valid geometries would be good.
Yeah, probably should be clear in the spec exactly what we consider 'valid' / not valid. Also wonder if we should restrict the geometry type - no points or lines. Curious if people have opinions about multi-polygons - is it one field if it's two polygons? Or should we try to make sure it's one field per row.
Curious if people have opinions about multi-polygons - is it one field if it's two polygons?
Two considerations regarding this question, hoping to add context to the discussion:
JRC has a detailed opinion of how an agricultural field (Feature of Interest, FOI) should be defined in the CAP Area Monitoring System:
In the CbM system, the FOI has a digital representation, derived from the GSAA in combination with LPIS and uses a polygon as geometric primitive. [...] Identification of an individual FOI allows to aggregate smaller adjacent surfaces under a single use (i.e. same farmer or same crop for the year) into a single feature of interest
Source: Chapter 3.1.2, https://wikis.ec.europa.eu/download/attachments/86968800/JRC127678_final.pdf?version=1&modificationDate=1682601334749&api=v2
My conclusion is that the theory mandates polygons (single-part), but in practice, either polygons or multi-polygons may be appropriate for a given data exchange use-case.
Thanks for the insights @StefanBrand. It also leads directly to this question again: https://github.com/fiboa/specification/issues/27
I was wondering: Would it be meaningful to split the MultiPolygons that you get into single Polygons? Who assigns the IDs, i.e. is it possible to have unique IDs per Polygon after the split? Do they regularly have different crops planted? ...
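To make the splitting question concrete, here is a minimal sketch that turns one GeoJSON MultiPolygon feature into several Polygon features. The ID scheme (parent ID plus part index) and the function name `split_multipolygon` are hypothetical, only meant to illustrate that new IDs have to come from somewhere:

```python
def split_multipolygon(feature):
    """Split a GeoJSON MultiPolygon feature into single-Polygon features.

    Features with any other geometry type are returned unchanged.
    """
    geom = feature["geometry"]
    if geom["type"] != "MultiPolygon":
        return [feature]

    parts = []
    for i, polygon_coords in enumerate(geom["coordinates"]):
        parts.append({
            "type": "Feature",
            # Hypothetical ID scheme: parent ID plus the part index.
            "id": f"{feature['id']}-{i}",
            # Properties are copied as-is; per-field values such as an
            # area would no longer match the smaller geometry.
            "properties": dict(feature.get("properties", {})),
            "geometry": {"type": "Polygon", "coordinates": polygon_coords},
        })
    return parts
```

Note that the copied properties become stale after the split, so whoever assigns the new IDs would probably also need to recompute any per-field measurements.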
Another potential check that could be good to do is whether the 'area' and 'perimeter' (if declared) correspond to the values computed from the geometry.
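A rough sketch of such a check with shapely. It rests on several assumptions that are not settled in this thread: the geometry is already in a projected CRS with metre units, 'area' is declared in hectares, 'perimeter' is declared in metres, and a 1% relative tolerance is acceptable. The function name `check_measurements` is made up for illustration:

```python
from shapely.geometry import Polygon


def check_measurements(geom, area_ha=None, perimeter_m=None, rel_tol=0.01):
    """Compare declared area/perimeter with values computed from the geometry.

    Assumes `geom` uses a projected CRS in metres. Returns a list of
    human-readable problems; an empty list means nothing was flagged.
    """
    problems = []
    if area_ha is not None:
        computed_ha = geom.area / 10_000  # square metres to hectares
        if abs(computed_ha - area_ha) > rel_tol * max(computed_ha, area_ha):
            problems.append(f"area: declared {area_ha} ha, computed {computed_ha:.4f} ha")
    if perimeter_m is not None:
        computed_m = geom.length  # for polygons, length is the perimeter
        if abs(computed_m - perimeter_m) > rel_tol * max(computed_m, perimeter_m):
            problems.append(f"perimeter: declared {perimeter_m} m, computed {computed_m:.2f} m")
    return problems


# A 200 m x 100 m rectangle: 2 ha area, 600 m perimeter.
field = Polygon([(0, 0), (200, 0), (200, 100), (0, 100)])
print(check_measurements(field, area_ha=2.0, perimeter_m=600.0))
print(check_measurements(field, area_ha=3.0, perimeter_m=600.0))
```

For data stored in geographic coordinates, the geometry would first have to be reprojected (or measured geodesically), which this sketch leaves out.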
The first question to answer: Should we disallow Points, LineStrings and their Multi-equivalents? Currently the schema allows them...
I lean towards yes, disallowing. I think some eudr people might eventually make the case for points, but I just sorta think it's not a 'boundary'. But open to someone articulating a use case before 1.0 and we can add things back.
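If Points and LineStrings were disallowed, the corresponding check could be as small as the sketch below. It works on plain GeoJSON geometry objects; the allowed set (including whether MultiPolygon stays in it) is an assumption, since that question is still open here, and `check_geometry_type` is a hypothetical helper name:

```python
# Assumed allow-list; whether MultiPolygon belongs here is still under discussion.
ALLOWED_TYPES = {"Polygon", "MultiPolygon"}


def check_geometry_type(geometry):
    """Return an error message for a disallowed GeoJSON geometry type, else None."""
    geom_type = geometry.get("type")
    if geom_type in ALLOWED_TYPES:
        return None
    return (
        f"geometry type {geom_type!r} is not allowed; "
        f"expected one of {sorted(ALLOWED_TYPES)}"
    )
```

Adding a type back before 1.0 would then only mean extending the allow-list.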
Okay, there are a couple of things to define here:
More schematically:
shapely.is_valid
More semantically:
With regard to validation, some of the checks make the validation much slower (e.g. the is_valid check on the geometries slows down validation by roughly a factor of 2). So they should likely be somewhat opt-in/out.
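One way the opt-in could look, as a sketch only: the cheap checks always run, and the expensive validity check runs only when requested. The function name `validate_geometry` and the `check_validity` flag are hypothetical and not part of any existing validator; `is_valid` and `explain_validity` are regular shapely features:

```python
from shapely.geometry import Polygon
from shapely.validation import explain_validity


def validate_geometry(geom, check_validity=False):
    """Run cheap checks always; run the slower validity check only on request."""
    errors = []
    if geom.is_empty:
        errors.append("geometry is empty")
    # Opt-in: the validity test is the expensive part of the validation.
    if check_validity and not geom.is_valid:
        errors.append(explain_validity(geom))
    return errors


# A self-intersecting "bowtie" ring, which is not a valid polygon.
bowtie = Polygon([(0, 0), (1, 1), (1, 0), (0, 1)])
print(validate_geometry(bowtie))                       # cheap checks only
print(validate_geometry(bowtie, check_validity=True))  # includes validity check
```

Whether the default should be on or off is a separate question; the flag only shows that the cost can be made a user choice.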
We didn't get to this in the fiboa Semantics call, will need another meeting. @cholmes will set up a new meeting, e.g. bi-weekly around the general topic of fiboa/field boundary semantics.
An interesting idea from an email discussion - should we check to make sure that there are no overlapping geometries? Like make it a requirement that any fiboa collection should have 0 overlapping geometries?
Seems like it'd be useful for users. And you could see the cases where people try to merge two fiboa datasets, that it'd be nice to have an automatic check that would flag that there are overlaps, and it's not truly 'valid' until they are resolved. We could then put that into the validators, and the end result would likely be higher quality field boundary data. I could also see it being more of a 'warning' in a validator.
There are likely some good arguments for not doing increased checks like this, especially as hard requirements, but wanted to open the conversation. There also are probably some other geometry checks that might make sense.
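For illustration, a naive version of the overlap check with shapely. It compares every pair of geometries, so it is only a sketch for small collections; a real validator would likely need a spatial index. Treating fields that merely share a border as non-overlapping, the `min_area` threshold, and the name `find_overlaps` are all assumptions:

```python
from itertools import combinations

from shapely.geometry import box


def find_overlaps(geoms, min_area=0.0):
    """Return index pairs of geometries whose interiors overlap.

    Pairs that only touch along an edge or at a point have an
    intersection area of zero and are not reported.
    """
    overlaps = []
    for (i, a), (j, b) in combinations(enumerate(geoms), 2):
        if a.intersects(b) and a.intersection(b).area > min_area:
            overlaps.append((i, j))
    return overlaps


fields = [
    box(0, 0, 10, 10),
    box(10, 0, 20, 10),  # shares only an edge with the first field
    box(5, 5, 15, 15),   # overlaps both other fields
]
print(find_overlaps(fields))  # [(0, 2), (1, 2)]
```

A validator could report the returned pairs either as errors or as warnings, depending on how strict the requirement ends up being. The sketch assumes the input geometries are themselves valid, since intersecting invalid geometries can fail.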