Open cholmes opened 6 months ago
This should be based on the EuroCrops dataset. Information is at https://github.com/maja601/EuroCrops/wiki/Estonia and the download is at https://zenodo.org/records/8229128/files/EE_2021.zip?download=1
In the converter, use ec_ee for the name; as @m-mohr pointed out, it'd be good to group the EuroCrops ones.
I do think we eventually want a converter for the actual source data from the Estonian government, but we can make a new ticket for that.
Any updates on this, @PowerChell?
Hey, sorry, I got pulled into other work and dropped off this. I can give @kyle-rasch what I have and have him finish it off!
EE has a self-intersecting polygon which makes the validator fail. Not sure how to handle this case.
That is frustrating. Since the shapefile comes from an outside source, are we unable to modify it to deal with the self-intersecting polygon?
> EE has a self-intersecting polygon which makes the validator fail. Not sure how to handle this case.
To me, the 'ideal' here is what I've been saying: a converter should be as simple as possible, and then we have 'tools' to clean the data up, make it valid, or add interesting stuff.
In the short term can we just put in custom code that fixes or removes the self-intersecting polygon? And we just document what we did to clean up the dataset.
A similar case that doesn't break validation but is clearly off: the France EuroCrops dataset has about 6 fields that have at least one of their points in Africa. It'd be nice to have a little tool that can find and fix things like that.
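A rough sketch of what such a "find stray points" tool could look like. Everything here is an assumption for illustration: the bounding box values are approximate, the coordinates are assumed to be lon/lat in WGS84, and `find_outliers` is a hypothetical helper, not part of the fiboa CLI.

```python
# Approximate bounding box for metropolitan France as
# (min_lon, min_lat, max_lon, max_lat). Assumed values, for illustration only.
FRANCE_BBOX = (-5.5, 41.0, 10.0, 51.5)


def vertices_outside_bbox(ring, bbox):
    """Return the vertices of a ring that fall outside the bounding box."""
    min_x, min_y, max_x, max_y = bbox
    return [
        (x, y)
        for x, y in ring
        if not (min_x <= x <= max_x and min_y <= y <= max_y)
    ]


def find_outliers(fields, bbox):
    """Map field id -> list of stray vertices, for fields that have any.

    `fields` is a dict of field id -> exterior ring (list of (lon, lat)).
    """
    outliers = {}
    for field_id, ring in fields.items():
        stray = vertices_outside_bbox(ring, bbox)
        if stray:
            outliers[field_id] = stray
    return outliers


# Made-up sample data: one sane field, one with a vertex far to the south.
fields = {
    "ok": [(2.0, 48.0), (2.1, 48.0), (2.1, 48.1), (2.0, 48.0)],
    "stray": [(2.0, 48.0), (2.1, 48.0), (0.0, 0.0), (2.0, 48.0)],
}
outliers = find_outliers(fields, FRANCE_BBOX)
```

This only finds the suspicious fields. Whether to drop the field, drop the vertex, or leave it alone is the same policy question as for the self-intersecting polygon.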
I'm just wondering whether it's our task to fix the data. Do we just make data follow the fiboa spec or do we actually want to "fix" the data (whatever that means)? We can implement something like `fiboa fix ee.parquet` (or a converter option) that just runs https://shapely.readthedocs.io/en/2.0.6/reference/shapely.make_valid.html But which file would we publish to Source?
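For reference, a minimal sketch of what running `make_valid` over the geometries could look like. It assumes Shapely is installed; `repair_geometries` is a hypothetical helper, and the `fiboa fix` command mentioned above does not exist yet. The bowtie polygon is made-up sample data standing in for the EE geometry.

```python
from shapely.geometry import Polygon
from shapely.validation import explain_validity, make_valid


def repair_geometries(geometries):
    """Return (repaired geometries, indices of the ones that were changed).

    Valid geometries are passed through untouched, so the list of changed
    indices can be used to document what was done to the dataset.
    """
    repaired = []
    changed = []
    for index, geom in enumerate(geometries):
        if geom.is_valid:
            repaired.append(geom)
        else:
            repaired.append(make_valid(geom))
            changed.append(index)
    return repaired, changed


# A "bowtie": the ring crosses itself at (1, 1), so the polygon is invalid.
bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2), (0, 0)])
square = Polygon([(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)])

reason = explain_validity(bowtie)  # human-readable reason for the failure
repaired, changed = repair_geometries([square, bowtie])
```

Note that `make_valid` may change the geometry type (for example, a polygon can become a multi-polygon), which matters if the target schema expects a single type.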
> Do we just make data follow the fiboa spec
Does the fiboa spec currently allow self-intersecting polygons? If it doesn't, I'd argue it should in some way: the specification aims to make field boundary data more usable and interoperable. I do think it should be easy to get data into 'not-quite-valid fiboa', so a first converter should not validate for self-intersecting polygons. Perhaps there is a second level of fiboa validation that makes sure there are no self-intersections, and perhaps also no intersections between polygons (at least ones that cover the same time range).
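To make the 'second level' idea concrete, here is a sketch of such a check, assuming Shapely. `check_geometries` is a hypothetical function, not an existing fiboa validator, and the time-range condition is left out. The pairwise loop is quadratic, so a real dataset would need a spatial index (for example Shapely's `STRtree`).

```python
from itertools import combinations

from shapely.geometry import Polygon


def check_geometries(fields):
    """Return (ids of invalid geometries, pairs of ids whose areas overlap).

    `fields` is a dict of field id -> Polygon. Invalid geometries are
    reported but excluded from the overlap test, since overlay operations
    on invalid input are unreliable.
    """
    invalid = [field_id for field_id, geom in fields.items() if not geom.is_valid]
    valid = {field_id: geom for field_id, geom in fields.items() if geom.is_valid}

    overlapping = []
    for (id_a, geom_a), (id_b, geom_b) in combinations(valid.items(), 2):
        # Shared edges or corners are fine; only a shared area counts.
        if geom_a.intersects(geom_b) and geom_a.intersection(geom_b).area > 0:
            overlapping.append((id_a, id_b))
    return invalid, overlapping


# Made-up sample data.
fields = {
    "a": Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),
    "b": Polygon([(1, 1), (3, 1), (3, 3), (1, 3)]),  # overlaps "a"
    "c": Polygon([(5, 5), (6, 5), (6, 6), (5, 6)]),  # isolated
    "bowtie": Polygon([(0, 0), (2, 2), (2, 0), (0, 2)]),  # self-intersecting
}
invalid, overlapping = check_geometries(fields)
```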
I think the experience we want for people downloading fiboa data is that it's 'always good'. So if we have to publish just one on Source, I'd put the fixed one. But I think ideally we'd put both, to make it clear that something was done to the data beyond what was originally published.
In the short term I think it's also fine to just fix the data and put the one fixed version up...
We don't have requirements for the geometries yet, see also https://github.com/fiboa/specification/issues/28
This has been implemented, right? https://github.com/fiboa/cli/pull/94
The converter is complete, yes. There are still some minor topics open with regard to validity and Source publication.
Use public data from Estonia and:
Instructions available at https://github.com/fiboa/data/blob/main/HOWTO.md