Open defuneste opened 2 years ago
Initially I used bbox, but I switched to bounds to cover more general features.
I implemented the control with a polygon object here: https://github.com/datagistips/geo4TableSchema/blob/main/mds/BATCH-EXAMPLES.md#bounds
The questions are:
Thx, I get it a bit better now.
Is it useful to mention a polygon object as an extent control layer?
I do not have a strong opinion on this. When would you want to test it against an extent other than the bbox? It could be an interesting feature, but then you will have to do it after testing the CRS, or be sure to provide a "bounds" that matches it.
If so, should we add an extra tag, bounds, in addition to bbox? Or another name? Or do we keep bounds for both types of extent?
If bounds is kept (with bounds different from bbox), then I prefer the option to keep it and add bbox. Bounding boxes are an important feature in plenty of other GIS software and have plenty of methods/functions already defined that help.
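To illustrate why a bbox is convenient to work with, here is a minimal sketch of a bbox check in plain Python. The function names and the coordinates are made up for illustration; this is not how geovalidate implements it.

```python
from typing import Iterable, Tuple

# (xmin, ymin, xmax, ymax)
BBox = Tuple[float, float, float, float]


def bbox_of(points: Iterable[Tuple[float, float]]) -> BBox:
    """Return the bounding box of a non-empty collection of (x, y) points."""
    xs, ys = zip(*points)
    return (min(xs), min(ys), max(xs), max(ys))


def bbox_within(inner: BBox, outer: BBox) -> bool:
    """True if `inner` lies entirely inside (or on the edge of) `outer`."""
    return (
        outer[0] <= inner[0]
        and outer[1] <= inner[1]
        and inner[2] <= outer[2]
        and inner[3] <= outer[3]
    )


# Hypothetical data points and declared bbox (made-up values).
data_points = [(5.40, 43.50), (5.45, 43.53), (5.42, 43.55)]
declared_bbox = (5.30, 43.45, 5.50, 43.60)

data_bbox = bbox_of(data_points)
print(data_bbox, bbox_within(data_bbox, declared_bbox))
```

The whole check is four comparisons, which is part of why so many tools expose a bbox directly.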
ok
It could be an interesting feature, but then you will have to do it after testing the CRS, or be sure to provide a "bounds" that matches it.
geovalidate requires the source data and the control bounds to be in the same CRS, because the CRS of the source data must be the one specified in the crs tag. If both are in the same CRS, it can compare the geometries directly. We may need to find out how to specify the CRS of a geoCSV. I'll investigate that. But maybe you have an idea! ;-)
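A rough sketch of that ordering (check the CRS first, compare extents only afterwards) could look like the following. The dictionary layout, the function names and the bbox values are assumptions made for the example, not geovalidate's actual code.

```python
def normalize_crs(crs: str) -> str:
    """Normalize an identifier such as ' epsg:2154 ' for a plain string comparison."""
    return crs.strip().upper()


def same_crs(schema: dict, bounds: dict) -> bool:
    """True if the control bounds declare the same CRS as the schema.

    This only compares identifiers as text; it will not recognise two
    different notations of the same CRS (for example WKT versus an EPSG code).
    """
    return normalize_crs(schema["crs"]) == normalize_crs(bounds["crs"])


# Hypothetical schema and control bounds (made-up bbox values).
schema = {"crs": "EPSG:2154"}
bounds_ok = {"crs": "epsg:2154", "bbox": [890000, 6260000, 905000, 6275000]}
bounds_bad = {"crs": "EPSG:4326", "bbox": [5.30, 43.45, 5.50, 43.60]}

for name, bounds in [("bounds_ok", bounds_ok), ("bounds_bad", bounds_bad)]:
    if same_crs(schema, bounds):
        print(name, "-> same CRS, geometries can be compared")
    else:
        print(name, "-> CRS mismatch, reproject before comparing")
```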
When would you want to test it against an extent other than the bbox?
There can be cases where you want to check your data against a specific polygon: "is my data, which focuses on Aix-en-Provence building IDs, contained in the Aix-en-Provence city contour?"
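As a sketch of what such a polygon check involves, here is a ray-casting point-in-polygon test in plain Python. The L-shaped contour is a toy stand-in, not a real city boundary, and the function names are hypothetical. A real implementation would more likely rely on a geometry library.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, ring: Sequence[Point]) -> bool:
    """Ray-casting test: True if `point` is inside the polygon ring.

    `ring` is a sequence of (x, y) vertices; the first vertex does not need
    to be repeated at the end. Points exactly on the boundary are not
    handled specially.
    """
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def points_outside(points: Sequence[Point], ring: Sequence[Point]) -> List[Point]:
    """Return the points that fall outside the control polygon."""
    return [p for p in points if not point_in_polygon(p, ring)]


# Toy L-shaped contour; its bbox is (0, 0, 4, 4).
contour = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
buildings = [(1, 1), (1, 3), (3, 3)]

print(points_outside(buildings, contour))
```

The point (3, 3) lies inside the contour's bbox but outside the contour itself, which is exactly the kind of case a bbox alone cannot catch.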
But the problem is that embedding a polygon can make the JSON heavy.
So it would be better to refer to an external resource: for instance a reference GeoJSON, such as a BDTOPO one, provided in the data package, on a web server, or via an API, against which to check the data.
I don't know if this kind of "remote" or external file control relationship is implemented in TableSchema. It would be nice to ask the question. What's your opinion on that?
Something like:
{
  "bounds": {
    "resources": [
      {
        "title": "myCity",
        "path": "http://reference-data/mycity.json"
      }
    ]
  }
}
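For what it's worth, reading such a descriptor is straightforward. Below is a minimal sketch that extracts the external references from the proposed structure. The bounds/resources layout is only the proposal above, not an existing TableSchema feature as far as this thread establishes, and the function name is hypothetical. The sketch does not fetch anything over the network.

```python
import json

# The proposed descriptor fragment, as text.
descriptor_text = """
{
  "bounds": {
    "resources": [
      {"title": "myCity", "path": "http://reference-data/mycity.json"}
    ]
  }
}
"""


def external_bounds(descriptor: dict) -> list:
    """Return (title, path) pairs for each external bounds resource.

    Returns an empty list when the descriptor declares no bounds.
    """
    resources = descriptor.get("bounds", {}).get("resources", [])
    return [(r.get("title"), r["path"]) for r in resources]


descriptor = json.loads(descriptor_text)
refs = external_bounds(descriptor)
print(refs)
```

A validator would then have to resolve each path (local file, URL or API) and load the geometry before running the extent check, so the CRS of the referenced file would need to be known as well.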
What's your opinion on this ?
If bounds is kept (with bounds different than bbox) then I prefer the option to keep it and add bbox.
OK for bbox! Sounds like a good idea.
No idea about geoCSV; I assumed it was like GeoJSON, with a default CRS.
Well, I understand the problem a bit better now, thx! I can see the pros of having a bounds field. As a geographer, I like having more precise information on the scale than the bbox gives. If the bounds is greater than the bbox, we can infer that there is no data outside the bbox, and that this is not because data are missing (I do not know if I am making myself clear).
I will play the devil's advocate (that doesn't mean I am against the idea). It means that every time the administrative unit changes, someone has to update the data and the reference repository/DB (something like STAC?). This is not always a bad idea, but it needs to be done by the data provider (unclear if they want to do that). Currently it is mostly the job of the data user to assess the quality and check everything.
Two side notes (if you want, I can open new issues to make it cleaner). First one: Frictionless submitted an R package to rOpenSci for software peer review (https://github.com/ropensci/software-review/issues/495). They have an empty check box "geospatial data". Second one: I think you have done a good job advertising this repo and your concern, but not that many people came to give their opinion. How can it be improved? I think the GIS community does not have a strong culture of interacting on GitHub (or another platform).
Hello,
I think bbox is a more common and better name for a bounding box. Maybe I am missing the logic behind having a different name (it looks like a bbox here: https://github.com/datagistips/geo4TableSchema/blob/main/f_geovalidate.py#L28-L43); if so, it could be clarified.