rouault closed 2 years ago
Ah, good catch ... (and a "damn it!" for myself). The file was created using the released geopandas (which already supported those parquet files), and I thought I had checked that we were compatible with the updated spec, but I missed the "schema_version" vs "version" difference.
That's unfortunate (if I had noticed this before, I might have argued to use "schema_version" in the spec in this repo as well ...). It means existing geopandas-written files will be missing the required "version" field.
@cholmes I will provide an updated file.
@jorisvandenbossche - I'd be ok to switch to schema_version if you wanted. Like we could cut a 0.2.0 release. Though I suppose that doesn't actually help the backwards compatibility with geopandas, since it'd be a new version number. But it could just be cleaner all around?
I was first thinking to propose that as well, but it would indeed not actually solve the issue if there is a reader that very strictly checks this.
If we cut a 0.2.0 release with "schema_version", reader implementations would still need to be flexible regarding a missing "version" field for 0.1.0 files written with released geopandas. If we don't change this, reader implementations will also need to be flexible regarding a missing "version" field for them to be able to read those old geopandas files.
So maybe the main advantage of doing a more rapid 0.2.0 release is that it might result in implementations directly supporting that version (and fewer 0.1.0 files getting written). It would also give geopandas a clear version bump at which to adjust the metadata (I was planning to do a quick single-patch release of geopandas to change "schema_version" to "version", but that would still result in some 0.1.0 files written by geopandas with the old field and some with the new field).
And if we do a quick 0.2.0 release, I think we can just choose whichever of "version" or "schema_version" we think is the better name (as it doesn't matter that much for the compatibility story for 0.1.0 files).
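To make the "flexible reader" idea above concrete, here is a minimal sketch of how a reader might tolerate both field names. It assumes the reader already has the raw JSON value stored under the 'geo' key (as `str` or `bytes`); the function name `geo_spec_version` is hypothetical and not part of geopandas or any GeoParquet library.

```python
import json


def geo_spec_version(geo_json, default=None):
    """Return the spec version recorded in a 'geo' metadata value.

    Hypothetical helper: prefers the "version" field required by the spec
    and falls back to the "schema_version" field written by older
    geopandas releases. Returns `default` when neither is present.
    """
    geo = json.loads(geo_json)  # accepts str or bytes
    for key in ("version", "schema_version"):
        value = geo.get(key)
        if value is not None:
            return value
    return default
```

A strict reader could instead raise when "version" is missing; the fallback only matters for files written before the field name was aligned with the spec.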
Yeah, that's what I was thinking - have the 0.2.0 instead of the quick single-patch release.
I also think that most readers won't need to strictly check the super early versions of the spec. They'll just not support it, and then that will be a push for anyone with data in older versions to upgrade. And it really is all just announced, with warnings that it may change, so I think it's unlikely there's very much data at all.
I've added a branch with the rename, just in case we want to change it for future versions: https://github.com/opengeospatial/geoparquet/commit/9d94152c6d0bba8f181efcb5377dcb741a4b63b7
I think we've maybe missed the window for a 'quick release' of 0.2.0?
If Joris has a good reason for schema_version other than just the backwards compatibility (which we didn't quite get right) then I'm open to it, but just 'version' under the 'geo' key seems clear enough to me, and given a choice I prefer shorter. But really don't feel strongly on this one. Just want to be sure we're changing for a good reason.
Agree, if there is no good reason we can keep it as it is and update the files. @jorisvandenbossche what do you think about it?
I’m happy to keep it as is, since I already have data written with “version” and I don’t think “schema_version” adds much over version.
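For anyone updating files written with the old field name, the metadata change itself is small. The sketch below only rewrites the 'geo' JSON value; reading it from and writing it back to the Parquet file is left out, and the function name `rename_schema_version` is hypothetical.

```python
import json


def rename_schema_version(geo_json):
    """Return a 'geo' metadata JSON string that uses "version".

    Hypothetical helper: if the legacy "schema_version" field is present
    it is removed, and its value is copied to "version" unless "version"
    is already set. All other fields are left untouched.
    """
    geo = json.loads(geo_json)  # accepts str or bytes
    if "schema_version" in geo:
        legacy = geo.pop("schema_version")
        geo.setdefault("version", legacy)
    return json.dumps(geo)
```

Note that this keeps whatever version number was recorded; it does not check that the rest of the metadata conforms to that version of the spec.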
Closing this, as the new nz-buildings-outlines sample now uses version, with 0.3
@cholmes Can you point to this v0.3 nz-buildings-outlines sample? The link on the README still points to the 0.1 version.
https://storage.googleapis.com/open-geodata/linz-examples/nz-buildings-outlines.parquet has the following 'geo' metadata value:
schema_version should be renamed as version