opengeospatial / geoparquet

Specification for storing geospatial vector data (point, line, polygon) in Parquet
https://geoparquet.org
Apache License 2.0

Fix the validator so it can install more widely. #107

Closed cholmes closed 1 year ago

cholmes commented 2 years ago

Moving the schema to format-specs broke the validator: pip install . does not work with a symlink resource. It works with pip install -e ., but we need to fix this, especially if we want to distribute the validator as a Python package.

Originally posted by @Jesus89 in https://github.com/opengeospatial/geoparquet/pull/87#pullrequestreview-985956883

kylebarron commented 2 years ago

@Jesus89 I'm not able to reproduce this. Can you provide more information on what you see? I checked this specifically before merging, and pip install . works for me. Also, we have been using a symlink in h3-py without any reported issues on Windows, macOS, or Linux.

cd validator/python/
rm -rf env
virtualenv env
source ./env/bin/activate
pip install .
./env/bin/geoparquet_validator ../../scripts/nz-building-outlines.parquet

gives

Validating file...
This is a valid GeoParquet file.

You can verify that the symlink gets resolved by building the Python sdist or wheel and checking that the packaged schema.json is a regular file, not a symbolic link.

> python setup.py sdist
...
copying geoparquet_validator/schema.json -> geoparquet_validator-0.0.1/geoparquet_validator
...
> tar -ztf dist/geoparquet_validator-0.0.1.tar.gz
geoparquet_validator-0.0.1/
geoparquet_validator-0.0.1/PKG-INFO
geoparquet_validator-0.0.1/README.md
geoparquet_validator-0.0.1/geoparquet_validator/
geoparquet_validator-0.0.1/geoparquet_validator/__init__.py
geoparquet_validator-0.0.1/geoparquet_validator/schema.json
geoparquet_validator-0.0.1/geoparquet_validator.egg-info/
geoparquet_validator-0.0.1/geoparquet_validator.egg-info/PKG-INFO
geoparquet_validator-0.0.1/geoparquet_validator.egg-info/SOURCES.txt
geoparquet_validator-0.0.1/geoparquet_validator.egg-info/dependency_links.txt
geoparquet_validator-0.0.1/geoparquet_validator.egg-info/entry_points.txt
geoparquet_validator-0.0.1/geoparquet_validator.egg-info/requires.txt
geoparquet_validator-0.0.1/geoparquet_validator.egg-info/top_level.txt
geoparquet_validator-0.0.1/setup.cfg
geoparquet_validator-0.0.1/setup.py
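
The same check can be scripted instead of reading the tar listing by eye. This is a minimal sketch using only the standard-library tarfile module. The function names find_links_in_sdist and resource_is_regular_file are hypothetical helpers that are not part of the validator, and the default suffix is taken from the listing above.

```python
import tarfile


def find_links_in_sdist(sdist_path):
    """Return the names of archive members stored as symbolic or hard links."""
    with tarfile.open(sdist_path, "r:gz") as archive:
        return [m.name for m in archive.getmembers() if m.issym() or m.islnk()]


def resource_is_regular_file(sdist_path, suffix="geoparquet_validator/schema.json"):
    """Return True if a member whose name ends with `suffix` is a regular file.

    `suffix` defaults to the schema path seen in the sdist listing; adjust it
    if the package layout differs (assumption, not confirmed by the project).
    """
    with tarfile.open(sdist_path, "r:gz") as archive:
        return any(
            m.isfile() for m in archive.getmembers() if m.name.endswith(suffix)
        )
```

If the symlink was resolved at build time, find_links_in_sdist("dist/geoparquet_validator-0.0.1.tar.gz") should return an empty list and resource_is_regular_file on the same path should return True.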

kylebarron commented 2 years ago

Also it's still working on CI: https://github.com/opengeospatial/geoparquet/blob/56ea704399fd92a15cb3658408884f79d471da72/.github/workflows/scripts.yml#L23

cholmes commented 1 year ago

I can't reproduce this, and I don't think we're going to focus on the Python validator in this repo as the main validator we distribute.