Open avalentino opened 4 months ago
Can I safely assume that data files are provided with the same license of the source code (MIT)?
I don't think so. At least the OSM sample retains ODbL. I am not sure about the rest.
- poly_not_enough_points.shp.zip
This was recently added in https://github.com/geopandas/pyogrio/pull/422. @theroggy did you create this file manually? (would be good to add a note about that in the README then as well)
- test_fgdb.gdb.zip
This is downloaded from https://trac.osgeo.org/gdal/raw-attachment/wiki/FileGDB/. @rouault do you know if this wiki falls under the general GDAL license?
- test_mixed_surface.gpkg
This is extracted from one of the datasets from https://www.usgs.gov/national-hydrography/access-national-hydrography-products. I don't directly find anything on that page about the license of those datasets (maybe the USGS has a general license it uses for all available datasets? but not familiar with it) EDIT: https://www.usgs.gov/faqs/what-are-terms-uselicensing-map-services-and-data-national-map says "public domain"
- test_fgdb.gdb.zip
Maybe @jmckenna remembers the provenance of this file? Otherwise you could potentially switch to one of the GDAL autotest suite samples: https://github.com/OSGeo/gdal/tree/master/autotest/ogr/data/filegdb
- poly_not_enough_points.shp.zip
This was recently added in #422. @theroggy did you create this file manually? (would be good to add a note about that in the README then as well)
No, I exported the polygon giving the issue from the file provided in this issue: https://github.com/geopandas/geopandas/discussions/3336
So, not sure about licensing :-(... @EwoutH can you shed some light?
I should have mentioned that: the file doesn't have a proper open-source license, so I don't think it can be in there.
I got it on a project license, see https://mrdh.nl/verkeersmodel.
Now that I think of it, sharing it for debugging was already stretching it (but probably OK).
Thanks a lot to everybody for the help.
To summarize the discussion, please find below an excerpt of the debian/copyright file that I'm preparing:
Files: *
Copyright: 2020-2021, Brendan C. Ward and pyogrio contributors
License: Expat

Files: pyogrio/arrow_bridge.h
Copyright: 2020-2021, Brendan C. Ward and pyogrio contributors
License: Apache-2.0

Files: pyogrio/tests/fixtures/naturalearth_lowres/*
       pyogrio/tests/fixtures/test_mixed_surface.gpkg
Copyright: disclaimed
License: public-domain

Files: pyogrio/tests/fixtures/sample.osm.pbf
Copyright: OpenStreetMap contributors
License: ODbL-1.0
For the time being:
- I will remove poly_not_enough_points.shp.zip and skip the associated test_read_invalid_shp test, at least until the situation is clarified.
- For test_fgdb.gdb.zip, the situation is apparently still not totally clear, so I can remove it as well for the moment. This implies skipping at least 8 additional tests.
- For the other files in pyogrio/tests/fixtures (I mean the ones not mentioned in the above debian/copyright file excerpt), I assume the same license as the source code.
Please feel free to comment if there is anything that looks incorrect.
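As a rough illustration of how the excerpt above assigns a license to each file, here is a minimal sketch. It assumes the rule from the Debian machine-readable copyright format that the last matching Files stanza applies to a file, and it approximates the pattern matching with Python's fnmatch. The STANZAS list and the license_for helper are hypothetical names introduced only for this example.

```python
from fnmatch import fnmatchcase

# Files patterns and license names taken from the excerpt above, in order.
STANZAS = [
    (["*"], "Expat"),
    (["pyogrio/arrow_bridge.h"], "Apache-2.0"),
    (
        [
            "pyogrio/tests/fixtures/naturalearth_lowres/*",
            "pyogrio/tests/fixtures/test_mixed_surface.gpkg",
        ],
        "public-domain",
    ),
    (["pyogrio/tests/fixtures/sample.osm.pbf"], "ODbL-1.0"),
]


def license_for(path, stanzas=STANZAS):
    """Return the license of the last stanza with a pattern matching path."""
    result = None
    for patterns, license_name in stanzas:
        # fnmatchcase lets "*" match any characters, including "/".
        if any(fnmatchcase(path, pattern) for pattern in patterns):
            result = license_name  # later stanzas override earlier ones
    return result


# A fixture not listed explicitly falls back to the catch-all "*" stanza.
print(license_for("pyogrio/tests/fixtures/test_fgdb.gdb.zip"))
print(license_for("pyogrio/tests/fixtures/sample.osm.pbf"))
```

Under these assumptions, any fixture without its own stanza inherits the source-code license, which is exactly the assumption that needs confirming for the remaining files.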
@avalentino the license for pyogrio is MIT, not Expat. Also, we're out of date, but please make the copyright extend through 2024 (will submit a PR to fix here shortly).
I believe arrow_bridge.h is derived from the Arrow project and should preserve their license / copyright statement rather than ours. Unfortunately, I'm not easily finding a copyright statement for that specific file, and the top-level LICENSE.txt for Arrow includes many copyright statements for code derived from different sources. (not sure what to do in this case re: copyright)
If you can give us a few more days, I can try to create some alternative test files that sidestep licensing issues.
@avalentino the license for pyogrio is MIT, not Expat. Also, we're out of date, but please make the copyright extend through 2024 (will submit a PR to fix here shortly).
According to the Debian documentation (e.g. [1] and [2]), MIT and Expat should be equivalent in most cases, and the recommendation (for the sake of consistency within Debian) is to use the Expat name when the text of the license matches the Expat one. Of course, it is not a big issue to change the name if it matters to you, but in any case the text of the license is reported in the same debian/copyright file; I have just posted an excerpt for brevity.
[1] https://www.debian.org/legal/licenses/ [2] https://www.debian.org/legal/licenses/mit
I believe arrow_bridge.h is derived from the Arrow project and should preserve their license / copyright statement rather than ours. Unfortunately, I'm not easily finding a copyright statement for that specific file, and the top-level LICENSE.txt for Arrow includes many copyright statements for code derived from different sources. (not sure what to do in this case re: copyright) If you can give us a few more days, I can try to create some alternative test files that sidestep licensing issues.
Absolutely no rush. Please take your time, and thank you for supporting me.
@avalentino per #441, I've removed the test files with problematic licenses. Some of our maintainers are out of office right now, so we're not quite ready to merge this in yet. Might be another week or two.
Thanks for the update @brendan-ward
On a related note, does arrow_bridge.h actually need to be installed? It seems like it probably should only be for extension building purposes, but it ends up in the installed copy too.
arrow_bridge.h needs to go into the source distribution; do you mean exclude it from the wheels?
Yes, I do mean the wheels; I opened #463 to remove it and the Cython files, which seem accidental.
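For anyone wanting to check a built wheel for such files, here is a minimal sketch using only the standard library. The suffix list is an assumption based on the header and Cython files mentioned above, not pyogrio's actual packaging configuration, and build_only_files is a hypothetical helper name.

```python
import zipfile

# Suffixes assumed to mark build-only inputs (C headers and Cython sources).
# This list is an assumption for illustration, not pyogrio's packaging config.
BUILD_ONLY_SUFFIXES = (".h", ".pyx", ".pxd")


def build_only_files(wheel_path):
    """List members of a wheel archive that look like build-only sources."""
    # A wheel is a zip archive, so zipfile can list its contents directly.
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted(
            name for name in wheel.namelist() if name.endswith(BUILD_ONLY_SUFFIXES)
        )
```

Calling build_only_files on the path to a built wheel returns an empty list when no such files are shipped.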
I'm in the process of packaging pyogrio for Debian; I hope you are fine with it. To meet the Debian packaging standards I need to report the license for all files included in the package. I would appreciate it a lot if you could clarify the license of the data files included in pyogrio/tests/fixtures, and in particular the license of:
- poly_not_enough_points.shp.zip
- sample.osm.pbf
- test_fgdb.gdb.zip
- test_mixed_surface.gpkg
The pyogrio/tests/fixtures/README.md seems to clarify the origin of some of the data files, but the license is not clear to me. Can I safely assume that data files are provided with the same license of the source code (MIT)?