geopandas / pyogrio

Vectorized vector I/O using OGR
https://pyogrio.readthedocs.io
MIT License

Data files license #433

Open avalentino opened 4 months ago

avalentino commented 4 months ago

I'm in the process of packaging pyogrio for Debian; I hope you are fine with it. To meet the Debian packaging standards I need to report the license for all files included in the package. I would appreciate it a lot if you could clarify the license of the data files included in pyogrio/tests/fixtures, and in particular the license of:

The pyogrio/tests/fixtures/README.md seems to clarify the origin of some of the data files, but the license is not clear to me. Can I safely assume that data files are provided with the same license of the source code (MIT)?

martinfleis commented 4 months ago

Can I safely assume that data files are provided with the same license of the source code (MIT)?

I don't think so. At least the OSM sample retains ODbL. I am not sure about the rest.

jorisvandenbossche commented 4 months ago
  • poly_not_enough_points.shp.zip

This was recently added in https://github.com/geopandas/pyogrio/pull/422. @theroggy did you create this file manually? (would be good to add a note about that in the README then as well)

  • test_fgdb.gdb.zip

This is downloaded from https://trac.osgeo.org/gdal/raw-attachment/wiki/FileGDB/. @rouault do you know if this wiki falls under the general GDAL license?

  • test_mixed_surface.gpkg

This is extracted from one of the datasets from https://www.usgs.gov/national-hydrography/access-national-hydrography-products. I can't find anything on that page about the license of those datasets (maybe the USGS has a general license it uses for all available datasets, but I'm not familiar with it). EDIT: https://www.usgs.gov/faqs/what-are-terms-uselicensing-map-services-and-data-national-map says "public domain".

rouault commented 4 months ago
  • test_fgdb.gdb.zip

Maybe @jmckenna remembers the provenance of this file? Otherwise you could potentially switch to one of the GDAL autotest suite samples: https://github.com/OSGeo/gdal/tree/master/autotest/ogr/data/filegdb

theroggy commented 4 months ago
  • poly_not_enough_points.shp.zip

This was recently added in #422. @theroggy did you create this file manually? (would be good to add a note about that in the README then as well)

No, I exported the polygon giving the issue from the file provided in this issue: https://github.com/geopandas/geopandas/discussions/3336

So, not sure about licensing :-(... @EwoutH can you shed some light?

EwoutH commented 4 months ago

I should have mentioned that: that file doesn't have a proper open-source license, so I don't think it can be in there.

I got it on a project license, see https://mrdh.nl/verkeersmodel.

Now that I think of it, sharing it for debugging was already stretching it (but probably OK).

avalentino commented 4 months ago

Thanks a lot to everybody for the help. To summarize the discussion please find below an excerpt of the debian/copyright file that I'm preparing:

Files: *
Copyright: 2020-2021, Brendan C. Ward and pyogrio contributors
License: Expat

Files: pyogrio/arrow_bridge.h
Copyright: 2020-2021, Brendan C. Ward and pyogrio contributors
License: Apache-2.0

Files: pyogrio/tests/fixtures/naturalearth_lowres/*
       pyogrio/tests/fixtures/test_mixed_surface.gpkg
Copyright: disclaimed
License: public-domain

Files: pyogrio/tests/fixtures/sample.osm.pbf
Copyright: OpenStreetMap contributors
License: ODbL-1.0

For the time being, the other files in pyogrio/tests/fixtures (that is, the ones not mentioned in the debian/copyright excerpt above) are assumed to carry the same license as the source code.

Please feel free to comment if there is anything that looks incorrect.
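To see which files end up under which stanza, the lookup rule of the machine-readable debian/copyright format (the last matching `Files` paragraph applies) can be sketched in a few lines of Python. This is only an illustration, not part of the actual packaging: the stanza list is transcribed by hand from the excerpt, the function names are made up for this example, and `fnmatch` is used as an approximation of the format's wildcard rules.

```python
from fnmatch import fnmatchcase

# (patterns, license) pairs transcribed by hand from the excerpt, in file order.
STANZAS = [
    (["*"], "Expat"),
    (["pyogrio/arrow_bridge.h"], "Apache-2.0"),
    (
        [
            "pyogrio/tests/fixtures/naturalearth_lowres/*",
            "pyogrio/tests/fixtures/test_mixed_surface.gpkg",
        ],
        "public-domain",
    ),
    (["pyogrio/tests/fixtures/sample.osm.pbf"], "ODbL-1.0"),
]


def license_for(path, stanzas=STANZAS):
    """Return the license of the last stanza with a pattern matching path."""
    result = None
    for patterns, license_name in stanzas:
        if any(fnmatchcase(path, pattern) for pattern in patterns):
            result = license_name
    return result


def falls_back_to_default(paths, stanzas=STANZAS):
    """List the paths that are matched only by the catch-all "*" stanza."""
    specific = [stanza for stanza in stanzas if stanza[0] != ["*"]]
    return [path for path in paths if license_for(path, specific) is None]
```

Passing the fixture paths to `falls_back_to_default` lists the files that silently inherit the source-code license, which are the ones whose provenance is still open in this thread.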

brendan-ward commented 4 months ago

@avalentino the license for pyogrio is MIT, not Expat. Also, we're out of date, but please make the copyright extend through 2024 (will submit a PR to fix here shortly).

I believe arrow_bridge.h is derived from the Arrow project and should preserve their license / copyright statement rather than ours. Unfortunately, I'm not easily finding a copyright statement for that specific file, and the top-level LICENSE.txt for Arrow includes many copyright statements for code derived from different sources. (not sure what to do in this case re: copyright)

If you can give us a few more days, I can try to create some alternative test files that sidestep licensing issues.

avalentino commented 4 months ago

@avalentino the license for pyogrio is MIT, not Expat. Also, we're out of date, but please make the copyright extend through 2024 (will submit a PR to fix here shortly).

According to the Debian documentation (e.g. [1] and [2]), MIT and Expat should be equivalent in most cases, and the recommendation (for the sake of homogeneity within Debian) is to use the Expat name when the text of the license matches the Expat one. Of course it is not a big issue to change the name if it matters to you but, in any case, the text of the license is reported in the same debian/copyright file; I have only reported an excerpt for brevity.

[1] https://www.debian.org/legal/licenses/
[2] https://www.debian.org/legal/licenses/mit

I believe arrow_bridge.h is derived from the Arrow project and should preserve their license / copyright statement rather than ours. Unfortunately, I'm not easily finding a copyright statement for that specific file, and the top-level LICENSE.txt for Arrow includes many copyright statements for code derived from different sources. (not sure what to do in this case re: copyright) If you can give us a few more days, I can try to create some alternative test files that sidestep licensing issues.

Absolutely no rush. Please take your time, and thank you for supporting me.

brendan-ward commented 4 months ago

@avalentino per #441, I've removed the test files with problematic licenses. Some of our maintainers are out of office right now, so we're not quite ready to merge this in yet. Might be another week or two.

avalentino commented 4 months ago

Thanks for the update @brendan-ward

QuLogic commented 2 months ago

On a related note, does arrow_bridge.h actually need to be installed? It seems like it probably should only be for extension building purposes, but it ends up in the installed copy too.

brendan-ward commented 2 months ago

arrow_bridge.h needs to go into the source distribution; do you mean exclude it from the wheels?

QuLogic commented 2 months ago

Yes, I do mean the wheels; I opened #463 to remove it and the Cython files, which seem accidental.
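A quick way to check whether build-only files such as arrow_bridge.h or Cython sources are shipped in a wheel is to list the archive contents, since a wheel is a zip file. The sketch below is an assumption-laden illustration rather than anything from pyogrio itself: the function name and the suffix list (`.h`, `.pyx`, `.pxd`) are chosen for this example.

```python
import zipfile

# Suffixes treated as build-only in this sketch: C headers and Cython sources.
BUILD_ONLY_SUFFIXES = (".h", ".pyx", ".pxd")


def build_only_files(wheel_path, suffixes=BUILD_ONLY_SUFFIXES):
    """Return the sorted archive members of a wheel that end in a given suffix."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted(
            name for name in wheel.namelist() if name.endswith(suffixes)
        )
```

Calling `build_only_files` on a locally built wheel before and after a packaging change shows whether those files still end up in the installed copy.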