conda-forge / gdal-feedstock

A conda-smithy repository for gdal.
BSD 3-Clause "New" or "Revised" License

Add parquet-cpp dependency #628

Closed akrherz closed 1 year ago

akrherz commented 2 years ago

Non-blocking but could be nice to have for future iterations:

Originally posted by @rouault in https://github.com/conda-forge/gdal-feedstock/issues/627#issuecomment-1126578502

jorisvandenbossche commented 2 years ago

@rouault in #627 you also mentioned:

possibly using driver-as-plugin for select drivers that could go in gdal-XXXX subpackages for drivers that drag heavy dependencies

Would that be possible for the parquet driver as well? If so, I would very much like to see that, given the huge number of dependencies that the arrow-cpp package (from parquet-cpp) would bring in (this package could in theory also be split, but isn't at the moment).

rouault commented 2 years ago

Would that be possible for the parquet driver as well?

yes, my own builds have OGR_ENABLE_DRIVER_PARQUET_PLUGIN:BOOL=ON and OGR_ENABLE_DRIVER_ARROW_PLUGIN:BOOL=ON
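
As a rough illustration of the options mentioned above, here is a minimal sketch of a configure step with both drivers built as plugins. Only the two `OGR_ENABLE_DRIVER_*_PLUGIN` options come from this thread; the helper name `gdal_plugin_configure_cmd` and the `..` source directory are hypothetical, and the helper only prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: print a cmake configure command that builds the
# Parquet and Arrow drivers as plugins. It does not invoke cmake itself.
gdal_plugin_configure_cmd() {
    # Assumption: the GDAL source tree is the parent directory by default.
    src_dir="${1:-..}"
    printf 'cmake %s -DOGR_ENABLE_DRIVER_PARQUET_PLUGIN:BOOL=ON -DOGR_ENABLE_DRIVER_ARROW_PLUGIN:BOOL=ON\n' "$src_dir"
}

# Show the command for a build directory located inside the source tree.
gdal_plugin_configure_cmd ..
```

A real build would typically pass further options (install prefix, dependency locations) that are not covered here.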

rouault commented 1 year ago

Proposed PR to add a libgdal-arrow-parquet package with the Arrow and Parquet drivers built as plugin: https://github.com/conda-forge/gdal-feedstock/pull/679

mtravis commented 9 months ago

Is it possible to revisit this now that GeoParquet is being more widely adopted? See https://github.com/conda-forge/qgis-feedstock/pull/386#issuecomment-1826778684

rouault commented 9 months ago

Is it possible to revisit this

Revisit what exactly? I think the approach of less monolithic GDAL builds, which is going to go further in GDAL 3.9 with https://gdal.org/development/rfc/rfc96_deferred_plugin_loading.html, is the way to go to avoid people saying "GDAL is too big a beast for my small needs". Everything is a matter of compromise. "libgdal-arrow-parquet" can be easily installed. If we add more drivers as plugins, we could likely have a "libgdal-all" meta package (like Alpine, which has a gdal-driver-all package: https://pkgs.alpinelinux.org/package/edge/community/aarch64/gdal-driver-all). QGIS could similarly have a "qgis-full" meta package that would depend on "libgdal-all". Potentially someone could contribute an improvement to QGIS so that, in a QGIS conda-forge build, opening a Parquet file would make the GUI propose installing libgdal-arrow-parquet.

Cf. RFC 96:

For example, if doing a build with::

    cmake .. -DOGR_DRIVER_PARQUET_PLUGIN_INSTALLATION_MESSAGE="You may install it with 'conda install -c conda-forge libgdal-arrow-parquet'"

and opening a Parquet file while the plugin is not installed will display the
following error::

    $ ogrinfo poly.parquet
    ERROR 4: `poly.parquet' not recognized as a supported file format. It could have been recognized by driver Parquet, but plugin ogr_Parquet.so is not available in your installation. You may install it with 'conda install -c conda-forge libgdal-arrow-parquet'
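
To make the workflow above concrete, here is a minimal sketch that checks whether the Parquet driver is present and otherwise prints the conda command quoted in the message. The helper name `has_parquet_driver` is hypothetical, and the sketch assumes that `ogrinfo --formats` lists one driver per line with the short driver name first.

```shell
#!/bin/sh
# Hypothetical helper: read a driver listing on stdin and succeed if a line
# starts with the short driver name "Parquet" followed by whitespace.
has_parquet_driver() {
    grep -Eq '^[[:space:]]*Parquet[[:space:]]'
}

if command -v ogrinfo >/dev/null 2>&1; then
    # Assumption: `ogrinfo --formats` prints one driver per line.
    if ogrinfo --formats | has_parquet_driver; then
        echo "Parquet driver available"
    else
        echo "Parquet driver missing; try: conda install -c conda-forge libgdal-arrow-parquet"
    fi
else
    echo "ogrinfo not found on PATH"
fi
```

The check only prints a suggestion and does not run `conda install` itself, so it is safe to run in any environment.
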
mtravis commented 9 months ago

@rouault I appreciate your feedback. I haven't fully understood the complexity of this. I've never built QGIS from source myself, but I'm happy to learn and help out if I can.

@gillins Is it you who maintains the QGIS conda builds? Apologies if I'm not using the correct terminology, but I'm relatively new to conda and have only used it for installing GDAL/QGIS.

gillins commented 9 months ago

@mtravis it's myself and @SrNetoChan who do most of the work, yes. Other people have contributed in the past too and happy to have others join.

I'm quite keen on @rouault's approach of slimming down the default GDAL install and having things as plugins. I also like the qgis-full idea and just allowing qgis to be the minimal system for people who don't need every format. Perhaps we can talk about having a metapackage for this and libgdal-all?