conda-forge / qgis-feedstock

A conda-smithy repository for qgis.
BSD 3-Clause "New" or "Revised" License

Support for geoparquet #386

Closed m-kuhn closed 7 months ago

m-kuhn commented 7 months ago

Checklist

conda-forge-webservices[bot] commented 7 months ago

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

SrNetoChan commented 7 months ago

@conda-forge-admin, please rerender

SrNetoChan commented 7 months ago

@gillins we forgot to bump the build version

gillins commented 7 months ago

> @gillins we forgot to bump the build version

Oh no! I'll prepare a new PR.

gillins commented 7 months ago

Actually, @m-kuhn would we be better off just getting users to install libgdal-arrow-parquet themselves if they want this functionality? This change doesn't alter the way that qgis is built, does it? Forcing all the optional GDAL drivers to be installed with QGIS kind of defeats the purpose of having them as optional drivers in the first place.

m-kuhn commented 7 months ago

> Actually, @m-kuhn would we be better off just getting users to install libgdal-arrow-parquet themselves if they want this functionality? This change doesn't alter the way that qgis is built, does it? Forcing all the optional GDAL drivers to be installed with QGIS kind of defeats the purpose of having them as optional drivers in the first place.

I think it should be easy for a user to get a QGIS installation that supports most formats. I don't know what options there are (like an --install-recommends option, a gdal-extra package, or ...).

SrNetoChan commented 7 months ago

I have been wanting to move my blog post about qgis with conda to the qgis.org installation webpage. We can list the available drivers in it.

gillins commented 7 months ago

The original discussion about why it was an optional driver for GDAL is here: https://github.com/conda-forge/gdal-feedstock/issues/628

SrNetoChan commented 7 months ago

@gillins am I correct that adding this permanently could also mean that the qgis build has to wait for the arrow-parquet lib to be compatible before we can bump the gdal version when needed?

SrNetoChan commented 7 months ago

I mean, qgis-feedstock gets new "dependencies".

gillins commented 7 months ago

Sorry, my timezone isn't the best for chatting with you guys.... I'm tempted to take this out; as @SrNetoChan points out, this is another dependency to rebuild. And the whole point of the GDAL plugin decision was to ensure the user only had to install "heavy" dependencies if they actually needed them. I know qgis is already pretty heavy, but those of us on slow internet (like me!) appreciate it not getting any larger if possible. It would be a different story if having this available at build time changed how qgis was built, but AFAICT this isn't the case. An alternative suggestion would be to add a message at install time (see https://docs.conda.io/projects/conda-build/en/stable/resources/link-scripts.html) with information about useful optional packages. Would this be ok? Some other packages already use this feature.
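For illustration only, here is a rough Python sketch of what such an install-time message could look like. The conda-build link-scripts page linked above says link scripts should append their output to `$PREFIX/.messages.txt`, which conda displays after linking. The package list and the function names `build_message` and `write_message` are hypothetical; only libgdal-arrow-parquet is named in this thread, and a real feedstock would more likely do this from `post-link.sh` / `post-link.bat`.

```python
import os
from pathlib import Path

# Placeholder list (assumption): only libgdal-arrow-parquet is named in this thread.
OPTIONAL_PACKAGES = {
    "libgdal-arrow-parquet": "GDAL drivers for (Geo)Parquet and Arrow files",
}


def build_message(packages):
    """Format a short hint listing optional packages and what they are for."""
    lines = ["Optional packages that may be useful with QGIS:"]
    for name, purpose in sorted(packages.items()):
        lines.append(f"  conda install {name}  # {purpose}")
    return "\n".join(lines) + "\n"


def write_message(prefix, packages=OPTIONAL_PACKAGES):
    """Append the hint to <prefix>/.messages.txt and return that path."""
    target = Path(prefix) / ".messages.txt"
    with target.open("a", encoding="utf-8") as fh:
        fh.write(build_message(packages))
    return target


if __name__ == "__main__":
    # conda sets PREFIX while running link scripts; without it, just preview the text.
    prefix = os.environ.get("PREFIX")
    if prefix:
        write_message(prefix)
    else:
        print(build_message(OPTIONAL_PACKAGES), end="")
```

Keeping the list in one dictionary would make it easy to extend once the wording for other optional packages is agreed.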

SrNetoChan commented 7 months ago

@gillins @m-kuhn sounds like a good compromise. We can even have a larger list of libraries and packages that are commonly installed with qgis (in osgeo4w for example).

gillins commented 7 months ago

Do you want to send me some wording with a list of optional packages and their uses? I can do it as part of #387.

SrNetoChan commented 7 months ago

Uhmm... I will have to study it.

On Tue Nov 21, 2023, 12:23 AM GMT, Sam Gillingham wrote:

> Do you want to send me some wording with a list of optional packages and their uses? I can do it as part of #387 (https://github.com/conda-forge/qgis-feedstock/pull/387).

mtravis commented 7 months ago

Thanks for looking into this. I'm not that clued up on how Conda packages are built, but if GDAL 3.5 ships with (geo)parquet / arrow, and so do recent versions of QGIS as a result, then this isn't just another optional package, is it?

Also, more and more datasets are starting to be produced as geoparquet, and other software is able to open it, so I feel like this is a bit of a blocker for QGIS users on Mac and Linux.

Worth noting that there is also funding available to help get this fixed if needed.

gillins commented 7 months ago

Conda GDAL doesn't ship with parquet (see https://github.com/conda-forge/gdal-feedstock/issues/628), it's an optional plugin the user has to install if they want it (conda install libgdal-arrow-parquet). They felt it was too large a dependency to force on everyone who uses GDAL. Anyone who needs parquet will have to install the plugin manually, but this shouldn't be too hard for conda users (noting that it is rare for anyone to just need QGIS without other packages for their scripting needs)....
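As a quick way to see whether the plugin is present in a given environment, something like the following sketch could be used. It assumes the GDAL Python bindings (`osgeo`) are installed and that the plugin registers drivers with the short names "Parquet" and "Arrow"; the helper names `list_gdal_drivers` and `missing_drivers` are made up for this example.

```python
def list_gdal_drivers():
    """Return the short names of the GDAL/OGR drivers registered in this environment."""
    try:
        from osgeo import gdal  # only importable if GDAL's Python bindings are installed
    except ImportError:
        return []
    return [gdal.GetDriver(i).ShortName for i in range(gdal.GetDriverCount())]


def missing_drivers(wanted, available):
    """Return the wanted driver names that are not in `available`, ignoring case."""
    have = {name.lower() for name in available}
    return [name for name in wanted if name.lower() not in have]


if __name__ == "__main__":
    # Driver names are an assumption; adjust them if your GDAL build differs.
    missing = missing_drivers(["Parquet", "Arrow"], list_gdal_drivers())
    if missing:
        print("Missing drivers:", ", ".join(missing))
        print("Try: conda install libgdal-arrow-parquet")
    else:
        print("Parquet and Arrow drivers are available.")
```

If the script reports missing drivers, installing the plugin into the same environment as QGIS and restarting QGIS should be enough for GDAL to pick it up, since nothing about the QGIS build itself changes.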

I kind of feel that this situation is the same for any package that uses GDAL, and maybe this conversation should move to that feedstock.

However, I would point out that if users want to install QGIS with all the bells and whistles without having to think too much, there are installers from qgis.org that fill that need. QGIS on conda is more for people who are comfortable with identifying the packages they need and installing them. I would say most conda users are Python programmers too, so they will need a variety of packages to make their scripts run.

Which brings me to another point. Just because some users may need, for example scipy for their scripting, should we install that too with QGIS "just in case"? Where do you stop? @SrNetoChan is putting together a list of useful optional packages that "may work well with QGIS" for us to display on installation, hopefully this may assist the novice user.

I feel we should be taking the lead from the GDAL feedstock maintainers on this. But happy to hear what everybody thinks.

Personally I have zero data in parquet format and definitely wouldn't appreciate the extra download on my slow metered internet each time I install QGIS, but perhaps I'm in the minority here....

mtravis commented 7 months ago

@gillins thanks for the detailed explanation

I believe there is a fundamental difference here, in that Geoparquet is a stable data format recognised by the OGC. The need for this package is therefore very different to that of scipy, as it's not for scripting but for loading cloud-native data, such as the Overture data that is now being released as geoparquet.

However, I do think this needs to go upstream, and I think it would make sense for the issue to be dealt with at the GDAL level. I will post something there.