geopandas / pyogrio

Vectorized vector I/O using OGR
https://pyogrio.readthedocs.io
MIT License
259 stars 22 forks source link

Add support for force_2d with use_arrow=True in read_dataframe #300

Closed theroggy closed 8 months ago

theroggy commented 9 months ago

reference #297

Once #304 is merged, an extra use_arrow check can be activated or the other way around.

theroggy commented 9 months ago

Should we add a note to the docstring for read_dataframe stating that shapely is used to force 2D geometries after reading when use_arrow is True?

At first sight, I don't think it adds a lot of useful information to the user whether the conversion is done by shapely or by gdal?

I think you could move use of shapely down to read_arrow (i.e., make shapely a requirement to use force_2d in read_arrow) and conditionally attempt to import shapely if force_2d is True and raise an ImportError if it isn't present.

Yes, it indeed seems possible. Do we want to keep the interface of read_arrow stable or is it possible to change it? The geometry column returned is now a wkb, but when the force_2d logic moves there, the wkb conversion to shapely object needs to be done there as well before shapely.force_2d can be applied. Do I convert back to wkb there... to do the conversion again in geopandas.py... or do I return shapely objects in read_arrow or do I add a parameter to choose what is being returned?

jorisvandenbossche commented 9 months ago

I think we want to keep read_arrow independent from shapely, so this function can be used in cases you don't need GEOS objects (so IMO it should certainly not return shapely objects).

Given that, I assume it is easier to keep this handling on the read_dataframe side, where we do a conversion to shapely/GEOS objects anyway.

brendan-ward commented 9 months ago

Whoops, I forgot about the conversion from WKB and thought we were already using shapely objects in read_arrow (not sure why I thought this, poor memory plus not reading the code again). Let's leave that as is, and do this only in read_dataframe.

jorisvandenbossche commented 8 months ago

Thanks!