geopandas / pyogrio

Vectorized vector I/O using OGR
https://pyogrio.readthedocs.io
MIT License

Reading a VRT containing `OGRVRTWarpedLayer` does not always perform reprojection with `use_arrow=True` #501

Closed program-- closed 6 hours ago

program-- commented 8 hours ago

I'm not sure whether this is a GDAL issue, a pyogrio issue, or just user error. I couldn't find an existing issue here or in GDAL's repo, though it's very likely I just missed it 😅. In every case below, the CRS metadata is correct, but the geometry itself may not be reprojected.

Interestingly, providing a SQL query does produce the correct result when using Arrow.
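
The symptom described above (metadata reports a projected CRS while the coordinates are still in degrees) can be detected with a simple bounds heuristic. The following is a minimal sketch, not part of pyogrio; the helper names `looks_like_degrees` and `reprojection_was_skipped` are hypothetical, and the check can give a false positive for a small projected dataset that sits within a couple of hundred units of the origin.

```python
def looks_like_degrees(bounds):
    """Return True if (minx, miny, maxx, maxy) fits inside the lon/lat range."""
    minx, miny, maxx, maxy = bounds
    return -180 <= minx <= maxx <= 180 and -90 <= miny <= maxy <= 90


def reprojection_was_skipped(crs_is_projected, bounds):
    """Flag data whose CRS is projected but whose coordinates still look like degrees."""
    return crs_is_projected and looks_like_degrees(bounds)


# Sample values: the first vertex of Fiji as printed in the outputs of this issue.
unprojected = (180, -16.067, 180, -16.067)
projected = (20037508.343, -1812498.413, 20037508.343, -1812498.413)

print(reprojection_was_skipped(True, unprojected))  # True
print(reprojection_was_skipped(True, projected))    # False
```

With a GeoDataFrame `gdf`, the two inputs would typically come from `gdf.crs.is_projected` and `gdf.total_bounds`.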

Example VRT using naturalearth_lowres placed in pyogrio/tests/fixtures/:

```xml
<OGRVRTDataSource>
    <OGRVRTWarpedLayer>
        <OGRVRTLayer name="naturalearth_lowres">
            <SrcDataSource relativeToVRT="true">naturalearth_lowres</SrcDataSource>
        </OGRVRTLayer>
        <TargetSRS>EPSG:3857</TargetSRS>
    </OGRVRTWarpedLayer>
</OGRVRTDataSource>
```
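
For anyone who wants to reproduce this with a different layer or target CRS, the same VRT can be generated from Python. This is only a sketch using the standard library; the function name `build_warped_vrt` is hypothetical, and the element names are copied from the VRT above.

```python
import xml.etree.ElementTree as ET


def build_warped_vrt(layer_name, src_path, target_srs):
    """Build an OGR VRT string that wraps one layer in an OGRVRTWarpedLayer."""
    root = ET.Element("OGRVRTDataSource")
    warped = ET.SubElement(root, "OGRVRTWarpedLayer")
    layer = ET.SubElement(warped, "OGRVRTLayer", name=layer_name)
    src = ET.SubElement(layer, "SrcDataSource", relativeToVRT="true")
    src.text = src_path
    # TargetSRS is a child of the warped layer, after the wrapped source layer.
    ET.SubElement(warped, "TargetSRS").text = target_srs
    return ET.tostring(root, encoding="unicode")


vrt_xml = build_warped_vrt("naturalearth_lowres", "naturalearth_lowres", "EPSG:3857")
print(vrt_xml)
```

Writing `vrt_xml` to `naturalearth_lowres.vrt` next to the source data should give a file equivalent to the one above, apart from whitespace.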

I am using pyogrio 0.10.0 with GDAL 3.10.0.

Examples

:heavy_check_mark: pyogrio.read_dataframe(use_arrow=False, sql=None)

```python
import pyogrio

pyogrio.read_dataframe("naturalearth_lowres.vrt", use_arrow=False)
#>        pop_est      continent                      name  iso_a3  gdp_md_est  geometry
#> 0       920938        Oceania                      Fiji     FJI      8374.0  MULTIPOLYGON (((20037508.343 -1812498.413, 200...
#> 1     53950935         Africa                  Tanzania     TZA    150600.0  POLYGON ((3774143.866 -105758.362, 3792946.708...
#> 2       603253         Africa                 W. Sahara     ESH       906.5  POLYGON ((-964649.018 3205725.605, -964597.245...
#> 3     35623680  North America                    Canada     CAN   1674000.0  MULTIPOLYGON (((-13674486.249 6274861.394, -13...
#> 4    326625791  North America  United States of America     USA  18560000.0  MULTIPOLYGON (((-13674486.249 6274861.394, -13...
#> ..         ...            ...                       ...     ...         ...  ...
#> 172    7111024         Europe                    Serbia     SRB    101800.0  POLYGON ((2096126.508 5765757.958, 2096127.988...
#> 173     642550         Europe                Montenegro     MNE     10610.0  POLYGON ((2234260.104 5249565.284, 2204305.52 ...
#> 174    1895250         Europe                    Kosovo     -99     18490.0  POLYGON ((2292095.761 5139344.949, 2284604.344...
#> 175    1218208  North America       Trinidad and Tobago     TTO     43570.0  POLYGON ((-6866186.192 1204901.071, -6802177.4...
#> 176   13026129         Africa                  S. Sudan     SSD     20880.0  POLYGON ((3432408.751 390883.649, 3334408.389 ...
#>
#> [177 rows x 6 columns]
```

:x: pyogrio.read_dataframe(use_arrow=True, sql=None)

```python
import pyogrio

pyogrio.read_dataframe("naturalearth_lowres.vrt", use_arrow=True)
#>        pop_est      continent                      name  iso_a3  gdp_md_est  geometry
#> 0       920938        Oceania                      Fiji     FJI      8374.0  MULTIPOLYGON (((180 -16.067, 180 -16.555, 179....
#> 1     53950935         Africa                  Tanzania     TZA    150600.0  POLYGON ((33.904 -0.95, 34.073 -1.06, 37.699 -...
#> 2       603253         Africa                 W. Sahara     ESH       906.5  POLYGON ((-8.666 27.656, -8.665 27.589, -8.684...
#> 3     35623680  North America                    Canada     CAN   1674000.0  MULTIPOLYGON (((-122.84 49, -122.974 49.003, -...
#> 4    326625791  North America  United States of America     USA  18560000.0  MULTIPOLYGON (((-122.84 49, -120 49, -117.031 ...
#> ..         ...            ...                       ...     ...         ...  ...
#> 172    7111024         Europe                    Serbia     SRB    101800.0  POLYGON ((18.83 45.909, 18.83 45.909, 19.596 4...
#> 173     642550         Europe                Montenegro     MNE     10610.0  POLYGON ((20.071 42.589, 19.802 42.5, 19.738 4...
#> 174    1895250         Europe                    Kosovo     -99     18490.0  POLYGON ((20.59 41.855, 20.523 42.218, 20.284 ...
#> 175    1218208  North America       Trinidad and Tobago     TTO     43570.0  POLYGON ((-61.68 10.76, -61.105 10.89, -60.895...
#> 176   13026129         Africa                  S. Sudan     SSD     20880.0  POLYGON ((30.834 3.509, 29.954 4.174, 29.716 4...
#>
#> [177 rows x 6 columns]
```

:heavy_check_mark: pyogrio.read_dataframe(use_arrow=True, sql="...")

```python
import pyogrio

pyogrio.read_dataframe("naturalearth_lowres.vrt", use_arrow=True, sql="SELECT * FROM naturalearth_lowres")
#>        pop_est      continent                      name  iso_a3  gdp_md_est  geometry
#> 0       920938        Oceania                      Fiji     FJI      8374.0  MULTIPOLYGON (((20037508.343 -1812498.413, 200...
#> 1     53950935         Africa                  Tanzania     TZA    150600.0  POLYGON ((3774143.866 -105758.362, 3792946.708...
#> 2       603253         Africa                 W. Sahara     ESH       906.5  POLYGON ((-964649.018 3205725.605, -964597.245...
#> 3     35623680  North America                    Canada     CAN   1674000.0  MULTIPOLYGON (((-13674486.249 6274861.394, -13...
#> 4    326625791  North America  United States of America     USA  18560000.0  MULTIPOLYGON (((-13674486.249 6274861.394, -13...
#> ..         ...            ...                       ...     ...         ...  ...
#> 172    7111024         Europe                    Serbia     SRB    101800.0  POLYGON ((2096126.508 5765757.958, 2096127.988...
#> 173     642550         Europe                Montenegro     MNE     10610.0  POLYGON ((2234260.104 5249565.284, 2204305.52 ...
#> 174    1895250         Europe                    Kosovo     -99     18490.0  POLYGON ((2292095.761 5139344.949, 2284604.344...
#> 175    1218208  North America       Trinidad and Tobago     TTO     43570.0  POLYGON ((-6866186.192 1204901.071, -6802177.4...
#> 176   13026129         Africa                  S. Sudan     SSD     20880.0  POLYGON ((3432408.751 390883.649, 3334408.389 ...
#>
#> [177 rows x 6 columns]
```

:x: pyogrio.open_arrow(sql=None)

```python
import geopandas as gpd
import pyogrio

with pyogrio.open_arrow("naturalearth_lowres.vrt", use_pyarrow=True) as (meta, reader):
    print(f"crs = {meta['crs']}")
    batch = reader.read_next_batch()
    print(gpd.GeoDataFrame.from_arrow(batch))
#> crs = EPSG:3857
#>        pop_est      continent                      name  iso_a3  gdp_md_est  wkb_geometry
#> 0       920938        Oceania                      Fiji     FJI      8374.0  MULTIPOLYGON (((180 -16.06713, 180 -16.55522, ...
#> 1     53950935         Africa                  Tanzania     TZA    150600.0  POLYGON ((33.90371 -0.95, 34.07262 -1.05982, 3...
#> 2       603253         Africa                 W. Sahara     ESH       906.5  POLYGON ((-8.66559 27.65643, -8.66512 27.58948...
#> 3     35623680  North America                    Canada     CAN   1674000.0  MULTIPOLYGON (((-122.84 49, -122.97421 49.0025...
#> 4    326625791  North America  United States of America     USA  18560000.0  MULTIPOLYGON (((-122.84 49, -120 49, -117.0312...
#> ..         ...            ...                       ...     ...         ...  ...
#> 172    7111024         Europe                    Serbia     SRB    101800.0  POLYGON ((18.82982 45.90887, 18.82984 45.90888...
#> 173     642550         Europe                Montenegro     MNE     10610.0  POLYGON ((20.0707 42.58863, 19.80161 42.50009,...
#> 174    1895250         Europe                    Kosovo     -99     18490.0  POLYGON ((20.59025 41.85541, 20.52295 42.21787...
#> 175    1218208  North America       Trinidad and Tobago     TTO     43570.0  POLYGON ((-61.68 10.76, -61.105 10.89, -60.895...
#> 176   13026129         Africa                  S. Sudan     SSD     20880.0  POLYGON ((30.83385 3.50917, 29.9535 4.1737, 29...
#>
#> [177 rows x 6 columns]
```

:heavy_check_mark: pyogrio.open_arrow(sql="...")

```python
import geopandas as gpd
import pyogrio

with pyogrio.open_arrow("naturalearth_lowres.vrt", sql="SELECT * FROM naturalearth_lowres", use_pyarrow=True) as (meta, reader):
    print(f"crs = {meta['crs']}")
    batch = reader.read_next_batch()
    print(gpd.GeoDataFrame.from_arrow(batch))
#> crs = EPSG:3857
#>        pop_est      continent                      name  iso_a3  gdp_md_est  _ogr_geometry_
#> 0       920938        Oceania                      Fiji     FJI      8374.0  MULTIPOLYGON (((20037508.343 -1812498.413, 200...
#> 1     53950935         Africa                  Tanzania     TZA    150600.0  POLYGON ((3774143.866 -105758.362, 3792946.708...
#> 2       603253         Africa                 W. Sahara     ESH       906.5  POLYGON ((-964649.018 3205725.605, -964597.245...
#> 3     35623680  North America                    Canada     CAN   1674000.0  MULTIPOLYGON (((-13674486.249 6274861.394, -13...
#> 4    326625791  North America  United States of America     USA  18560000.0  MULTIPOLYGON (((-13674486.249 6274861.394, -13...
#> ..         ...            ...                       ...     ...         ...  ...
#> 172    7111024         Europe                    Serbia     SRB    101800.0  POLYGON ((2096126.508 5765757.958, 2096127.988...
#> 173     642550         Europe                Montenegro     MNE     10610.0  POLYGON ((2234260.104 5249565.284, 2204305.52 ...
#> 174    1895250         Europe                    Kosovo     -99     18490.0  POLYGON ((2292095.761 5139344.949, 2284604.344...
#> 175    1218208  North America       Trinidad and Tobago     TTO     43570.0  POLYGON ((-6866186.192 1204901.071, -6802177.4...
#> 176   13026129         Africa                  S. Sudan     SSD     20880.0  POLYGON ((3432408.751 390883.649, 3334408.389 ...
#>
#> [177 rows x 6 columns]
```

theroggy commented 6 hours ago

This has recently been fixed in GDAL: https://github.com/OSGeo/gdal/pull/11293

The fix will be included in GDAL 3.10.1 when it is released...
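
Until that release is available, the SQL workaround reported earlier in this thread can be applied conditionally. The sketch below is an assumption-laden illustration: `read_kwargs` and `FIXED_IN` are hypothetical names, and it treats every GDAL version below 3.10.1 as affected, although this thread only confirms the behaviour on 3.10.0.

```python
# Version expected to contain the fix, per the comment above (assumes it ships as announced).
FIXED_IN = (3, 10, 1)


def read_kwargs(gdal_version, layer_name):
    """Pick read_dataframe keyword arguments that avoid the unprojected Arrow path."""
    if tuple(gdal_version) >= FIXED_IN:
        return {"use_arrow": True}
    # Workaround from this issue: an explicit SQL query yields reprojected geometry.
    return {"use_arrow": True, "sql": f"SELECT * FROM {layer_name}"}


print(read_kwargs((3, 10, 0), "naturalearth_lowres"))
print(read_kwargs((3, 10, 1), "naturalearth_lowres"))
```

Assuming `pyogrio.__gdal_version__` gives the GDAL version as a tuple, the result could be passed along as `pyogrio.read_dataframe("naturalearth_lowres.vrt", **read_kwargs(pyogrio.__gdal_version__, "naturalearth_lowres"))`.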

program-- commented 6 hours ago

Ah, not sure how I missed that 🤦🏾. Thank you!