geopandas / pyogrio

Vectorized vector I/O using OGR
https://pyogrio.readthedocs.io
MIT License

ENH: support writing to in-memory (byte) objects #249

Closed jorisvandenbossche closed 5 months ago

jorisvandenbossche commented 1 year ago

We support reading from an in-memory buffer / bytes object (https://github.com/geopandas/pyogrio/issues/22), but not yet writing to it.

To start with, the write path assumes the path is a string (or converts it to a string in several places), so passing a BytesIO object doesn't work. Currently we actually create a file in the current directory with a name like "<_io.BytesIO object at 0x7f229d2a1a80>", because we call str(path).
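To illustrate the str(path) problem, here is a minimal sketch in plain Python (CPython assumed). `describe_target` is a hypothetical helper, not part of pyogrio; it only shows one way a file-like object could be told apart from a path before any string conversion happens.

```python
import io


def describe_target(path_or_buffer):
    """Hypothetical helper: classify a write target before any str() coercion."""
    # File-like objects expose a write() method; str and pathlib.Path do not.
    if hasattr(path_or_buffer, "write"):
        return "file-like"
    return "path"


buf = io.BytesIO()

# Coercing the buffer to a string yields its repr, which would then be
# used as a file name by code that expects a path.
name = str(buf)
print(name)

print(describe_target(buf))
print(describe_target("out.gpkg"))
```

A check like this would have to run before the path is normalized, otherwise the repr string is indistinguishable from an ordinary file name.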

If we want to support writing to a buffer, our current code for handling this on the read path (buffer_to_virtual_file to create a /vsimem/.. file) will not be sufficient: it creates a VSIMemFile that doesn't own the buffer's data, and thus can't expand the size of the buffer (which will be needed to write to an empty BytesIO).
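As a rough analogy in pure Python (this does not call GDAL and is not how VSIMemFile is implemented), a memoryview over a BytesIO behaves like a non-owning, fixed-size window: it can overwrite bytes in place but cannot grow the underlying storage, whereas the owning object can.

```python
import io

buf = io.BytesIO(b"abc")
view = buf.getbuffer()  # fixed-size view onto the existing bytes

# Overwriting with data of the same length works in place.
view[0:3] = b"xyz"

# Assigning data of a different length fails: the view cannot resize its target.
try:
    view[0:3] = b"too long"
    view_could_grow = True
except ValueError:
    view_could_grow = False

# Once the view is released, the owning BytesIO can grow as usual.
view.release()
buf.seek(0, io.SEEK_END)
buf.write(b"def")
result = buf.getvalue()
```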

From a quick look, two potential strategies:

brendan-ward commented 5 months ago

The second option looks reasonable based on the way this was implemented in rasterio, assuming that adding write functionality on top of the read functionality implemented there is relatively straightforward. We could start by re-implementing the read interface in pyogrio first (#42) and adjusting to that, then extend it to enable writing.

jorisvandenbossche commented 5 months ago

@brendan-ward in case this is useful, here is what I started writing (very much a draft): https://github.com/geopandas/pyogrio/compare/main...jorisvandenbossche:pyogrio:write-vsimem?expand=1 I haven't wired it into the writing logic yet, but the idea is to detect when we are passed a file-like object, pass a /vsimem/... path to GDAL in that case, and afterwards read the resulting buffer and write it into the file-like object (like the callback in fiona).
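The control flow described above could look roughly like the sketch below. It is not the code from the linked branch: a temporary file on disk stands in for the /vsimem/... path, and `write_via_temp_path` and `fake_driver_write` are hypothetical names used only for illustration.

```python
import io
import os
import tempfile


def write_via_temp_path(target, write_func):
    """Hypothetical wrapper: write_func(path) writes a dataset to a string path."""
    if not hasattr(target, "write"):
        # Ordinary path: hand it to the writer unchanged.
        write_func(os.fspath(target))
        return

    # File-like target: write to an intermediate location first
    # (a temp file here, standing in for a /vsimem/... path) ...
    with tempfile.TemporaryDirectory() as tmpdir:
        tmp_path = os.path.join(tmpdir, "dataset")
        write_func(tmp_path)
        # ... then copy the resulting bytes into the file-like object.
        with open(tmp_path, "rb") as f:
            target.write(f.read())


def fake_driver_write(path):
    # Stand-in for the real writer, which only accepts a string path.
    with open(path, "wb") as f:
        f.write(b"fake dataset bytes")


buf = io.BytesIO()
write_via_temp_path(buf, fake_driver_write)
```

The writer itself never sees the file-like object, so the existing string-based write path would not need to change; only the wrapper knows about the buffer.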

brendan-ward commented 5 months ago

@jorisvandenbossche thanks for the start! I have some ideas on how to proceed now and will work on drafting a PR for this shortly.