Open tfardet opened 3 weeks ago
This is purely an OGR OSM driver topic. You can tune its configuration: see https://gdal.org/drivers/vector/osm.html#configuration
OK, if I understand correctly, I would need to add "building:levels" to the "attributes" in the osmconf.ini file, right?
However, that means manual changes to a config file, meaning that I cannot rely on that for code that will be distributed to others. Is there a way to achieve this GDAL configuration programmatically via pyogrio or some other python library?
> I would need to add "building:levels" to the "attributes" in the osmconf.ini file, right?
yes
I'm not super familiar with pyogrio, but looking at https://github.com/geopandas/pyogrio/blob/af292e579572a6a33a22a6403873b8e7b0a9d7f6/docs/source/known_issues.md?plain=1#L94 and given that the config file can be passed as an open option (https://gdal.org/drivers/vector/osm.html#open-options), I assume you could do something like df = read_dataframe(path, CONFIG_FILE="/path/to/your/osmconf.ini"), with your code creating a temporary file.
That could be a "/vsimem/" in-memory file (https://gdal.org/user/virtual_file_systems.html#vsimem-in-memory-files)
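If you use GDAL Python bindings, then you can create it with something like:

    from osgeo import gdal
    from pyogrio import read_dataframe

    # Write the configuration to an in-memory /vsimem/ file.
    # Note: this assumes osgeo.gdal and pyogrio share the same GDAL library.
    f = gdal.VSIFOpenL("/vsimem/osmconf.ini", "wb")
    data = b"put here content of osmconf.ini"
    gdal.VSIFWriteL(data, 1, len(data), f)
    gdal.VSIFCloseL(f)
    # Point the OSM driver at the in-memory file, then free it.
    df = read_dataframe(path, CONFIG_FILE="/vsimem/osmconf.ini")
    gdal.Unlink("/vsimem/osmconf.ini")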
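If the GDAL Python bindings are not installed, a regular temporary file should work as the CONFIG_FILE target too. The following is only a sketch using the standard library: `temporary_osmconf` is a hypothetical helper name, not something pyogrio or GDAL provides. It closes the file before handing out the path, so the file can be reopened on platforms that refuse to open a temporary file twice.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def temporary_osmconf(content: str):
    """Write `content` to a temporary .ini file and yield its path.

    Hypothetical helper for illustration; not part of pyogrio or GDAL.
    """
    # delete=False lets us close the file before another reader opens it;
    # the file is removed explicitly in the `finally` clause below.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".ini", delete=False, encoding="utf-8"
    ) as f:
        f.write(content)
        tmp_path = f.name
    try:
        yield tmp_path
    finally:
        os.unlink(tmp_path)


# Intended usage (assumes pyogrio is installed and `path` points to a .osm.pbf):
#     with temporary_osmconf(ini_text) as conf_path:
#         df = read_dataframe(path, CONFIG_FILE=conf_path)
```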
Thanks, I'll check whether this config file argument works with a custom ini file
I ran into this exact issue today! I didn't try the vsimem approach, but was able to use a regular temporary file.
I wanted to modify a copy of the system ini file programmatically rather than editing it manually. Python's configparser won't eat an ini file without a top-level header, so I had to add one pre-read then remove it post-write.
    import configparser
    import io
    import tempfile

    import geopandas as gpd

    with tempfile.NamedTemporaryFile("w", suffix=".ini") as f_tmp:
        # Prefix the file with a toplevel header, then pass to configparser.
        dummy_header = "[dummy_toplevel_header]\n"
        config = configparser.ConfigParser()
        with open("/usr/share/gdal/osmconf.ini") as f_config:
            stream = io.StringIO(dummy_header + f_config.read())
        config.read_file(stream)

        # Set the config for the building layer: add levels and related tags, remove other_tags.
        config["multipolygons"]["attributes"] = "name,building:levels,levels,height,min_height,max_height"
        config["multipolygons"]["other_tags"] = "no"

        # Write to temp file.
        config.write(f_tmp, space_around_delimiters=False)
        f_tmp.flush()

        # Remove the first line dummy header.
        with open(f_tmp.name, "r") as f:
            lines = f.readlines()
        with open(f_tmp.name, "w") as f:
            assert lines[0] == dummy_header
            f.writelines(lines[1:])

        # Now you can read in the file.
        gdf = gpd.read_file(osm_pbf_path, layer="multipolygons", CONFIG_FILE=f_tmp.name)
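The same header trick can also be done entirely on strings, which avoids reading the temporary file back just to strip the first line. This is a sketch rather than a drop-in replacement: `set_layer_attributes` and `DUMMY_HEADER` are hypothetical names, and as with the snippet above, configparser drops comments from the original file when it writes the result.

```python
import configparser
import io

DUMMY_HEADER = "[dummy_toplevel_header]"


def set_layer_attributes(ini_text: str, layer: str, attributes: list[str]) -> str:
    """Return a copy of `ini_text` with the `attributes` key of `layer` replaced.

    Hypothetical helper for illustration; not part of GDAL or pyogrio.
    """
    # interpolation=None keeps any literal "%" in the values untouched.
    config = configparser.ConfigParser(interpolation=None)
    # Prepend a section header so configparser accepts the top-level keys.
    config.read_string(DUMMY_HEADER + "\n" + ini_text)
    config[layer]["attributes"] = ",".join(attributes)

    buffer = io.StringIO()
    config.write(buffer, space_around_delimiters=False)
    lines = buffer.getvalue().splitlines(keepends=True)
    # The dummy section is the first one written; drop its header line again.
    assert lines[0].strip() == DUMMY_HEADER
    return "".join(lines[1:])
```

The returned text can then be written to a temporary file (or a /vsimem/ file) and passed as CONFIG_FILE.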
> Python's configparser won't eat an ini file without a top-level header, so I had to add one pre-read then remove it post-write.
will be fixed per https://github.com/OSGeo/gdal/pull/10293
At the moment, not all tags are imported from an osm.pbf file with read_dataframe. In particular, I'm interested in "building:levels", which currently ends up in "other_tags" rather than getting its own column (maybe because not all lines have this entry?).

Is there a way to tell pyogrio to include this tag as an independent column? I could extract it from the data in the "other_tags" column (e.g. "building:levels"=>"1"), but this is probably going to be much slower than if it's done directly on import.

EDIT: in case anyone is interested in a workaround in the meantime
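Separately from the edit above (this is not the workaround from the original post), the slower fallback of extracting the tag from "other_tags" after import might look like the sketch below. It assumes the column holds comma-separated `"key"=>"value"` pairs with backslash-escaped quotes, as in the example given in the question; `parse_other_tags` is a hypothetical helper name.

```python
import re

# One "key"=>"value" pair; keys and values may contain backslash escapes.
_PAIR = re.compile(r'"((?:[^"\\]|\\.)*)"=>"((?:[^"\\]|\\.)*)"')


def _unescape(text):
    """Turn backslash-escaped characters (e.g. \\") back into plain ones."""
    return re.sub(r"\\(.)", r"\1", text)


def parse_other_tags(other_tags):
    """Parse '"building:levels"=>"1","roof:shape"=>"flat"' into a dict.

    Non-string input (None, NaN) yields an empty dict.
    """
    if not isinstance(other_tags, str):
        return {}
    return {
        _unescape(key): _unescape(value)
        for key, value in _PAIR.findall(other_tags)
    }


# Intended usage on a (Geo)DataFrame `df` that has an "other_tags" column:
#     df["building:levels"] = df["other_tags"].map(
#         lambda s: parse_other_tags(s).get("building:levels")
#     )
```

This runs one regex pass per row in Python, so on large extracts it is likely slower than letting the OSM driver create the column at import time.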