Open wheeled opened 4 years ago
A working minimal example would be very helpful.
I just noticed the KML driver is not registered in the supported list of drivers: https://github.com/Toblerity/Fiona/blob/1.8.13.post1/fiona/drvsupport.py#L80
This does not necessarily mean that this driver does not work with Fiona. However, the KML data model is not entirely compatible with the GDAL/OGR data model. This is also mentioned in GDAL's documentation: https://gdal.org/drivers/vector/kml.html As Fiona assumes that a full OGR data model can be used, such drivers are typically excluded by Fiona.
It could be that the driver is sensible to the order in which the fields are written (I'm just guessing). The driver has the following creation options:
Creation Options
The following dataset creation options are supported:
NameField: Allows you to specify the field to use for the KML <name> element. Default value : ‘Name’
DescriptionField: Allows you to specify the field to use for the KML <description> element. Default value : ‘Description’
AltitudeMode: Allows you to specify the AltitudeMode to use for KML geometries. This will only affect 3D geometries and must be one of the valid KML options. See the relevant KML reference material for further information.
These options can be passed to fiona.open
. E.g.
with fiona.open(path, mode='w', NameField='the name field name', DescriptionField='the description field name' ... ) as c:
Here is a simple script that can illustrate the problem. The data file workouts.gpkg
is attached (will need to be unzipped).
import fiona
import geopandas as gpd
gdf = gpd.read_file("workouts.gpkg", driver="GPKG", layer="bike_ride")
gpd.io.file.fiona.drvsupport.supported_drivers['KML'] = 'rw'
with fiona.Env():
gdf.to_file("bike_ride_fiona.kml", driver='KML', NameField='name')
I had found the KML driver notes you reference above, but the addition of the NameField
kwarg didn't make any difference.
My suspicion would be that the OGRFeatureBuilder
in ogrext.pyx
is for some reason not constructing the feature for the KML driver quite right. If I construct the feature directly in ogr it looks like so (when exported to JSON):
{
"type": "Feature",
"geometry": {
"type": "LineString",
"coordinates": [[151.148538533, -33.6983890381], [151.148168283, -33.6982220137], [151.146898883, -33.6979895788], [151.146154005, -33.6978891607]]
},
"properties": {
"Name": "Rode 9.91 km on 9/04/2016", "Description": None, "id": "1419923559", "date": "04/09/2016", "distance": "9.911885322240002", "name": "Rode 9.91 km on 9/04/2016", "speed": "4.644744762061856", "time": "2134", "activity_name": "Bike Ride"
}
}
Note that ogr has added the Name
and Description
fields and correctly mapped the name
field to Name
.
I don't have a C++ environment or I'd be happy to dig deeper. Thanks for your assistance.
@wheeled For the above-mentioned reasons, I hope you understand that Fiona does not support KML. However, it is still strange that data can be written without error, but just the order of the fields is wrong.
It looks like the order of the fields in the schema is relevant. The following code works:
import os
from collections import OrderedDict
import fiona
import logging
import sys
from fiona.crs import from_epsg
from shapely.geometry import shape
with fiona.open('workouts.gpkg') as c:
for f in c:
print(shape(f['geometry']))
logging.basicConfig(stream=sys.stderr, level=logging.DEBUG)
print(fiona.__version__)
print(fiona.get_gdal_release_name())
fiona.drvsupport.supported_drivers['KML'] = 'rw'
fname = 'test.kml'
if os.path.exists(fname):
os.remove(fname)
schema = {'properties': OrderedDict([
('Name', 'str'),
('Description', 'str'),
('id', 'str'),
('distance', 'float'),
('speed', 'float'),
('time', 'int'),
('activity_name', 'str'),
('Description', 'str'),
('date', 'str'), ]),
'geometry': 'LineString'}
with fiona.open("test.kml",
driver='KML',
layer="walk",
mode='w',
schema=schema,
crs=from_epsg(4326)) as c:
f = {
'geometry': {'type': 'LineString',
'coordinates': [(151.148538533, -33.6983890381), (151.148168283, -33.6982220137),
(151.146898883, -33.6979895788), (151.146154005, -33.6978891607)]},
'properties': OrderedDict([
('Name', 'Walked 3.97 km on 22/1/17'),
('Description', 'test'),
('distance', 3.96978493824),
('id', '1965703340'),
('speed', 1.397811597971831),
('time', 2840),
('date', '01/22/2017'),
('activity_name', 'Walk')
])
}
c.write(f)
with open(fname) as f:
print(f.read())
whereas if one of the ExtendData fields is in front of Name
and Description
in schema, the resulting order is wrong.
I found the issue: https://github.com/Toblerity/Fiona/blob/master/fiona/ogrext.pyx#L1119-L1122 Here it is assumed that the ordering of the schema retrieved by gdal is the same as originally specified. However, the KML driver seems to use internally a specific ordering. This code was introduced due to https://github.com/Toblerity/Fiona/issues/105 to handle normalized field names.
The real issue is, that the KML driver automatically adds fields: https://github.com/OSGeo/gdal/blob/master/gdal/ogr/ogrsf_frmts/kml/ogrkmllayer.cpp#L116-L120
https://github.com/Toblerity/Fiona/blob/master/fiona/ogrext.pyx#L1119-L1122 assumes that only the fields in the schema provided to Fiona exist. The same issue is also present for the DXF, GPX, GPSTrackMacker and DGN driver.
Tested by adding:
diff --git a/fiona/ogrext.pyx b/fiona/ogrext.pyx
index a0b2a6c..c8163ca 100644
--- a/fiona/ogrext.pyx
+++ b/fiona/ogrext.pyx
@@ -1083,6 +1083,12 @@ cdef class WritingSession(Session):
encoding = self._get_internal_encoding()
+ _schema = self.get_schema()
+ log.debug("before schema: {}".format(_schema))
+
+ if len(_schema['properties']) > 0:
+ raise ValueError("Driver {} has non empty schema: {}".format(self.collection.driver, _schema))
+
for key, value in collection.schema['properties'].items():
log.debug("Begin creating field: %r value: %r", key, value)
So https://github.com/Toblerity/Fiona/blob/master/fiona/ogrext.pyx#L1119-L1122 explains how the fields get mixed up. If I simulate the behaviour of the dict(zip())
it produces a mapping that agrees with the result:
coll_schema = ["id", "date", "distance", "name", "speed", "time", "activity_name"]
ogr_schema = ["Name", "Description", "id", "date", "distance", "name", "speed", "time", "activity_name"]
schema_mapping = dict(zip(coll_schema, ogr_schema))
print(schema_mapping)
{'id': 'Name', 'date': 'Description', 'distance': 'id', 'name': 'date', 'speed': 'distance', 'time': 'name', 'activity_name': 'speed'}
[note that this process remaps time
to name
and then later on the ogr driver remaps name
to Name
]
I understand that you are not necessarily keen to support KML, but if this issue also affects some other drivers would it be possible to classify it as a bug?
I imagine the fix would be to check that the collection.schema['properties'].keys()
and ogr_schema['properties'].keys()
were the same length prior to zipping them. If all of the offending ogr drivers add their special fields at the beginning of the schema, this might work as a fix:
ogr_schema = self.get_schema()['properties'].keys()
coll_schema = collection.schema['properties'].keys()
offset = len(ogr_schema) - len(coll_schema)
self._schema_mapping = dict(zip(
coll_schema,
ogr_schema[offset:] ))
This assumes that the driver will look after its own Name
and Description
fields.
I apologise again that I am not set up to be able to test it.
Expected behavior and actual behavior.
Attempting to save a GeoPandas dataframe using:
produces a KML file in which the value for
<name>
is taken from another field, and likewise<description>
. If the file is written the hard way (usingogr
) it is correct. Perhaps there are some kwargs that need to be passed along to correct this? In the example aboveNameField='name'
was an unsuccessful attempt.Correct KML using
ogr
:Incorrect KML using
Fiona
(see Placemark name and description):Steps to reproduce the problem.
I don't think there was anything special about the GeoDataFrame, but can provide a script and sample data if the answer is not trivial.
Operating system
Mac OS X 10.15.4
Fiona and GDAL version and provenance
Fiona 1.8.13.post1 loaded by PyCharm GDAL 2.4.0 loaded by PyCharm