Toblerity / Fiona

Fiona reads and writes geographic data files
https://fiona.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
1.14k stars 201 forks source link

Fiona <-> OGR field mappings take 2 (refactor) #1366

Closed sgillies closed 3 months ago

sgillies commented 3 months ago

Follow up on #1358.

The project's original mappings predated OGR's sub-types (https://gdal.org/development/rfc/rfc50_ogr_field_subtype.html) and those were kinda bolted on. In this version we embrace them. We map (fieldtype, subtype) tuples to Fiona field types.

The extremely gross sections of if/else conditionals in the OGR and Fiona feature builders have been reduced to looking up a class in one of the maps and calling one of its methods. The increase in lines of code is largely explained by the new small classes and lack of normalization.

Original tests pass, save one, which had the wrong expected value.

Benchmarking: this script runs in 0.49 seconds with Fiona 1.9.6 and 0.44 seconds with the code in this PR. Measured by cProfile. So, a ~10% speedup for a 288 record dataset with features that have 60+ fields.

import fiona

with fiona.open(
    "ne_10m_admin_0_countries_lakes/ne_10m_admin_0_countries_lakes.shp"
) as src:
    with fiona.open("/tmp/test_perf.shp", "w", **src.profile) as dst:
        dst.writerecords(src)

This PR is about making the code more readable and easier to maintain. A small speedup is a nice bonus.

When reviewing, use the "split" view, it's much easier to read for these particular changes.

sgillies commented 3 months ago

@mwtoews @snorfalorpagus @perrygeo I'm tagging you for review of this refactor PR. Meanwhile I'm going to work on benchmarking.

sgillies commented 3 months ago

Last call for review! It's a lot to digest, I know, and if you're primarily using fiona via geopandas, maybe not interesting. Still, I appreciate a skim if you can.