NRCan / geo-deep-learning

Deep learning applied to georeferenced datasets
https://geo-deep-learning.readthedocs.io/en/latest/
MIT License
149 stars 49 forks source link

[bug] Tiling creates an invalid geometry inside vector patch (.geojson) #431

Closed remtav closed 1 year ago

remtav commented 1 year ago

The attached vector ground truth patch contains an invalid geomatry for building footprint. As it is not empty, the tiling script proceeds to computing its bounds, but this step fails and hinders the entire tiling process (see #430 ):

Stack trace:

Traceback (most recent call last):
  File "/fs/vnas_Hnrcan/geobase/ret000/geo-deep-learning/GDL.py", line 83, in <module>
    run_gdl()
  File "/space/partner/nrcan/geobase/work/opt/miniconda-gdl-ops/envs/geo_deep_env/lib/python3.10/site-packages/hydra/main.py", line 90, in decorated_main
    _run_hydra(
  File "/space/partner/nrcan/geobase/work/opt/miniconda-gdl-ops/envs/geo_deep_env/lib/python3.10/site-packages/hydra/_internal/utils.py", line 389, in _run_hydra
    _run_app(
  File "/space/partner/nrcan/geobase/work/opt/miniconda-gdl-ops/envs/geo_deep_env/lib/python3.10/site-packages/hydra/_internal/utils.py", line 452, in _run_app
    run_and_report(
  File "/space/partner/nrcan/geobase/work/opt/miniconda-gdl-ops/envs/geo_deep_env/lib/python3.10/site-packages/hydra/_internal/utils.py", line 296, in run_and_report
    raise ex
  File "/space/partner/nrcan/geobase/work/opt/miniconda-gdl-ops/envs/geo_deep_env/lib/python3.10/site-packages/hydra/_internal/utils.py", line 213, in run_and_report
    return func()
  File "/space/partner/nrcan/geobase/work/opt/miniconda-gdl-ops/envs/geo_deep_env/lib/python3.10/site-packages/hydra/_internal/utils.py", line 453, in <lambda>
    lambda: hydra.run(
  File "/space/partner/nrcan/geobase/work/opt/miniconda-gdl-ops/envs/geo_deep_env/lib/python3.10/site-packages/hydra/_internal/hydra.py", line 132, in run
    _ = ret.return_value
  File "/space/partner/nrcan/geobase/work/opt/miniconda-gdl-ops/envs/geo_deep_env/lib/python3.10/site-packages/hydra/core/utils.py", line 260, in return_value
    raise self._return_value
  File "/space/partner/nrcan/geobase/work/opt/miniconda-gdl-ops/envs/geo_deep_env/lib/python3.10/site-packages/hydra/core/utils.py", line 186, in run_job
    ret.return_value = task_function(task_cfg)
  File "/fs/vnas_Hnrcan/geobase/ret000/geo-deep-learning/GDL.py", line 62, in run_gdl
    task(cfg)
  File "/fs/vnas_Hnrcan/geobase/ret000/geo-deep-learning/tiling_segmentation.py", line 764, in main
    line_tuple = tiler.filter_and_burn_dataset(aoi,
  File "/fs/vnas_Hnrcan/geobase/ret000/geo-deep-learning/tiling_segmentation.py", line 514, in filter_and_burn_dataset
    min_annot_success, annot_perc = self.passes_min_annot(
  File "/fs/vnas_Hnrcan/geobase/ret000/geo-deep-learning/tiling_segmentation.py", line 394, in passes_min_annot
    annot_perc = annot_percent(
  File "/fs/vnas_Hnrcan/geobase/ret000/geo-deep-learning/tiling_segmentation.py", line 50, in annot_percent
    bounds_iou = AOI.bounds_iou_gdf_riodataset(gdf_patch, img_patch_dataset)
  File "/fs/vnas_Hnrcan/geobase/ret000/geo-deep-learning/dataset/aoi.py", line 622, in bounds_iou_gdf_riodataset
    label_bounds_box = box(*label_bounds.tolist())
  File "/space/partner/nrcan/geobase/work/opt/miniconda-gdl-ops/envs/geo_deep_env/lib/python3.10/site-packages/shapely/geometry/geo.py", line 64, in box
    return Polygon(coords)
  File "/space/partner/nrcan/geobase/work/opt/miniconda-gdl-ops/envs/geo_deep_env/lib/python3.10/site-packages/shapely/geometry/polygon.py", line 261, in __init__
    ret = geos_polygon_from_py(shell, holes)
  File "/space/partner/nrcan/geobase/work/opt/miniconda-gdl-ops/envs/geo_deep_env/lib/python3.10/site-packages/shapely/geometry/polygon.py", line 539, in geos_polygon_from_py
    ret = geos_linearring_from_py(shell)
  File "shapely/speedups/_speedups.pyx", line 413, in shapely.speedups._speedups.geos_linearring_from_py
ValueError: GEOSGeom_createLinearRing_r returned a NULL pointer

Ground truth patch: mos_15_21l11_ne_30cm_f07_clipped_251855_4_5167007_999999999.zip

Source vector data: QC27_MTM7_ORTHO30cm_2015.zip

To reproduce:

import numpy as np
from shapely.geometry import box

from dataset.aoi import AOI
from utils.geoutils import check_gdf_load

problem_patch_gt = "path/to/mos_15_21l11_ne_30cm_f07_clipped_251855_4_5167007_999999999.geojson"

gdf = check_gdf_load(problem_patch_gt)
label_gdf_filtered = AOI.filter_gdf_by_attribute(
    gdf.copy(deep=True),
    'Quatreclasses',
    [4],
)
label_bounds = label_gdf_filtered.total_bounds
valid_series = label_gdf_filtered.is_valid
valid_ndarray = valid_series.to_numpy()
assert not gdf.empty  # succeeds

label_bounds_box = box(*label_bounds.tolist())  # Fails
assert np.all(valid_ndarray)  # Also fails

# Succeeds...
try:
    label_bounds_box = box(*label_bounds.tolist())
except ValueError as e:
    print(e)

Strangely, the source ground truth geopackage from which the tiling was done contains no invalid geometries:

source_gt = "path/to/QC27_MTM7_ORTHO30cm_2015.gpkg"

gdf = gpd.read_file(gpkg)
gdf_buildings = AOI.filter_gdf_by_attribute(
    gdf.copy(deep=True),
    'Quatreclasses',
    [4],
)

valid_series = gdf_buildings.is_valid
valid_ndarray = valid_series.to_numpy()
assert np.all(valid_ndarray)   # succeeds

This means it is our tiling process that creates an invalid geometry. To be investigated.

remtav commented 1 year ago

A good one for you @ms888ekb (not a priority, unless this problem occurs more often!)

remtav commented 1 year ago

It just popped in my head: the feature that contains a broken geometry in patch is original a circle (for circular building). The geojson specification probably doesn't support that type of geometry, thus the invalid result.

For info, the source layer is an atypical (?) "CurvePolygon" layer: image

Source data: image

Patch: image

ms888ekb commented 1 year ago

I'm looking into this issue currently. Will see if it is possible to manipulate CurvePolygon geometry using existing tools.