HetWaterschapshuis / HyDAMOValidatieModule

MIT License

Validator returned None with a certain object #22

Open BramWijnants opened 1 month ago

BramWijnants commented 1 month ago

I run into a NoneType return from the validator with certain geopackages. We clip the waterboard data into multiple subareas. Two of these subareas return None from the validator. It doesn't seem to be an issue with duplicate IDs.

For the geopackage Gewassen_bovstr_GrBeerze Westelbeers_clipped_20240802_HyDAMO: When I remove one specific object from 'kunstwerkopening' with globalid {EC585B1E-87E2-462B-819B-E991F64176C4}, the validator works and doesn't return a NoneType at the end.

For the geopackage Gewassen_bovstr_Scheepdonkseweg_clipped_20240802_HyDAMO.gpkg: A similar issue arises. This time the validator returns None unless the object with globalid {1E5F41BB-8BB7-4878-AD2D-5798CD10F272} is removed from the 'kunstwerkopening' layer.

These 2 objects in kunstwerkopening are present in all subarea geopackages, since the clip does nothing to records that have no geometry. Only in these 2 subareas/geopackages do the objects cause the validator to return None.

We still use an older version: Python 3.9.2 with hydamo-validation 0.9.8 and geopandas 0.10.2. I haven't tried it with the latest version.

I added the files to wetransfer, they expire within a week: https://we.tl/t-1Na78mEvuS

It's hard for me to figure out exactly what goes wrong here. If it is solved in the latest version, we will need to get our environment sorted out. Some help or pointers on what goes wrong would be appreciated.

Code used (error traceback below):

from hydamo_validation import validator

# ahn_coverage, schemas_path and directory are defined elsewhere (not shown)
hydamo_validator = validator(
    output_types=["geopackage", "csv", "geojson"],
    coverages=ahn_coverage,
    log_level="INFO",
    schemas_path=schemas_path,
)
datamodel, layer_summary, result_summary = hydamo_validator(directory=directory, raise_error=False)
result_summary.to_dict()

output:

TypeError                                 Traceback (most recent call last)
<ipython-input-13-285be10ccc5e> in <module>
     24                              log_level="INFO",
     25                              schemas_path=schemas_path)
---> 26     datamodel, layer_summary, result_summary = hydamo_validator(directory=directory, raise_error=False)
     27     result_summary.to_dict()
     28 

TypeError: cannot unpack non-iterable NoneType object

DanielTollenaar commented 1 month ago

What version of the validator are you using?

In general, None is returned if the validator did not finish successfully. To debug this, run the validator with hydamo_validator(directory=directory, raise_error=True) so that the underlying exception is raised instead of being swallowed.
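
The None return described above can be guarded against before unpacking. This is a minimal sketch, assuming only what is stated in this thread: the validator callable returns either a 3-tuple or None. The names run_validation and fake_validator are hypothetical and not part of hydamo_validation.

```python
from typing import Any, Callable, Optional, Tuple


def run_validation(
    validator_call: Callable[..., Optional[Tuple[Any, Any, Any]]],
    **kwargs: Any,
) -> Tuple[Any, Any, Any]:
    """Call the validator and fail with a clear message if it returned None."""
    result = validator_call(**kwargs)
    if result is None:
        raise RuntimeError(
            "validator returned None; rerun with raise_error=True "
            "to see the underlying exception"
        )
    datamodel, layer_summary, result_summary = result
    return datamodel, layer_summary, result_summary


# Hypothetical stand-in for the real validator callable, for illustration only.
def fake_validator(directory: str, raise_error: bool = False):
    if directory == "bad":
        return None
    return ("datamodel", "layer_summary", "result_summary")
```

With the real validator, the call would presumably look like run_validation(hydamo_validator, directory=directory, raise_error=False), which replaces the opaque "cannot unpack non-iterable NoneType object" with a message that points at raise_error=True.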

BramWijnants commented 1 month ago

We are using version 0.9.8. I'm assuming this also happens with the latest version, but I haven't tested it. If it works with the latest version, switching would be an option. When I set raise_error=True it indeed raises an error (duh), see below. The problem seems to occur when writing the results. It looks like one of the attributes is not of the expected type.

This might hint at a similar problem with duplicate IDs, i.e. it searches for an ID, gets 2 back, and returns an array instead of a single object or something?

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-bc5c6c684cf3> in <module>
     24                              log_level="INFO",
     25                              schemas_path=schemas_path)
---> 26     datamodel, layer_summary, result_summary = hydamo_validator(directory=directory, raise_error=True)
     27     result_summary.to_dict()
     28 

P:\WinPython\WPy64-3920Watersysteemtoets\python-3.9.2.amd64\lib\site-packages\hydamo_validation\validator.py in _validator(directory, output_types, log_level, coverages, schemas_path, raise_error)
    265             result_summary.to_json(results_path)
    266         if raise_error:
--> 267             raise e
    268         else:
    269             result_summary.to_dict()

P:\WinPython\WPy64-3920Watersysteemtoets\python-3.9.2.amd64\lib\site-packages\hydamo_validation\validator.py in _validator(directory, output_types, log_level, coverages, schemas_path, raise_error)
    238         logger.info("exporting results")
    239         result_summary.status = "export results"
--> 240         result_layers = layers_summary.export(results_path, output_types)
    241         result_summary.result_layers = result_layers
    242         result_summary.error_layers = [

P:\WinPython\WPy64-3920Watersysteemtoets\python-3.9.2.amd64\lib\site-packages\hydamo_validation\summaries.py in export(self, results_path, output_types)
    159                         file_path = results_path.joinpath("results.gpkg")
    160 
--> 161                         gdf.to_file(
    162                             file_path, layer=object_layer, driver="GPKG", schema=schema
    163                         )

P:\WinPython\WPy64-3920Watersysteemtoets\python-3.9.2.amd64\lib\site-packages\geopandas\geodataframe.py in to_file(self, filename, driver, schema, index, **kwargs)
   1112         from geopandas.io.file import _to_file
   1113 
-> 1114         _to_file(self, filename, driver, schema, index, **kwargs)
   1115 
   1116     def set_crs(self, crs=None, epsg=None, inplace=False, allow_override=False):

P:\WinPython\WPy64-3920Watersysteemtoets\python-3.9.2.amd64\lib\site-packages\geopandas\io\file.py in _to_file(df, filename, driver, schema, index, mode, crs, **kwargs)
    394             filename, mode=mode, driver=driver, crs_wkt=crs_wkt, schema=schema, **kwargs
    395         ) as colxn:
--> 396             colxn.writerecords(df.iterfeatures())
    397 
    398 

P:\WinPython\WPy64-3920Watersysteemtoets\python-3.9.2.amd64\lib\site-packages\fiona\collection.py in writerecords(self, records)
    359         if self.mode not in ('a', 'w'):
    360             raise IOError("collection not open for writing")
--> 361         self.session.writerecs(records, self)
    362         self._len = self.session.get_length()
    363         self._bounds = None

fiona\ogrext.pyx in fiona.ogrext.WritingSession.writerecs()

fiona\ogrext.pyx in fiona.ogrext.OGRFeatureBuilder.build()

ValueError: Invalid field type <class 'numpy.ndarray'>
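
One way to narrow down which attribute triggers "Invalid field type" is to scan the records for values that are not plain scalars. This is a rough sketch using only the standard library. The set of allowed types is an assumption for illustration, not the authoritative list supported by fiona or GDAL, and the field name attribute_x is hypothetical. With a GeoDataFrame, the records could be obtained with something like gdf.drop(columns="geometry").to_dict("records"), assuming the geometry column is named "geometry".

```python
import datetime
from typing import Any, Dict, Iterable, List, Tuple

# Types assumed here to be writable as plain attribute fields. This is an
# illustrative list, not the definitive set accepted by fiona or GDAL.
SCALAR_TYPES = (str, int, float, bool, bytes, datetime.date, datetime.time, type(None))


def find_non_scalar_fields(
    records: Iterable[Dict[str, Any]],
) -> List[Tuple[int, str, str]]:
    """Return (row index, field name, type name) for every non-scalar value."""
    problems = []
    for row_index, record in enumerate(records):
        for field, value in record.items():
            if not isinstance(value, SCALAR_TYPES):
                problems.append((row_index, field, type(value).__name__))
    return problems


# Illustrative records; the list stands in for an array-valued attribute.
records = [
    {"globalid": "a", "attribute_x": 1.2},
    {"globalid": "b", "attribute_x": [1.2, 1.4]},
]
print(find_non_scalar_fields(records))  # [(1, 'attribute_x', 'list')]
```

A numpy.ndarray value would be reported the same way, with type name 'ndarray'. Note that some numpy scalar types (for example numpy.int64) are not subclasses of the Python built-ins and would also be flagged by this check.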

DanielTollenaar commented 3 days ago

@BramWijnants, can't reproduce in version 1.3.0 of the module, see the results here:

Be aware that as of 1.3.0 we use pyogrio to write files, see: https://github.com/HetWaterschapshuis/HyDAMOValidatieModule/releases/tag/v1.3.0. The exception above relates to exporting DataFrames with fiona.

Please check and confirm.
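
To check which side of the 1.3.0 change an environment is on, the installed version can be compared against that release. This is a small sketch using the standard library only. The distribution name "hydamo-validation" is taken from the report above and is an assumption about how the package is registered in your environment. The helper names are hypothetical.

```python
from importlib import metadata
from itertools import takewhile
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.3.0' into (1, 3, 0), keeping only the leading digits of each part."""
    parts = []
    for piece in version.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    # Pad to three components so that '1.3' compares equal to '1.3.0'.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def writes_with_pyogrio(version: str) -> bool:
    """True if the version is 1.3.0 or later, per the release note linked above."""
    return parse_version(version) >= (1, 3, 0)


def installed_version(dist_name: str = "hydamo-validation") -> Optional[str]:
    """Return the installed version of dist_name, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


print(installed_version())
```

For the versions mentioned in this thread, writes_with_pyogrio("0.9.8") is False and writes_with_pyogrio("1.3.0") is True.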