Defra-Data-Science-Centre-of-Excellence / pyspark-vector-files

Read vector files into a Spark DataFrame with geometry encoded as WKB.
https://defra-data-science-centre-of-excellence.github.io/pyspark-vector-files/
MIT License

Doesn't handle non-geometry layers #2

EFT-Defra closed this issue 2 years ago

EFT-Defra commented 2 years ago

For example, reading a layer that has no geometry:

read_vector_files(
  path="/path/to/land_parcels/",
  suffix=".gdb",
  layer_identifier="RLR_RW_RO_APHA_LC"
).show()

This fails with the following error:

---------------------------------------------------------------------------
PythonException                           Traceback (most recent call last)
<command-3464987235349161> in <module>
----> 1 lc_sdf.show()

/databricks/spark/python/pyspark/sql/dataframe.py in show(self, n, truncate, vertical)
    488         """
    489         if isinstance(truncate, bool) and truncate:
--> 490             print(self._jdf.showString(n, 20, vertical))
    491         else:
    492             print(self._jdf.showString(n, int(truncate), vertical))

/databricks/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py in __call__(self, *args)
   1302 
   1303         answer = self.gateway_client.send_command(command)
-> 1304         return_value = get_return_value(
   1305             answer, self.gateway_client, self.target_id, self.name)
   1306 

/databricks/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
    121                 # Hide where the exception came from that shows a non-Pythonic
    122                 # JVM exception message.
--> 123                 raise converted from None
    124             else:
    125                 raise

PythonException: An exception was thrown from a UDF: 'AttributeError: 'NoneType' object has no attribute 'ExportToWkb''. Full traceback below:
Traceback (most recent call last):
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-b336c2ee-7619-4e07-bad2-ae7f695ac5c7/lib/python3.8/site-packages/esa_geo_utils/io/_generate_parallel_reader.py", line 320, in _
    return _pdf_from_vector_file(
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-b336c2ee-7619-4e07-bad2-ae7f695ac5c7/lib/python3.8/site-packages/esa_geo_utils/io/_generate_parallel_reader.py", line 247, in _pdf_from_vector_file
    pdf = _pdf_from_layer(
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-b336c2ee-7619-4e07-bad2-ae7f695ac5c7/lib/python3.8/site-packages/esa_geo_utils/io/_generate_parallel_reader.py", line 208, in _pdf_from_layer
    return PandasDataFrame(data=features_generator, columns=feature_names)
  File "/databricks/python/lib/python3.8/site-packages/pandas/core/frame.py", line 563, in __init__
    data = list(data)
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-b336c2ee-7619-4e07-bad2-ae7f695ac5c7/lib/python3.8/site-packages/esa_geo_utils/io/_generate_parallel_reader.py", line 39, in <genexpr>
    return ((_get_properties(feature) + _get_geometry(feature)) for feature in layer)
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-b336c2ee-7619-4e07-bad2-ae7f695ac5c7/lib/python3.8/site-packages/esa_geo_utils/io/_generate_parallel_reader.py", line 34, in _get_geometry
    return (feature.GetGeometryRef().ExportToWkb(),)
AttributeError: 'NoneType' object has no attribute 'ExportToWkb'