apache / iceberg-python

Apache PyIceberg
https://py.iceberg.apache.org/
Apache License 2.0

Better error messages when creating a table with unsupported types #860

Open vtk9 opened 3 months ago

vtk9 commented 3 months ago

Feature Request / Improvement

Related to https://github.com/apache/iceberg-python/issues/830 (reproducer included)

Creating an Iceberg table from an Arrow table that contains an unsupported type (such as date64) fails with the message TypeError: Unsupported type: date64[ms]

It would be great if this error message also printed out the column name that has this unsupported type.

Even better, instead of raising a TypeError, could a more specific error be raised (such as UnsupportedPyArrowType) that includes the pyarrow.Field (column_name, column_type) as an attribute? That way the error can be caught and a different exception re-raised based on the information contained inside UnsupportedPyArrowType.

For example, something like:

try:
   ...
except UnsupportedPyArrowType as e:
   raise NewException(f"failure due to {e.field.name}")
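
To make the request concrete, here is a minimal sketch of what such an exception could look like. Everything in it is an assumption for illustration, not PyIceberg's actual code: `Field` is a stand-in for `pyarrow.Field` so the sketch runs without pyarrow installed, and `SUPPORTED_TYPES`, `check_schema`, and `describe_failure` are hypothetical names.

```python
from typing import Iterable, NamedTuple


class Field(NamedTuple):
    """Stand-in for pyarrow.Field (assumption: only name and type matter here)."""
    name: str
    type: str


class UnsupportedPyArrowType(TypeError):
    """Hypothetical exception that carries the offending field.

    Subclassing TypeError means existing `except TypeError` handlers keep working.
    """

    def __init__(self, field: Field) -> None:
        self.field = field
        super().__init__(
            f"Column '{field.name}' has an unsupported type: {field.type}"
        )


# Illustrative subset only; the real list of supported types is not shown here.
SUPPORTED_TYPES = {"int64", "string", "date32[day]"}


def check_schema(fields: Iterable[Field]) -> None:
    """Raise UnsupportedPyArrowType for the first field with an unsupported type."""
    for field in fields:
        if field.type not in SUPPORTED_TYPES:
            raise UnsupportedPyArrowType(field)


def describe_failure(fields: Iterable[Field]) -> str:
    """Caller-side handling in the shape of the try/except above."""
    try:
        check_schema(fields)
    except UnsupportedPyArrowType as e:
        return f"failure due to {e.field.name}"
    return "ok"
```

With this shape, the message names the column and the caller can still read `e.field` to decide which exception to re-raise.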

kevinjqliu commented 3 months ago

Here's the relevant code:

https://github.com/apache/iceberg-python/blob/a6cd0cf325b87b360077bad1d79262611ea64424/pyiceberg/io/pyarrow.py#L932

vivek378521 commented 3 months ago

I want to work on this issue, but I cannot find a contributing doc in the repo that explains how to set up the project and run the tests.

@kevinjqliu

kevinjqliu commented 3 months ago

@vivek378521 The docs are located here.

https://py.iceberg.apache.org/contributing/
https://github.com/apache/iceberg-python/blob/main/mkdocs/docs/contributing.md