apache / iceberg

Apache Iceberg
https://iceberg.apache.org/
Apache License 2.0

Spark: error reading Iceberg table data #11336

Open beyond-up opened 1 month ago

beyond-up commented 1 month ago

Apache Iceberg version

1.5.2

Query engine

Spark

Please describe the bug 🐞

When I used iceberg-spark-runtime-3.3_2.12-1.5.2.jar to query data from an Iceberg table, an error was reported. The error message indicated null values, but the table contains no null values.

nastra commented 1 month ago

@beyond-up can you share the full stack trace please? Usually there's some more info in other parts of the stack trace that show what went wrong

beyond-up commented 1 month ago

> @beyond-up can you share the full stack trace please? Usually there's some more info in other parts of the stack trace that show what went wrong

I have found the cause of this problem: a string field in the table contains empty strings (''). However, I am surprised that '' in a String-type field can cause an NPE! @nastra

nastra commented 1 month ago

@beyond-up so far the NPE seems to be coming from Spark itself, not from Iceberg. Do you have a small reproducible example?

nastra commented 1 month ago

Which exact Spark version are you using? A similar issue was reported in https://issues.apache.org/jira/browse/SPARK-39061 and was already fixed in Spark 3.3.1

beyond-up commented 1 month ago

> @beyond-up so far the NPE seems to be coming from Spark itself, not from Iceberg. Do you have a small reproducible example?

The problem reproduces when every value in a string-type field of the table is ''. My Spark version is 3.5, and I used iceberg-spark-runtime-3.5_2.13.jar. @nastra
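
A minimal sketch of how the condition described above (a string column containing only '') could be turned into a small reproducible example. Everything named here is an assumption for illustration and not taken from the report: the catalog `local`, the table `local.db.empty_strings`, the column names, the warehouse path, and the helper functions. It has not been confirmed that these statements trigger the NPE.

```python
from typing import List


def repro_statements(table: str = "local.db.empty_strings", rows: int = 3) -> List[str]:
    """Build SQL for an Iceberg table whose string column holds only ''.

    The table name and schema are hypothetical; adjust them to the real table.
    """
    # Every row gets an empty string (not NULL) in the `data` column.
    values = ", ".join(f"({i}, '')" for i in range(1, rows + 1))
    return [
        f"CREATE TABLE IF NOT EXISTS {table} (id INT, data STRING) USING iceberg",
        f"INSERT INTO {table} VALUES {values}",
        f"SELECT id, data FROM {table}",
    ]


def build_session(warehouse: str = "/tmp/iceberg-warehouse"):
    """Create a Spark session with a Hadoop-type Iceberg catalog named `local`.

    Assumes pyspark is installed and the matching iceberg-spark-runtime jar
    is already on the classpath; the warehouse path is a placeholder.
    """
    from pyspark.sql import SparkSession  # imported lazily; needs pyspark

    return (
        SparkSession.builder
        .config("spark.sql.extensions",
                "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
        .config("spark.sql.catalog.local", "org.apache.iceberg.spark.SparkCatalog")
        .config("spark.sql.catalog.local.type", "hadoop")
        .config("spark.sql.catalog.local.warehouse", warehouse)
        .getOrCreate()
    )


def run(spark, table: str = "local.db.empty_strings") -> None:
    """Execute the statements; the final SELECT is where a read error would surface."""
    *setup, query = repro_statements(table)
    for stmt in setup:
        spark.sql(stmt)
    spark.sql(query).show()
```

Calling `run(build_session())` would execute the three statements against the placeholder warehouse. Attaching the full stack trace from the final `SELECT`, together with the exact Spark and Iceberg runtime versions, would help show whether the failure originates in Spark or in Iceberg.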

nastra commented 1 month ago

@beyond-up in that case you might want to use a more recent Spark version that includes a potential fix for this