Open melvinkokxw opened 1 month ago
Describe the bug
Using a custom check with a PySpark dataframe raises the exception:
AttributeError: 'NoneType' object has no attribute 'name'
The cause is that reason_code is not provided when raising SchemaError after a failed custom check, specifically here: https://github.com/unionai-oss/pandera/blob/d2bfed03e107358d60266108478711cdbe704e9c/pandera/backends/pyspark/base.py#L99-L107
And when collecting errors here: https://github.com/unionai-oss/pandera/blob/d2bfed03e107358d60266108478711cdbe704e9c/pandera/api/base/error_handler.py#L127
Trying to access .name on the non-existent reason_code (i.e. None) causes an AttributeError.
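To make the mechanism concrete, here is a minimal sketch that reproduces the same failure without pandera or Spark. The names `ReasonCode`, `FakeSchemaError`, `collect` and `collect_guarded` are hypothetical stand-ins for illustration only, not pandera's actual classes or functions. The guarded variant is just one possible way to avoid the crash, not necessarily the fix the project would choose.

```python
from enum import Enum


class ReasonCode(Enum):
    # Hypothetical stand-in for a reason-code enum.
    CHECK_FAILED = "check_failed"


class FakeSchemaError(Exception):
    # Hypothetical stand-in for an error whose reason_code defaults to None.
    def __init__(self, message, reason_code=None):
        super().__init__(message)
        self.reason_code = reason_code


def collect(error):
    # Assumes reason_code is always set, so .name fails when it is None.
    return {"reason_code": error.reason_code.name, "error": str(error)}


def collect_guarded(error):
    # One possible guard: tolerate a missing reason_code.
    code = error.reason_code
    return {
        "reason_code": code.name if code is not None else None,
        "error": str(error),
    }


if __name__ == "__main__":
    err = FakeSchemaError("custom check failed")  # no reason_code supplied
    try:
        collect(err)
    except AttributeError as exc:
        print(f"AttributeError: {exc}")
    print(collect_guarded(err))
```

Passing a reason code when the error is raised, or guarding the `.name` access when errors are collected, would both avoid the AttributeError in this sketch.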
Code Sample, a copy-pastable example
import pandera.pyspark as psa
import pyspark.sql as ps
from pandera.extensions import register_check_method
from pyspark.sql import types as T


@register_check_method
def custom_check(pyspark_df: ps.DataFrame):
    return False


class Schema(psa.DataFrameModel):
    field1: T.IntegerType() = psa.Field()
    field2: T.IntegerType() = psa.Field()

    class Config:
        custom_check = ()


spark = ps.SparkSession.builder.appName("example").getOrCreate()

schema = T.StructType([
    T.StructField("field1", T.IntegerType(), True),
    T.StructField("field2", T.IntegerType(), True)])
data = [(1, 2)]
df = spark.createDataFrame(data, schema)

Schema.validate(df)
Expected behavior
Validation should fail and raise a SchemaError (or SchemaErrors?), not an AttributeError.
I just stumbled upon the same issue. Glad someone already reported it and hopefully, it can be fixed soon...