AbsaOSS / spark-commons

Apache License 2.0

Implement error handling by putting the info into _Spark_'s standard error column (`String`) #86

Open benedeki opened 1 year ago

benedeki commented 1 year ago

Background

One of the provided ErrorHandling implementations. The title is actually a little misleading: the point is to write the errors into a string column, and the column name should default to the value of spark.sql.columnNameOfCorruptRecord (see Runtime SQL Configuration).

Feature

Write errors into a StringType column by converting each field of the error submit into a string and concatenating them with a delimiter. The column name should (or might) default to the value of spark.sql.columnNameOfCorruptRecord.
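To make the field-to-string conversion described above concrete, here is a minimal sketch in plain Scala with no Spark dependency. Everything named in it is an assumption for illustration only: the `ErrorSubmit` case class and its fields are a hypothetical stand-in for the library's actual error submit type, and `ErrorToString`, `render`, and the `"; "` delimiter are made up. The only non-hypothetical value is `_corrupt_record`, which is Spark's documented default for spark.sql.columnNameOfCorruptRecord; a real implementation would presumably read it from the session configuration instead of hard-coding it.

```scala
// Hypothetical stand-in for the library's error submit type (an assumption, not the real API).
final case class ErrorSubmit(
  errType: String,
  errCode: Long,
  errMessage: String,
  errColName: Option[String] = None,
  rawValue: Option[String] = None
)

object ErrorToString {
  // Spark's documented default for spark.sql.columnNameOfCorruptRecord.
  // A real implementation would read the setting from the Spark session.
  val DefaultColumnName: String = "_corrupt_record"

  // Convert every field to a string (empty string for a missing optional value)
  // and join the results with the delimiter.
  def render(error: ErrorSubmit, delimiter: String = "; "): String =
    error.productIterator.map {
      case Some(value) => value.toString
      case None        => ""
      case value       => value.toString
    }.mkString(delimiter)
}
```

For example, `ErrorToString.render(ErrorSubmit("CastError", 42L, "Cannot cast", Some("age")))` joins the five fields in declaration order, leaving the missing `rawValue` as an empty trailing segment. One open design question the sketch does not settle is escaping: if a field can itself contain the delimiter, the resulting string is no longer unambiguous to parse.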

Proposed Solution

Solution Ideas: