Background
Currently, if EBCDIC data fails to cast to the expected type (for example, when invalid bytes are provided for COMP-3 decoding), Cobrix silently returns null.
It would be great if such casting errors were gathered in a special column of the returned dataset.
spark-csv adds a '_corrupt_record' column when it can't parse a CSV record. In Cobrix's case, the column name can be chosen by the user, and the column should hold an array of issues.
Feature
Add an option to store casting errors in a separate field.
Example
Which might return something like:
Proposed Solution
Add errors only if the setting is enabled, since collecting them might have an impact on performance and output size.
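
To make the idea concrete, here is a minimal standalone sketch in plain Python (not Cobrix code, and not how Cobrix decodes fields internally). It assumes a simplified COMP-3 layout of integer values with a trailing sign nibble. The names `decode_comp3`, `decode_record`, `collect_errors`, and `error_column` are hypothetical and only illustrate the opt-in behaviour: a failed field still decodes to null, but the reason is recorded in an array column when the setting is enabled.

```python
from typing import Dict, List, Optional, Tuple


def decode_comp3(data: bytes) -> Tuple[Optional[int], Optional[str]]:
    """Decode a packed-decimal integer and return (value, error).

    Exactly one of the two results is None: the value on failure,
    the error message on success.
    """
    if not data:
        return None, "empty field"
    # Split every byte into its high and low nibble.
    nibbles: List[int] = []
    for b in data:
        nibbles.append(b >> 4)
        nibbles.append(b & 0x0F)
    # The last nibble carries the sign: C = positive, D = negative, F = unsigned.
    sign = nibbles.pop()
    if sign not in (0x0C, 0x0D, 0x0F):
        return None, f"invalid sign nibble 0x{sign:X}"
    value = 0
    for n in nibbles:
        if n > 9:
            return None, f"invalid digit nibble 0x{n:X}"
        value = value * 10 + n
    return (-value if sign == 0x0D else value), None


def decode_record(fields: Dict[str, bytes],
                  collect_errors: bool = False,
                  error_column: str = "_errors") -> Dict[str, object]:
    """Decode all fields; add the error array only when collect_errors is set."""
    row: Dict[str, object] = {}
    errors: List[str] = []
    for name, raw in fields.items():
        value, err = decode_comp3(raw)
        row[name] = value  # stays None (null) when decoding fails
        if err is not None:
            errors.append(f"{name}: {err}")
    if collect_errors:
        row[error_column] = errors
    return row
```

With the setting disabled, `decode_record({"A": b"\x12\x3c", "B": b"\x1a\x3c"})` yields the current behaviour, a null for `B` and no extra column. With `collect_errors=True`, the row also carries `"_errors": ["B: invalid digit nibble 0xA"]`. Records without issues get an empty array, so the extra cost is limited to runs where the option is switched on.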