kestra-io / plugin-serdes

https://kestra.io/plugins/plugin-serdes/
Apache License 2.0

Add handling for corrupted rows in Readers e.g. enum property `onBadLines` with options ERROR, WARN or SKIP #80

Open anna-geller opened 8 months ago

anna-geller commented 8 months ago

Feature description

In pandas, `read_csv` has an `on_bad_lines` parameter:

`on_bad_lines`: {'error', 'warn', 'skip'}, default 'error'
Specifies what to do upon encountering a bad line (a line with too many fields). Allowed values are:

- `'error'`: raise an Exception when a bad line is encountered.
- `'warn'`: raise a warning when a bad line is encountered and skip that line.
- `'skip'`: skip bad lines without raising or warning when they are encountered.

It would be worth adding an enum property `onBadLines` with options `ERROR`, `WARN`, and `SKIP` to all Readers, so that users can configure how bad lines are handled.
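To make the request concrete, here is a minimal sketch of how such a property could behave. It is not taken from the plugin's code: the class `BadLineReaderSketch`, the `read` method, the `expectedFields` parameter, and the in-memory `warnings` list (standing in for a real logger) are all hypothetical, and the naive comma split ignores quoting. A "bad line" is assumed to mean a line with too many fields, as in the pandas description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; not the plugin-serdes implementation.
class BadLineReaderSketch {
    /** Proposed options, mirroring pandas' on_bad_lines values. */
    enum OnBadLines { ERROR, WARN, SKIP }

    /** Stand-in for a logger so the sketch stays self-contained. */
    static final List<String> warnings = new ArrayList<>();

    /**
     * Splits each line on commas (no quote handling) and applies the
     * chosen policy to lines with more than expectedFields fields.
     */
    static List<String[]> read(List<String> lines, int expectedFields, OnBadLines onBadLines) {
        List<String[]> rows = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String[] fields = lines.get(i).split(",", -1);
            if (fields.length <= expectedFields) {
                rows.add(fields);
                continue;
            }
            String message = "Bad line " + (i + 1) + ": expected " + expectedFields
                + " fields, found " + fields.length;
            switch (onBadLines) {
                case ERROR:
                    // Fail fast on the first bad line.
                    throw new IllegalArgumentException(message);
                case WARN:
                    // Record a warning, then drop the line.
                    warnings.add(message);
                    break;
                case SKIP:
                    // Drop the line silently.
                    break;
            }
        }
        return rows;
    }
}
```

With this sketch, calling `BadLineReaderSketch.read(lines, 2, BadLineReaderSketch.OnBadLines.WARN)` on input that contains one three-field line would return the remaining rows and record one warning, while `ERROR` would throw on that line. Defaulting to `ERROR` would match the pandas default.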