Describe the bug
Mixing drop_invalid_rows at the DataFrameSchema level and the Column level produces unintuitive behavior.
If drop_invalid_rows is set as a DataFrameSchema parameter and not on any column, all rows that fail validation are dropped. This works as expected.
If drop_invalid_rows is set as a column parameter but not as a DataFrameSchema parameter, rows that fail those columns' checks are not dropped and no error is raised. See Listing [1].
If drop_invalid_rows=True is set on the DataFrameSchema and on a Column, rows that fail columns with drop_invalid_rows=True are not dropped and no error is raised, while rows that fail columns with drop_invalid_rows=False are dropped. See Listing [2].
If this behavior is intended, we should document it; otherwise, see the expected results.
Code Sample
Listing [1]
Listing [2]
Expected behavior
For Listing [1], I would expect rows that fail a column with drop_invalid_rows=True to be dropped, or a warning that drop_invalid_rows=True must also be set as a DataFrameSchema parameter.
For Listing [2], I would expect rows that fail the columns with drop_invalid_rows=True to be dropped and the other columns to raise an error.
Desktop