mrpowers-io / quinn

pyspark methods to enhance developer productivity 📣 👯 🎉
https://mrpowers-io.github.io/quinn/
Apache License 2.0

Update the validate methods to return a boolean value, so they can be used for control flow #51

Open MrPowers opened 1 year ago

MrPowers commented 1 year ago

The validate_presence_of_columns, validate_absence_of_columns, and validate_schema methods currently raise exceptions when validation fails and return nothing (None) otherwise.

It'd be nice to add a flag that makes these methods return boolean values instead. That'd make it easier for folks to write schema-safe append logic like this:

if quinn.validate_schema(df, bad_append_df.schema):
    bad_append_df.write.format("parquet").mode("append").save("tmp/parquet1")

Right now, quinn.validate_schema returns None even when the schemas match, and None is falsy in a boolean context, so the append in the snippet above would never run.
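Until such a flag exists, one possible workaround is a small adapter that turns a raise-on-failure validator into a boolean check. The sketch below is only an illustration: `as_bool`, `MissingColumnError`, and the list-based `validate_presence_of_columns` stand-in are hypothetical names made up so the example runs without Spark, and they are not quinn's actual API or exception types.

```python
from typing import Callable, Iterable, List


class MissingColumnError(ValueError):
    """Hypothetical stand-in for the exception a validator raises on failure."""


def validate_presence_of_columns(columns: Iterable[str], required: List[str]) -> None:
    """Stand-in validator: raises on failure, returns None on success."""
    present = set(columns)
    missing = [name for name in required if name not in present]
    if missing:
        raise MissingColumnError(f"missing columns: {missing}")


def as_bool(validator: Callable[..., None], *args, **kwargs) -> bool:
    """Run a raise-on-failure validator and report the outcome as True/False."""
    try:
        validator(*args, **kwargs)
    except MissingColumnError:
        # Validation failed: swallow the expected error and signal it with False.
        return False
    return True


# Control flow now works, because the result is a real boolean rather than None.
if as_bool(validate_presence_of_columns, ["name", "age"], ["name"]):
    print("columns present, safe to append")
```

Catching only the expected validation error, rather than a bare `except`, keeps unrelated failures visible. With the real library, the `except` clause would need to name whatever exceptions the validators actually raise.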

paulooctavio commented 2 months ago

@SemyonSinchenko Can I take this one?