MrPowers / quinn

pyspark methods to enhance developer productivity 📣 👯 🎉
https://mrpowers.github.io/quinn/
Apache License 2.0
597 stars 93 forks source link

Add decorators for functions #221

Open kunaljubce opened 3 months ago

kunaljubce commented 3 months ago

Proposed changes

Took another stab at #140, as an extension to #144.

Types of changes

What types of changes does your code introduce to quinn? Put an x in the boxes that apply

SemyonSinchenko commented 3 months ago

Is it a breaking change, isn't it? If so, I would suggest pushing it into 1.0, not main. In the same moment, we need to mark these functions as deprecated in the main and make a new release. @MrPowers what do you think?

kunaljubce commented 3 months ago

Is it a breaking change, isn't it? If so, I would suggest pushing it into 1.0, not main. In the same moment, we need to mark these functions as deprecated in the main and make a new release. @MrPowers what do you think?

Could be a potential breaking change, yes. Would like to discuss more on this. Let me move this to 1.0.

kunaljubce commented 3 months ago

I don't see a branch for 1.0. Did we create it? I remember seeing a label for 1.0, though.

kunaljubce commented 3 months ago

@SemyonSinchenko Also we are not exactly deprecating this functionality but evolving it to serve a better purpose, if I may say so 😆 Would a DeprecationWarning be apt for this?

I would think a better way would be to publish this change under a change log or something.

kunaljubce commented 3 months ago

Implementation details for validate_schema

So basically we are implementing a decorator factory here so that our function can be used both as a decorator as well as a callable function. In this implementation:

When validate_schema is called directly with a DataFrame, the validation logic gets executed by wrapping the DataFrame in a lambda function and immediately calling the decorator.

SemyonSinchenko commented 3 months ago

Hmm, maybe it would be better to leave a current function as is and create a new one? I don't know. @MrPowers @jeffbrennan, what do you think, guys?

jeffbrennan commented 3 months ago

I like the idea of having one function that uses the decorator if no dataframe is provided - we would need to keep the arguments consistent with the current validate_schema implementation though

kunaljubce commented 3 months ago

I like the idea of having one function that uses the decorator if no dataframe is provided - we would need to keep the arguments consistent with the current validate_schema implementation though

@jeffbrennan The only difference in the arguments is the absence of _df since this will be fetched from the base function on which the decorator is applied. The other two arguments required_schema and ignore_nullable are consistent across both scenarios. Is this what you are talking about or did I miss your point?

kunaljubce commented 3 months ago

Hmm, maybe it would be better to leave a current function as is and create a new one? I don't know. @MrPowers @jeffbrennan, what do you think, guys?

@SemyonSinchenko This was the case with #141. I agree with the discussion in there that two different functions achieving a similar objective, one being used as a callable and one being used as a decorator can be confusing to users. That, in fact, was my inspiration to take a fresh attempt at this problem.