Is your feature request related to a problem? Please describe.
Please enable extending pydeequ.analyzers._AnalyzerObject to define a custom analyzer in Python, and show an example of how to do it.
Describe the solution you'd like
Be able to implement:
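from typing import Optional

from pydeequ.analyzers import _AnalyzerObject


class MyCustomAnalyzer(_AnalyzerObject):
    """Get the maximum of a numeric column."""

    def __init__(self, column, my_property: Optional[str] = None):
        """
        :param str column: column to find the maximum.
        :param str my_property: custom property
        """
        self.column = column
        self.my_property = my_property

    # AnalyzerInput and AnalyzerOutput are placeholders for well-defined types
    # that pydeequ would need to provide. This is written as a plain method
    # because a property getter cannot take an argument.
    def _analyzer_jvm(self, foo: "AnalyzerInput") -> "AnalyzerOutput":
        # my custom transformation that turns AnalyzerInput into AnalyzerOutput
        bar: "AnalyzerOutput" = ...
        return bar

and then run it in VerificationSuite.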
Describe alternatives you've considered
When calculating anomalies, every time I have a custom metric (for concreteness, say Sum() / CountDistinct()), I build a temporary table that has one row holding that metric, and then run the anomaly check over pydeequ.analyzers.Sum (or Mean, i.e. an aggregation that acts as the identity on a single row). It is best if those custom metrics use a pydeequ metrics repository separate from the one used for the source table.
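For readers who want to try the workaround, here is a minimal sketch of it. It is not an official pydeequ recipe. The helper names `custom_metric_value` and `run_anomaly_check_on_custom_metric`, the column name `custom_metric`, the result-key tag, and the choice of `RelativeRateOfChangeStrategy` are all assumptions made for illustration. It also assumes a non-empty numeric column and an existing SparkSession configured for pydeequ.

```python
def custom_metric_value(total: float, distinct_count: int) -> float:
    """Compute the illustrative custom metric Sum() / CountDistinct()."""
    if distinct_count == 0:
        raise ValueError("distinct_count must be non-zero")
    return total / distinct_count


def run_anomaly_check_on_custom_metric(spark, source_df, column, repository,
                                       max_rate_increase=2.0):
    """Workaround sketch: one-row table + anomaly check over Sum.

    `repository` is a pydeequ metrics repository; per the issue text it is
    best kept separate from the repository used for the source table.
    """
    # Imports are local so the pure helper above works without Spark/pydeequ.
    from pyspark.sql import functions as F
    from pydeequ.analyzers import Sum
    from pydeequ.anomaly_detection import RelativeRateOfChangeStrategy
    from pydeequ.repository import ResultKey
    from pydeequ.verification import VerificationSuite

    # Aggregate the source table down to the two ingredients of the metric.
    agg = source_df.agg(
        F.sum(column).alias("total"),
        F.countDistinct(column).alias("distinct_count"),
    ).first()
    value = custom_metric_value(float(agg["total"]), int(agg["distinct_count"]))

    # Temporary table with exactly one row holding the custom metric.
    one_row_df = spark.createDataFrame([(value,)], ["custom_metric"])

    # Hypothetical tag; any tags that identify this metric would do.
    result_key = ResultKey(spark, ResultKey.current_milli_time(),
                           {"metric": "custom_metric"})

    # Sum over a single row returns the value itself, so the anomaly check
    # effectively tracks the custom metric across runs.
    return (
        VerificationSuite(spark)
        .onData(one_row_df)
        .useRepository(repository)
        .saveOrAppendResult(result_key)
        .addAnomalyCheck(
            RelativeRateOfChangeStrategy(maxRateIncrease=max_rate_increase),
            Sum("custom_metric"),
        )
        .run()
    )
```

A caller would typically create the repository once, for example with `InMemoryMetricsRepository(spark)` from `pydeequ.repository`, and reuse it across runs so the anomaly strategy has earlier values to compare against.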
Additional context
If anybody has hacked this in a better way than the one described under "Describe alternatives you've considered", let us know in the comments!