scienxlab / redflag

Safety net for machine learning pipelines. Plays nice with sklearn and pandas.
https://scienxlab.org/redflag
Apache License 2.0

Accessor for `pd.DataFrame` #36

Open kwinkunks opened 1 year ago

kwinkunks commented 1 year ago

It could be interesting to implement detectors etc. as methods on DataFrames, e.g.:

import pandas as pd

df = pd.read_csv('my_data.csv')

df.rf.find_outliers()  # Proposed accessor method.

How-to: https://pandas.pydata.org/docs/development/extending.html

The same could be done for xarray objects, I guess, but DataFrames are the priority.
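
For reference, a minimal sketch of what the pandas extension how-to describes, using `pd.api.extensions.register_dataframe_accessor`. The class name `RedflagSketchAccessor`, the `threshold` parameter and the z-score rule are assumptions made for illustration only; they are not redflag's actual detectors.

```python
import pandas as pd


@pd.api.extensions.register_dataframe_accessor("rf")
class RedflagSketchAccessor:
    """Hypothetical accessor for illustration; not redflag's implementation."""

    def __init__(self, pandas_obj):
        # pandas passes in the DataFrame the accessor is attached to.
        self._obj = pandas_obj

    def find_outliers(self, threshold=3.0):
        """Return index labels of rows where any numeric column has |z| > threshold."""
        numeric = self._obj.select_dtypes(include="number")
        # Population z-score per column (an assumed, deliberately simple rule).
        z = (numeric - numeric.mean()) / numeric.std(ddof=0)
        mask = (z.abs() > threshold).any(axis=1)
        return self._obj.index[mask]


# Ten ordinary values and one obvious spike in the last row.
df = pd.DataFrame(
    {"GR": [50.0, 52.0, 49.0, 51.0, 50.0, 48.0, 53.0, 50.0, 51.0, 49.0, 500.0]}
)
outliers = df.rf.find_outliers(threshold=2.5)
```

Once the decorated class has been imported, every DataFrame gets the `rf` namespace, so no subclassing of `pd.DataFrame` is needed.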

kwinkunks commented 1 year ago

The API could look like:

df.redflag.cool_method(X=None, y=None, **kwargs)

E.g.

features = ['GR', 'RHOB', 'PE']  # Columns in df.
df.redflag.cool_method(X=features, y='Lithology')  # y can also be a list.
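
One way the `X` and `y` arguments might be resolved inside such a method is sketched below. The helper name `resolve_xy` is hypothetical, and treating `X=None` as "all columns except `y`" is an assumption; the issue does not specify what the defaults should mean.

```python
import pandas as pd


def resolve_xy(df, X=None, y=None):
    """Hypothetical helper: map column names to feature and target data."""
    # Normalize y to a list of column names (empty if no target is given).
    if y is None:
        y_cols = []
    elif isinstance(y, str):
        y_cols = [y]
    else:
        y_cols = list(y)

    # Assumption: with X=None, use every column that is not a target.
    if X is None:
        features = df.drop(columns=y_cols)
    else:
        features = df[[X] if isinstance(X, str) else list(X)]

    # A single name gives a Series; a list gives a DataFrame.
    if y is None:
        target = None
    elif isinstance(y, str):
        target = df[y]
    else:
        target = df[y_cols]
    return features, target


df = pd.DataFrame({
    "GR": [50.0, 80.0, 120.0],
    "RHOB": [2.65, 2.45, 2.30],
    "PE": [1.8, 3.1, 5.0],
    "Lithology": ["sandstone", "shale", "limestone"],
})
features, target = resolve_xy(df, X=["GR", "RHOB", "PE"], y="Lithology")
```

Keeping this logic in one place would let every accessor method accept the same `X` and `y` arguments.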

Then...