Open samukweku opened 2 years ago
`filter_on` also duplicates `query`.
>>> import pandas as pd
>>> import janitor
>>> df = pd.DataFrame({
...     "student_id": ["S1", "S2", "S3"],
...     "score": [40, 60, 85],
... })
>>> df
  student_id  score
0         S1     40
1         S2     60
2         S3     85
>>> df.filter_on("score < 50", complement=False)
  student_id  score
0         S1     40
>>> df.query("score < 50")
  student_id  score
0         S1     40
`remove_columns` duplicates `drop`.
`rename_column` duplicates `rename`.
Good points, y'all.
I'm in favour of dropping `filter_on`. `remove_columns` is explicitly syntactic sugar for `df.drop()`, and `rename_column(s)` is syntactic sugar for `rename`. I guess they can be, drum roll, dropped 😆 too.
Also in favour of the two that you mentioned, @samukweku. They are internal duplications of functionality. We should put in a long, long deprecation warning like we did when `factorize_column` replaced `label_encode`. Deprecate when we hit 1.0!
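For illustration, a long deprecation period could look roughly like the sketch below: the old name stays as a thin alias that warns and then forwards to the new function. `encode_labels_new` and `encode_labels_old` are hypothetical stand-ins, not pyjanitor functions, and using `FutureWarning` here is an assumption.

```python
import warnings


def encode_labels_new(values):
    """Hypothetical replacement: map each distinct value to an integer code."""
    codes = {}
    # setdefault assigns the next free integer the first time a value is seen.
    return [codes.setdefault(v, len(codes)) for v in values]


def encode_labels_old(values):
    """Hypothetical deprecated alias kept alive during the deprecation period."""
    warnings.warn(
        "encode_labels_old is deprecated; use encode_labels_new instead.",
        FutureWarning,
        stacklevel=2,  # attribute the warning to the caller, not to this alias
    )
    return encode_labels_new(values)
```

Existing callers keep working and see the warning on every call until the alias is finally removed.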
A decorator may simplify that.
from functools import wraps
from typing import Optional, Type
from warnings import warn


def warning(
    message: str,
    category: Optional[Type[Warning]] = None,
    stacklevel: int = 1,
    **kwargs
):
    """
    A warning decorator.

    Parameters
    ----------
    message : str
        The warning information shown to the user.
    category : type of Warning, optional
        If given, must be a **warning category class**. It defaults to
        :exc:`UserWarning`.
    stacklevel : int, default 1
        How many frames up the stack the warning is attributed to; the
        default, 1, is the frame that calls ``warn``.
    **kwargs
        See the documentation for :func:`warnings.warn` for complete details
        on the keyword arguments.

    See Also
    --------
    warnings.warn

    Examples
    --------
    >>> from dtoolkit.util._decorator import warning
    >>> @warning("This is a warning message.")
    ... def func(*args, **kwargs):
    ...     ...
    >>> func()
    """

    def decorator(func):
        @wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(*f_args, **f_kwargs):
            # Emit the warning on every call, then run the original function.
            warn(message, category=category, stacklevel=stacklevel, **kwargs)
            return func(*f_args, **f_kwargs)

        return wrapper

    return decorator
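One detail worth checking with a decorator like this: `warn` is called inside the wrapper, so `stacklevel=1` attributes the warning to the wrapper itself, while `stacklevel=2` attributes it to the code that called the decorated function. A small usage sketch, where `deprecated` and `old_add` are hypothetical names and `DeprecationWarning` is an assumed category:

```python
import warnings
from functools import wraps


def deprecated(message, stacklevel=2):
    """Compact, hypothetical variant of the decorator above."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # stacklevel=2 makes the warning point at the caller of `func`.
            warnings.warn(message, DeprecationWarning, stacklevel=stacklevel)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@deprecated("old_add is deprecated; use the + operator instead.")
def old_add(a, b):
    return a + b


# Record warnings instead of printing them, so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    total = old_add(1, 2)

print(total, caught[0].category.__name__, caught[0].message)
```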
`utils` has a `refactored_function` that covers this use case; sometimes I forget that `utils` has some handy functions.
@pyjanitor-devs/core-devs in the spirit of deprecations, I suggest we deprecate `transform_columns` and `transform_column` in favour of `mutate`; it will cover single or multiple transformations.
I'm in favour!
Central point to discuss functions to deprecate, if any?
- `process_text` - `transform_columns` covers this very well
- `impute` vs `fill_empty` - `impute` has the advantage of extra statistics functions (mean, mode, ...)
- `rename_columns` - use pandas `rename`
- `rename_column` - use `pd.DataFrame.rename`
- `remove_columns` - use `pd.DataFrame.drop` or `select`
- `filter_on` - use `query` or `select`
- `fill_direction` - use `transform_columns` or `pd.DataFrame.assign`
- `groupby_agg` - use `transform_columns`, once `by` is implemented
- `then` - use `pd.DataFrame.pipe`
- `to_datetime` - use `jn.transform_columns`
- `pivot_wider` - use `pd.DataFrame.pivot`
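A few of the plain-pandas replacements from the list, sketched on the small frame used earlier in the thread. The pairing in the comments is one reading of the list; the pyjanitor functions may accept arguments that these calls do not mirror exactly.

```python
import pandas as pd

df = pd.DataFrame({
    "student_id": ["S1", "S2", "S3"],
    "score": [40, 60, 85],
})

# rename_column / rename_columns -> DataFrame.rename
renamed = df.rename(columns={"score": "mark"})

# remove_columns -> DataFrame.drop
dropped = df.drop(columns=["score"])

# filter_on -> DataFrame.query
low = df.query("score < 50")

# then -> DataFrame.pipe (applies any function that takes the frame)
piped = df.pipe(lambda d: d.assign(passed=d["score"] >= 50))
```

Each call returns a new DataFrame and leaves `df` unchanged, so the calls chain in the same way the pyjanitor methods do.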