Deprecate functions ? - Githubissues

pyjanitor-devs / pyjanitor

Clean APIs for data cleaning. Python implementation of R package Janitor

https://pyjanitor-devs.github.io/pyjanitor

MIT License

1.36k stars, 169 forks

Deprecate functions ? #1045

Open samukweku opened 2 years ago

samukweku commented 2 years ago

Central point to discuss functions to deprecate, if any?

[x] process_text - transform_columns covers this very well
[x] impute vs fill_empty - impute has the advantage of extra statistics functions (mean, mode, ...)
[x] rename_columns - use pd.DataFrame.rename
[x] rename_column - use pd.DataFrame.rename
[x] remove_columns - use pd.DataFrame.drop or select
[x] filter_on - use pd.DataFrame.query or select
[x] fill_direction - use transform_columns or pd.DataFrame.assign
[x] groupby_agg - use transform_columns - once by is implemented
[x] then - use pd.DataFrame.pipe
[x] to_datetime - use jn.transform_columns
[x] pivot_wider - use pd.DataFrame.pivot
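For reference, several of the plain-pandas replacements named in the list can be sketched as follows. The data frame and the add_passed helper are made up for illustration, and this is not an exhaustive mapping of the deprecated functions.

```python
import pandas as pd

# Made-up example data.
df = pd.DataFrame(
    {
        "student_id": ["S1", "S2", "S3"],
        "subject": ["math", "math", "art"],
        "score": [40, 60, 85],
    }
)

# rename_column / rename_columns -> DataFrame.rename
renamed = df.rename(columns={"score": "points"})

# remove_columns -> DataFrame.drop
dropped = df.drop(columns=["subject"])

# filter_on -> DataFrame.query
low = df.query("score < 50")


# then -> DataFrame.pipe (add_passed is a hypothetical helper)
def add_passed(frame, threshold):
    return frame.assign(passed=frame["score"] >= threshold)


piped = df.pipe(add_passed, threshold=50)

# pivot_wider -> DataFrame.pivot (one row per student, one column per subject)
wide = df.pivot(index="student_id", columns="subject", values="score")
```

Each call returns a new data frame, so the replacements chain in the same way the janitor methods do.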

Zeroto521 commented 2 years ago

filter_on also duplicates query.

>>> import pandas as pd
>>> import janitor
>>> df = pd.DataFrame({
...     "student_id": ["S1", "S2", "S3"],
...     "score": [40, 60, 85],
... })
>>> df
  student_id  score
0         S1     40
1         S2     60
2         S3     85
>>> df.filter_on("score < 50", complement=False)
  student_id  score
0         S1     40
>>> df.query("score < 50")
  student_id  score
0         S1     40
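The complement case can be covered by query as well. A minimal sketch, assuming complement=True is meant to return the rows that do not satisfy the criterion:

```python
import pandas as pd

df = pd.DataFrame({"student_id": ["S1", "S2", "S3"], "score": [40, 60, 85]})

# Rows where the criterion holds (the complement=False case).
matching = df.query("score < 50")

# Rows where it does not hold: negate the criterion inside the query string.
rest = df.query("not (score < 50)")
```

Here matching holds only S1, and rest holds S2 and S3.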

Zeroto521 commented 2 years ago

remove_columns duplicates drop.
rename_column duplicates rename.

ericmjl commented 2 years ago

Good points, y'all.

I'm in favour of dropping filter_on.

remove_columns is explicitly syntactic sugar for df.drop(), and rename_column(s) is syntactic sugar for rename. I guess they can be, drum roll, dropped 😆 too.

Also in favour of the two that you mentioned, @samukweku. They are internal duplications of functionality. We should put in a long, long deprecation warning like we did when factorize_column replaced label_encode. Deprecate when we hit 1.0!

Zeroto521 commented 1 year ago

A decorator may simplify that.

from functools import wraps
from typing import Optional, Type
from warnings import warn

def warning(
    message: str,
    category: Optional[Type[Warning]] = None,
    stacklevel: int = 2,
    **kwargs
):
    """
    A warning decorator.

    Parameters
    ----------
    message : str
        The warning message shown to the user.

    category : type[Warning], optional
        If given, must be a **warning category class**. It defaults to
        :exc:`UserWarning`.

    stacklevel : int, default 2
        Stack level passed to :func:`warnings.warn`. The default of 2
        attributes the warning to the caller of the decorated function
        rather than to the wrapper itself.

    **kwargs
        See the documentation for :func:`warnings.warn` for complete details on
        the keyword arguments.

    See Also
    --------
    warnings.warn

    Examples
    --------
    >>> from dtoolkit.util._decorator import warning
    >>> @warning("This is a warning message.")
    ... def func(*args, **kwargs):
    ...     ...
    >>> func()
    """

    def decorator(func):
        @wraps(func)
        def wrapper(*f_args, **f_kwargs):
            warn(message, category=category, stacklevel=stacklevel, **kwargs)

            return func(*f_args, **f_kwargs)

        return wrapper

    return decorator
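A possible way to apply such a decorator to a deprecated function, and to check that the warning actually fires, is sketched below. The decorator is restated in compact form so the snippet runs on its own. old_helper and new_helper are hypothetical stand-ins, not pyjanitor functions.

```python
import warnings
from functools import wraps


def warning(message, category=None, stacklevel=2):
    """Compact restatement of the decorator above, for a self-contained demo."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # stacklevel=2 attributes the warning to the caller of the wrapper.
            warnings.warn(message, category=category, stacklevel=stacklevel)
            return func(*args, **kwargs)

        return wrapper

    return decorator


def new_helper(x):
    # Hypothetical replacement function.
    return x * 2


@warning(
    "old_helper is deprecated; use new_helper instead.",
    category=DeprecationWarning,
)
def old_helper(x):
    # Hypothetical deprecated function that forwards to its replacement.
    return new_helper(x)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    result = old_helper(21)

print(result)                        # 42
print(caught[0].category.__name__)   # DeprecationWarning
print(old_helper.__name__)           # old_helper, preserved by functools.wraps
```

Recording warnings with warnings.catch_warnings(record=True) is also a convenient way to assert in a test suite that a deprecated entry point still warns.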

samukweku commented 1 year ago

utils has a refactored_function that covers this use case - sometimes I forget that utils has some handy functions

samukweku commented 1 year ago

@pyjanitor-devs/core-devs in the spirit of deprecations, I suggest we deprecate transform_columns and transform_column in favour of mutate - it will cover single or multiple transformations

ericmjl commented 1 year ago

I'm in favour!