pyjanitor-devs / pyjanitor

Clean APIs for data cleaning. Python implementation of R package Janitor
https://pyjanitor-devs.github.io/pyjanitor
MIT License

[EHN] let `column_name` or `column_names` support callback type #1115

Open Zeroto521 opened 2 years ago

Zeroto521 commented 2 years ago

Brief Description

Many functions have an option called `column_name` or `column_names`, which is used to select columns of `df`. Its type can be a single value (`Hashable`) or a list of values (`Iterable[Hashable]`). The idea here is to also support a callback type.

_Originally posted by @ericmjl in https://github.com/pyjanitor-devs/pyjanitor/pull/1112#discussion_r895257358_

Example API

An implicit style, and also a handy trick for selecting columns.

| Select columns | Using callable type | Using Iterable type | df.columns |
| --- | --- | --- | --- |
| Select the first three columns | `lambda df: df.columns[:3]` | `['a', 'b', 'c']` | `pd.Index(list('abcde'))` |
| Select str type columns | `lambda df: [i for i in df.columns if isinstance(i, str)]` | `['a', 'b', 'c']` | `pd.Index(['a', 'b', 'c', 1])` |

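
A minimal sketch of how such a callback could be resolved, not pyjanitor's actual implementation. The helper name `resolve_column_names` is hypothetical, and a `SimpleNamespace` with a `columns` attribute stands in for a DataFrame so the snippet runs without pandas. Tuple labels (as in a `MultiIndex`) are not handled here.

```python
from collections.abc import Iterable
from types import SimpleNamespace


def resolve_column_names(df, column_names):
    """Hypothetical helper: normalise a column spec to a list of labels."""
    # Callback type: call it with the frame, then resolve what it returns.
    if callable(column_names):
        column_names = column_names(df)
    # Single label. Strings are iterable, so they are checked explicitly.
    if isinstance(column_names, str) or not isinstance(column_names, Iterable):
        return [column_names]
    # Iterable of labels (list, slice of df.columns, generator, ...).
    return list(column_names)


# Stand-ins for DataFrames; only the `.columns` attribute is used.
frame_abcde = SimpleNamespace(columns=list("abcde"))
frame_mixed = SimpleNamespace(columns=["a", "b", "c", 1])

first_three = resolve_column_names(frame_abcde, lambda df: df.columns[:3])
str_only = resolve_column_names(
    frame_mixed, lambda df: [i for i in df.columns if isinstance(i, str)]
)
single = resolve_column_names(frame_mixed, "a")
explicit = resolve_column_names(frame_mixed, ["a", "b"])
```

Because the helper only reads `df.columns` through the callback, the same lambdas from the table should work unchanged when a real DataFrame is passed in.
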
samukweku commented 2 years ago

For this, we can pass callables to `select_columns`.