jekwatt / idiomatic_pandas

Tips and tricks for the most common data handling tasks with pandas.

apply, map, applymap, replace #16

Open jekwatt opened 3 years ago

jekwatt commented 3 years ago

These will go under notebooks/03_modifying.

Related code: http://bit.ly/Pandas-05

jekwatt commented 3 years ago

apply() can be used on a DataFrame or a Series, but the results are different:

DataFrame:

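df.apply(len)  # each column is passed to len: number of rows per column
df.apply(len, axis="columns")  # each row is passed to len: number of columns per row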

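Series:

df["col_1"].apply(len)  # each value is passed to len: length of every value
len(df["col_1"])  # for comparison: number of rows in the Series

DataFrame, passing a Series method:

df.apply(pd.Series.min)  # minimum of each column
df.apply(lambda x: x.min())  # same as above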
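To make the difference concrete, here is a minimal sketch on a small made-up DataFrame. The data and the variable names are assumptions for illustration only; `col_1` follows the naming used above.

```python
import pandas as pd

# Hypothetical sample data, not taken from the notebooks.
df = pd.DataFrame({"col_1": ["a", "bb", "ccc"], "col_2": ["dddd", "e", "ff"]})

# DataFrame.apply hands each column (a Series) to the function.
rows_per_column = df.apply(len)               # one count per column
# With axis="columns", each row is handed to the function instead.
cols_per_row = df.apply(len, axis="columns")  # one count per row

# Series.apply hands each individual value to the function.
chars_per_value = df["col_1"].apply(len)

# len() on the Series itself only counts its rows.
n_rows = len(df["col_1"])

# Column-wise minimum, written two equivalent ways.
min_a = df.apply(pd.Series.min)
min_b = df.apply(lambda x: x.min())
```

The same function (`len`) therefore answers three different questions depending on whether it receives a column, a row, or a single value.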
jekwatt commented 3 years ago

applymap() only works on a DataFrame. Use applymap() to apply a function to every individual element/value in the DataFrame.

e.g.
df.applymap(len)
df.applymap(str.lower)  # pass the function itself, not a call like str.lower()
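A small runnable sketch of element-wise application, using made-up data. One caveat worth knowing: pandas 2.1 deprecated `DataFrame.applymap` in favor of `DataFrame.map`, so the sketch picks whichever method the installed version provides. The `elementwise` name is just a local helper for this example.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({"col_1": ["Ab", "CD"], "col_2": ["eF", "GH"]})

# Use DataFrame.map where it exists (pandas >= 2.1), else fall back to applymap.
elementwise = df.map if hasattr(pd.DataFrame, "map") else df.applymap

lengths = elementwise(len)        # length of every cell
lowered = elementwise(str.lower)  # every cell lower-cased
```

Both calls return a new DataFrame with the same shape as `df`; the original is left unchanged.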
jekwatt commented 3 years ago

map() only works on a Series. Use map() to substitute each value in the Series with another value. Pass a dictionary of the values you want to substitute:

# map method does not have inplace argument
d1 = {"k1": "v1", "k2": "v2", "k3": "v3"}
df["col_1"] = df["col_1"].map(d1)

[warning] Values that are not keys in the dictionary will be converted to NaN. To keep them instead, use the replace() method.
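The NaN behavior is easy to see on a tiny Series. This is a sketch with made-up values; the `"other"` entry is an assumption added to show a value that has no key in `d1`.

```python
import pandas as pd

# Hypothetical Series: "other" is deliberately missing from the dictionary.
s = pd.Series(["k1", "k2", "other"])
d1 = {"k1": "v1", "k2": "v2", "k3": "v3"}

mapped = s.map(d1)         # "other" has no key in d1, so it becomes NaN

# One possible way to keep unmapped values: fill the gaps from the original.
kept = s.map(d1).fillna(s)
```

Here `mapped` ends with a missing value, while `kept` still contains `"other"`.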

jekwatt commented 3 years ago

To substitute some values and keep the rest unchanged, use the replace() method. replace() returns a new Series, so to update the actual column you need to assign the result back:

d1 = {"k1": "v1", "k2": "v2", "k3": "v3"}
df["col_1"] = df["col_1"].replace(d1)
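A minimal sketch of the same assignment on made-up data, to show that values without a key in `d1` survive. The sample DataFrame is an assumption for illustration.

```python
import pandas as pd

# Hypothetical data: "other" is not a key in d1.
df = pd.DataFrame({"col_1": ["k1", "k2", "other"]})
d1 = {"k1": "v1", "k2": "v2", "k3": "v3"}

# replace() swaps only the values found in d1 and leaves the rest as they are.
df["col_1"] = df["col_1"].replace(d1)
```

After this, `col_1` holds `v1`, `v2`, and the untouched `other`.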
jekwatt commented 3 years ago

To apply a change such as a rename directly to the DataFrame, set inplace equal to True:

df.rename(columns=d1, inplace=True)
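As a sketch of what `inplace=True` changes, the example below renames columns both ways. The column names are made up so that they match the keys of `d1`; this is an illustration, not the notebook's data.

```python
import pandas as pd

# Hypothetical DataFrame whose column names match some keys of d1.
df = pd.DataFrame({"k1": [1], "k2": [2], "other": [3]})
d1 = {"k1": "v1", "k2": "v2", "k3": "v3"}

# Without inplace, rename returns a new DataFrame and df is untouched.
renamed = df.rename(columns=d1)
columns_before = list(df.columns)

# With inplace=True, df itself is modified and the call returns None.
result = df.rename(columns=d1, inplace=True)
columns_after = list(df.columns)
```

Assigning the result back (`df = df.rename(columns=d1)`) has the same effect as the in-place call.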