How embarrassing - I didn't even follow my own instructions. The revenue column should be added to a copy of data.train, not to the dataframe itself. As written, find_customer_revenue_percentiles mutates data.train, so revenue is still present in the dataframe after the call.
To recreate:
print('Revenue before?', 'revenue' in data.train.columns)
percentile_values = explore_utils.find_customer_revenue_percentiles(
    data,
    percentiles)
print('Revenue after?', 'revenue' in data.train.columns)
https://github.com/MichiganDataScienceTeam/googleanalytics/blob/fb6650b0bfdfa5e536af2c6db30c9c1a9f6e1cd1/explore_utils.py#L40
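A minimal sketch of the difference, assuming pandas. It does not reproduce the real explore_utils code: the two functions, the visitor_id and raw_revenue columns, and the toy data are all hypothetical and only illustrate why copying the frame first keeps revenue out of the caller's dataframe.

```python
import pandas as pd


def percentiles_mutating(train, percentiles):
    # Hypothetical stand-in for the bug: the new column is written
    # directly into the caller's dataframe.
    train["revenue"] = train["raw_revenue"].fillna(0.0)
    return train.groupby("visitor_id")["revenue"].sum().quantile(percentiles)


def percentiles_copying(train, percentiles):
    # Possible fix: rebind the local name to a copy, so the caller's
    # dataframe is left untouched.
    train = train.copy()
    train["revenue"] = train["raw_revenue"].fillna(0.0)
    return train.groupby("visitor_id")["revenue"].sum().quantile(percentiles)


# Toy data with hypothetical column names.
train = pd.DataFrame({
    "visitor_id": ["a", "a", "b", "c"],
    "raw_revenue": [10.0, None, 5.0, None],
})

percentiles_copying(train, [0.5, 0.9])
print("Revenue after copying version?", "revenue" in train.columns)   # False

percentiles_mutating(train, [0.5, 0.9])
print("Revenue after mutating version?", "revenue" in train.columns)  # True
```

The copy costs extra memory on a large training set; if that matters, computing the filled column as a standalone Series and grouping it by the id column would avoid both the copy and the mutation.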