Open UKVeteran opened 2 years ago
https://towardsdatascience.com/data-transformation-and-feature-engineering-e3c7dfbb4899
Destin Gong
Why do we need data transformation?
```python
## data scaling methods ##
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler, RobustScaler, StandardScaler

# df is assumed to be the article's pandas DataFrame containing these columns
scale_var = ['Enrollment_Length', 'Recency', 'NumStorePurchases',
             'clipped_Age', 'clipped_NumWebVisitsMonth']
scalers_list = [StandardScaler(), RobustScaler(), MinMaxScaler()]

for scaler in scalers_list:
    # one figure per scaler, one histogram per scaled variable
    fig = plt.figure(figsize=(26, 5))
    fig.suptitle(type(scaler).__name__, fontsize=20)
    for j, var in enumerate(scale_var):
        scaled_var = "scaled_" + var
        # scalers expect a 2D array, hence the reshape; ravel() flattens the result
        # note: each scaler overwrites the scaled_ column written by the previous one
        df[scaled_var] = scaler.fit_transform(df[var].values.reshape(-1, 1)).ravel()
        sub = fig.add_subplot(1, len(scale_var), j + 1)
        sub.set_xlabel(var)
        df[scaled_var].plot(kind='hist', ax=sub)
```
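To make the difference between the three scalers in the snippet above concrete, here is a minimal standard-library sketch of the formulas they apply with default settings: z-score for `StandardScaler`, median/IQR for `RobustScaler`, and min/max for `MinMaxScaler`. The helper names (`standard_scale`, `robust_scale`, `min_max_scale`) and the sample values are hypothetical and not from the article; this is an illustration of the arithmetic, not scikit-learn's implementation.

```python
import statistics


def standard_scale(values):
    # z-score: subtract the mean, divide by the population standard deviation
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def robust_scale(values):
    # subtract the median, divide by the interquartile range (Q3 - Q1)
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [(v - median) / (q3 - q1) for v in values]


def min_max_scale(values):
    # map the minimum to 0 and the maximum to 1
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical single feature with one outlier (100.0)
sample = [1.0, 2.0, 3.0, 4.0, 100.0]

for scale in (standard_scale, robust_scale, min_max_scale):
    print(scale.__name__, [round(v, 3) for v in scale(sample)])
```

On this sample the outlier squeezes the min-max output for the other four points close to 0, while the robust version keeps them spread between -1 and 0.5. That is one reason to compare the histograms per scaler before choosing one.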
Thank you for contributing! This looks very useful!