louismartin / email-classification-challenge

ALTEGRAD challenge in collaboration with Linagora
https://inclass.kaggle.com/c/master-data-science-mva-data-competition-2017

Refactor #21

Closed louismartin closed 7 years ago

louismartin commented 7 years ago

Modify the cleaning methods to take a single string as argument instead of a whole column, for code reusability (especially for the graph of words).

You need to update your code from

df_train["clean body"] = clean(df_train["body"], add_book)

to

df_train["clean body"] = df_train["body"].apply(lambda x: clean(x, add_book))
df_train["clean body"] = df_train["clean body"].fillna("")
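
For illustration, here is a rough sketch of what a string-level cleaner could look like and why the fill step may still be needed. This is not the repository's actual `clean`: the body of the function, the treatment of `add_book` as a set of tokens to drop, and the sample data are all assumptions made for the example. A plain Python list stands in for the `body` column so the sketch runs without pandas.

```python
import re


def clean(text, add_book):
    """Hypothetical string-level cleaner: one string in, one string (or None) out."""
    # Missing bodies are not strings (pandas stores them as NaN floats),
    # so return None for them; this is why a fill step is needed afterwards.
    if not isinstance(text, str):
        return None
    # Lowercase and keep alphanumeric tokens only.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    # Assumption: add_book is a collection of tokens to drop.
    return " ".join(t for t in tokens if t not in add_book)


# Hypothetical sample data, standing in for df_train["body"].
add_book = {"alice", "bob"}
bodies = ["Hi Bob, see the REPORT!", float("nan"), "Thanks, Alice."]

# Plain-Python equivalent of .apply(lambda x: clean(x, add_book)) ...
cleaned = [clean(b, add_book) for b in bodies]
# ... followed by .fillna("").
clean_bodies = ["" if c is None else c for c in cleaned]

# The same function can now be reused on any single string,
# e.g. to get the tokens for a graph of words.
window_tokens = clean("Graph of words, Bob", add_book).split()
```

Since the function no longer depends on a pandas Series, the same code path serves both the column-wise `apply` and any other caller that only has one string at hand.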