Nthiki / nlp-sdg

Team B: Implement data cleaning in data_engineering pipeline #22

Open Luka-Explore opened 2 years ago

Luka-Explore commented 2 years ago

In the data_engineering pipeline, implement the clean_tweet, token_stop_pos, and lemmatize functions. When this is done, the pipeline should take raw text (string) data as input and output a cleaned version as a dataframe, stored in Kedro's intermediate data layer.
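
For anyone picking this up, here is a rough sketch of how the three functions could chain together. It is not the project's implementation: it uses only the standard library so it runs anywhere, the stopword list is a tiny made-up sample, the POS tag is a placeholder, and the lemmatizer is a naive plural stripper standing in for a real one (for example from NLTK). The helper `clean_texts` and the column names `text` and `clean_text` are hypothetical.

```python
import re

# Tiny illustrative stopword list (assumption); a real pipeline would use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on", "for", "we", "our"}


def clean_tweet(text: str) -> str:
    """Lowercase the text and strip URLs, @mentions, '#' signs, and non-letters."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)                   # drop @mentions
    text = text.replace("#", " ")                       # keep the hashtag word, drop '#'
    text = re.sub(r"[^a-z\s]", " ", text)               # drop digits and punctuation
    return re.sub(r"\s+", " ", text).strip()            # collapse whitespace


def token_stop_pos(text: str) -> list:
    """Split on whitespace, drop stopwords, and pair each token with a POS tag.

    Placeholder: every token is tagged "n" (noun). A real POS tagger would go here.
    """
    return [(tok, "n") for tok in text.split() if tok not in STOPWORDS]


def lemmatize(pos_data) -> str:
    """Naive stand-in for a lemmatizer: strip a trailing plural 's' from longer tokens."""
    lemmas = []
    for tok, _pos in pos_data:
        if len(tok) > 3 and tok.endswith("s") and not tok.endswith("ss"):
            tok = tok[:-1]
        lemmas.append(tok)
    return " ".join(lemmas)


def clean_texts(texts) -> list:
    """Hypothetical helper: run all three steps and return one record per input string."""
    return [
        {"text": t, "clean_text": lemmatize(token_stop_pos(clean_tweet(t)))}
        for t in texts
    ]


if __name__ == "__main__":
    for row in clean_texts(["Check https://t.co/x @UN #ClimateAction goals!!"]):
        print(row)
```

The list of records returned by `clean_texts` can be passed to `pandas.DataFrame(...)` inside the Kedro node, and the node's output would then be registered in the data catalog under the intermediate layer (by Kedro convention, `data/02_intermediate`).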

Deadline: Monday end of day 26 September 2022