rebeccajohnson88 / PPOL564_slides_activities

Repo for Georgetown McCourt's School of Public Policy's Data Science I (PPOL 564)
Creative Commons Zero v1.0 Universal

2.1 - A and B #55

Open sonali-sr opened 1 year ago

sonali-sr commented 1 year ago

2.1 Preprocess the data by removing stopwords, punctuation, and non-alpha words (5 points)

A. Write a function that:

B. Use apply or list comprehension to execute that function and create a new column in the data called processed_text.

Note: there will be a deduction if your code uses a regular for loop with append instead of a list comprehension.
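A minimal sketch of what parts A and B could look like, offered as an illustration and not as the official solution. The exact steps for the function are not listed in this thread, so several things here are assumptions: whitespace tokenization, lowercasing before matching, a tiny hard-coded `STOPWORDS` set standing in for a fuller list, and a hypothetical `min_len` parameter for a minimum word length of 4.

```python
import string

# Hypothetical, deliberately tiny stopword set for illustration only;
# a real solution would likely use a fuller list.
STOPWORDS = {"the", "a", "an", "was", "is", "and", "of", "to", "in"}


def preprocess(text, min_len=4):
    """Return the surviving tokens of `text` joined by single spaces."""
    # Lowercase and split on whitespace (assumed tokenization).
    tokens = text.lower().split()
    # Strip leading and trailing punctuation from each token.
    tokens = [tok.strip(string.punctuation) for tok in tokens]
    # Keep tokens that are alphabetic, not stopwords, and long enough.
    kept = [
        tok for tok in tokens
        if tok.isalpha() and tok not in STOPWORDS and len(tok) >= min_len
    ]
    return " ".join(kept)


# Part B with a list comprehension (no for loop with append).
texts = ["The CEO was indicted.", "Fraud charges: 3 counts filed in 2019!"]
processed = [preprocess(t) for t in texts]
print(processed)
```

With a pandas DataFrame `df` that stores the raw text in a column named, say, `text` (a hypothetical name), the same idea would be `df["processed_text"] = [preprocess(t) for t in df["text"]]` or `df["processed_text"] = df["text"].apply(preprocess)`.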

Resources:

sanhatahir commented 1 year ago

Just a quick question: with your given example of "The CEO was indicted", wouldn't "CEO" be removed by the 4th condition, which keeps only words with length >= 4?

rebeccajohnson88 commented 1 year ago


Yes, good flag! That was just an example :)