Open sonali-sr opened 1 year ago
Just a quick question: with your given example of "The CEO was indicted", wouldn't "CEO" be removed because of the 4th condition (words with length >= 4)?
Yes, good flag! That was just an example :)
2.1 Preprocess the data by removing stopwords, punctuation, and non-alpha words (5 points)
A. Write a function that:
`contents` column from that dataframe
B. Use `apply` or list comprehension to execute that function and create a new column in the data called `processed_text`.
Note: there will be a deduction if your code uses a non-list-comprehension `for` loop that uses `append`.
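As a rough illustration of the kind of function part A describes, here is a minimal sketch using only the standard library. The `preprocess` name, the `min_len` parameter, and the tiny `STOPWORDS` set are all hypothetical stand-ins (a real solution would more likely use a full stopword list such as NLTK's), and it assumes the length condition means keeping only words with at least 4 characters. Stemming is left out.

```python
import string

# Hypothetical, deliberately tiny stopword list for illustration only;
# swap in a full list (e.g. NLTK's English stopwords) for real use.
STOPWORDS = {"the", "was", "a", "an", "and", "of", "in", "to", "is", "by"}

# Translation table that turns every punctuation character into a space.
PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(text, min_len=4):
    """Lowercase, drop punctuation, then keep alphabetic non-stopwords
    that are at least `min_len` characters long (assumed threshold)."""
    tokens = text.lower().translate(PUNCT_TO_SPACE).split()
    kept = [
        tok for tok in tokens
        if tok.isalpha() and tok not in STOPWORDS and len(tok) >= min_len
    ]
    return " ".join(kept)


# Part B with a plain list of strings and a list comprehension.
docs = ["The CEO was indicted.", "Fraud charges filed in 2019, officials said!"]
processed = [preprocess(doc) for doc in docs]

# With a pandas dataframe `df` that has a `contents` column, the equivalent
# would be: df["processed_text"] = df["contents"].apply(preprocess)
```

Under these assumptions, "The CEO was indicted." reduces to just `indicted`: "the" and "was" are stopwords, and "ceo" is dropped by the length condition, which matches the point raised in the comment above.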
Resources:
Here are code examples for the Snowball stemmer: https://www.geeksforgeeks.org/snowball-stemmer-nlp/
Here's code with topic modeling steps: https://github.com/rebeccajohnson88/PPOL564_slides_activities/blob/main/activities/fall_22/solutions/09_textasdata_partII_topicmodeling_solution.ipynb