Open dingdinger1008 opened 2 years ago
Yes, further cleaning is needed. To keep the text analysis accurate, I removed all non-English characters. This step leaves runs of consecutive spaces, so each run should be replaced with a single space. After that, stop words should be deleted and words restored to their base forms (lemmatization), so that the subsequent NLP analysis stays accurate.
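The cleaning steps described above could be sketched roughly as follows. This is only an illustration, not the code actually used: the function name `clean_text`, the tiny `STOP_WORDS` set, and the `LEMMAS` lookup table are all hypothetical stand-ins, and "restore parts of speech" is read here as lemmatization. A real pipeline would likely swap in a full stop-word list and a proper lemmatizer.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "the", "and", "or", "of", "in", "on", "to", "is", "are"}

# Hypothetical lookup table standing in for a real lemmatizer.
LEMMAS = {"shares": "share", "companies": "company", "rose": "rise"}


def clean_text(text):
    """Return a list of cleaned, lower-cased, lemmatized tokens."""
    # Step 1: replace every character that is not an English letter with a space.
    # This is the step that leaves runs of consecutive spaces behind.
    letters_only = re.sub(r"[^A-Za-z]", " ", text)

    # Step 2: collapse each run of whitespace into a single space and trim the ends.
    collapsed = re.sub(r"\s+", " ", letters_only).strip().lower()

    # Step 3: drop stop words (and the empty token produced by an empty string).
    tokens = [t for t in collapsed.split(" ") if t and t not in STOP_WORDS]

    # Step 4: map each token to its base form when the toy table knows it.
    return [LEMMAS.get(t, t) for t in tokens]


if __name__ == "__main__":
    print(clean_text("The shares of 3 companies rose 12%  in 2020!"))
```

Removing characters first and collapsing spaces second keeps the two concerns separate, which makes it easy to inspect the intermediate text if some meaningless content still slips through.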
The initial text clean-up sample shows that, after punctuation and numbers are removed in the clean-up step, some meaningless content still remains. Does it need further cleaning?