dscpesu / SimpleML

🚀 SimpleML: Dive into the exciting world of machine learning with our open-source repository, offering a diverse array of projects suitable for beginners and experts alike.
https://gdscpesu.com/
MIT License

Amazon Reviews Sentiment Analysis: Clean and prepare the dataset by following practices such as removing stop words, non-alphanumeric characters, etc. #8

Closed: asphytheghoul closed this issue 11 months ago

asphytheghoul commented 1 year ago

Field Description
About A short Description about project
Github Your Github name
Email
Label Update request

Define You

Is your feature request related to a problem? Please describe.

Describe the solution you'd like...

Describe alternatives you've considered?

Approach to be followed (optional):

Additional context

saqlain2204 commented 12 months ago

Solution: The data contains stop words that have no direct correlation with sentiment, and non-alphanumeric characters play a minimal role in a text-dominated review. To improve data quality, it's advisable to remove stop words, emojis, and other non-alphanumeric characters, normalize letter casing, and delete extra spaces. I would like to work on this issue. Can you assign it to me?
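For reference, the cleaning steps proposed above could be sketched roughly as follows. This is only a minimal illustration using the Python standard library, not the code that was merged for this issue: the function name `clean_review` and the tiny `STOP_WORDS` set are hypothetical, and a real pipeline would likely use a fuller stop-word list from an NLP library.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only.
# Negations such as "not" are left out on purpose, since they can carry sentiment.
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "and", "to", "of", "i"}


def clean_review(text: str) -> str:
    """Lowercase a review, strip non-alphanumeric characters, and drop stop words."""
    # Normalize letter casing.
    text = text.lower()
    # Replace anything that is not a letter, digit, or whitespace with a space.
    # This also removes emojis and punctuation.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # split() with no arguments discards runs of whitespace, so extra spaces
    # disappear when the remaining tokens are joined back together.
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


if __name__ == "__main__":
    print(clean_review("This   phone is GREAT!!! 😍 Battery lasts 2 days."))
```

Note that stripping all non-alphanumeric characters also splits contractions ("don't" becomes "don t"), so whether to handle apostrophes separately is a design choice worth discussing before applying this to the full dataset.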