nikwilms / ESG-Score-Prediction-from-Sustainability-Reports

This repository contains code and data for a machine learning model that predicts ESG (Environmental, Social, and Governance) scores based on sustainability reports and company data. It's a valuable resource for researchers, investors, and sustainability professionals interested in ESG score prediction using machine learning techniques.
MIT License

Stemming/Lemmatization #17

Closed mariusbosch closed 10 months ago

mariusbosch commented 10 months ago

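Stemming reduces words to a root form by stripping suffixes with heuristic rules. For example, "running" and "runs" both become "run". The result is not always a dictionary word, and irregular forms such as "ran" are typically left unchanged. Lemmatization is similar but reduces words to their base or dictionary form (the lemma). For instance, "is", "am", and "are" become "be", and "ran" becomes "run". You can use libraries like NLTK or spaCy for this. Choose either stemming or lemmatization based on the context of your analysis: stemming is faster but cruder, while lemmatization is more accurate but relies on a vocabulary and, often, part-of-speech information.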
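
To make the difference concrete, here is a minimal, dependency-free sketch. It is a toy illustration only, not the algorithm used by NLTK or spaCy: the names `toy_stem`, `toy_lemmatize`, `SUFFIXES`, and `LEMMA_TABLE` are hypothetical, the suffix list is deliberately tiny, and the lookup table stands in for a real lemmatizer's dictionary.

```python
# Toy illustration only; a real pipeline would use NLTK or spaCy instead.

# Hypothetical, deliberately tiny suffix list for a crude rule-based stemmer.
SUFFIXES = ("ing", "ed", "s")

# Hypothetical lookup table standing in for a lemmatizer's dictionary.
LEMMA_TABLE = {"is": "be", "am": "be", "are": "be", "ran": "run"}


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, then undo a doubled final consonant."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words like "is" survive.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            # "running" -> "runn" -> "run"; leave "pass", "fall" etc. alone.
            if len(word) >= 2 and word[-1] == word[-2] and word[-1] not in "aeiouls":
                word = word[:-1]
            break
    return word


def toy_lemmatize(word: str) -> str:
    """Look the word up in the table; return it unchanged if it is unknown."""
    word = word.lower()
    return LEMMA_TABLE.get(word, word)


if __name__ == "__main__":
    for w in ("running", "runs", "ran", "is", "are"):
        print(w, "| stem:", toy_stem(w), "| lemma:", toy_lemmatize(w))
```

The rule-based stemmer handles regular suffixes but cannot relate "ran" to "run", whereas the dictionary lookup can. In practice, NLTK's `PorterStemmer` and `WordNetLemmatizer`, or the `lemma_` attribute on spaCy tokens, would replace these toy functions.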