nikwilms / ESG-Score-Prediction-from-Sustainability-Reports

This repository contains code and data for a machine learning model that predicts ESG (Environmental, Social, and Governance) scores based on sustainability reports and company data. It's a valuable resource for researchers, investors, and sustainability professionals interested in ESG score prediction using machine learning techniques.
MIT License
15 stars 2 forks

Text extraction from PDFs #7

Closed mariusbosch closed 10 months ago

mariusbosch commented 10 months ago

Convert the PDFs into plain text. You can use libraries such as PyPDF2, pdfminer, or pdfplumber for this purpose in Python. Store the extracted text for each PDF in a structured format (e.g., CSV or a database).
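
A minimal sketch of this step, assuming pdfplumber is the chosen library and CSV is the storage format (the issue leaves both open). The function names `extract_text_pdfplumber` and `pdfs_to_csv` and the `filename`/`text` columns are hypothetical, not taken from the repository.

```python
import csv
import sys
from pathlib import Path


def extract_text_pdfplumber(pdf_path):
    """Return the text of every page of one PDF, joined by newlines."""
    import pdfplumber  # third-party, assumed installed: pip install pdfplumber

    with pdfplumber.open(str(pdf_path)) as pdf:
        # extract_text() can return None for pages with no text layer
        # (e.g. scanned images), so fall back to an empty string.
        pages = [page.extract_text() or "" for page in pdf.pages]
    return "\n".join(pages)


def pdfs_to_csv(pdf_dir, csv_path, extract=extract_text_pdfplumber):
    """Write one CSV row (filename, text) per PDF found in pdf_dir.

    `extract` is any callable mapping a PDF path to a string, so the
    backend (PyPDF2, pdfminer, pdfplumber) can be swapped without
    touching the storage code. Returns the number of rows written.
    """
    written = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["filename", "text"])  # hypothetical column names
        for pdf_path in sorted(Path(pdf_dir).glob("*.pdf")):
            try:
                text = extract(pdf_path)
            except Exception as exc:
                # Skip unreadable or corrupt files instead of aborting the run.
                print(f"skipping {pdf_path.name}: {exc}", file=sys.stderr)
                continue
            writer.writerow([pdf_path.name, text])
            written += 1
    return written
```

Calling `pdfs_to_csv("reports", "extracted_text.csv")` (both paths are placeholders) would produce one row per report. Sustainability reports can be long, so a database or one text file per PDF may scale better than a single CSV; the `extract` parameter keeps that decision separate from the choice of PDF library.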