boun-tabi-LMG / turkish-academic-text-harvest

MIT License

turkish-academic-text-harvest

This repository contains scripts for downloading articles from DergiPark, a platform that hosts Turkish academic journals, as well as Turkish theses. It also converts the downloaded PDF files to text and filters the results to produce a dataset for further analysis and research.
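
To make the filtering step more concrete, here is a minimal sketch of how extracted texts might be screened before they enter a dataset. It is not the repository's actual filter: the function names (`normalize_whitespace`, `keep_document`, `filter_corpus`) and both thresholds are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical thresholds; the repository's actual filtering rules may differ.
MIN_CHARS = 200
MIN_LETTER_RATIO = 0.7


def normalize_whitespace(text: str) -> str:
    """Collapse runs of whitespace that PDF extraction often leaves behind."""
    return re.sub(r"\s+", " ", text).strip()


def keep_document(text: str,
                  min_chars: int = MIN_CHARS,
                  min_letter_ratio: float = MIN_LETTER_RATIO) -> bool:
    """Return True if the text is long enough and consists mostly of letters and spaces."""
    cleaned = normalize_whitespace(text)
    if len(cleaned) < min_chars:
        return False
    # str.isalpha() is Unicode-aware, so Turkish letters such as ç, ş, ı, ğ count.
    letters = sum(ch.isalpha() or ch.isspace() for ch in cleaned)
    return letters / len(cleaned) >= min_letter_ratio


def filter_corpus(texts):
    """Keep only the documents that pass keep_document, with whitespace normalized."""
    return [normalize_whitespace(t) for t in texts if keep_document(t)]
```

A text made up mostly of digits and table separators, or one shorter than the length threshold, would be dropped, while ordinary running prose would be kept.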

The repository is organized into the following directories: