Closed by gokceuludogan 1 year ago
The langid library has replaced langdetect, which resolves the misidentification of uppercase text. Author names and journal titles are now handled with occurrence-based metrics, so section names are kept intact.
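To illustrate the occurrence-based idea, here is a minimal sketch, not the project's actual implementation. It assumes that running headers such as journal titles and author names repeat many times within a document, while section names appear only once or twice. The function name `drop_repeated_lines`, the `max_occurrences` threshold, and the sample lines are all hypothetical.

```python
from collections import Counter


def normalize(line):
    # Collapse whitespace and ignore case so repeated headers compare equal.
    return " ".join(line.split()).casefold()


def drop_repeated_lines(lines, max_occurrences=2):
    """Drop non-empty lines that occur more than max_occurrences times.

    Hypothetical heuristic: frequent lines are treated as running headers
    (journal title, author names); rare lines, including uppercase section
    names, are kept.
    """
    counts = Counter(normalize(line) for line in lines if line.strip())
    return [
        line
        for line in lines
        if not line.strip() or counts[normalize(line)] <= max_occurrences
    ]


# Made-up sample: "ÖRNEK DERGİ" stands in for a journal title repeated on every page.
sample = [
    "ÖRNEK DERGİ",
    "GİRİŞ",
    "Bu çalışmada bir yöntem sunulur.",
    "ÖRNEK DERGİ",
    "YÖNTEM",
    "Veri kümesi tanıtılır.",
    "ÖRNEK DERGİ",
]

cleaned = drop_repeated_lines(sample)
print(cleaned)
```

Because the decision depends on frequency rather than on letter case, uppercase section names such as "GİRİŞ" survive, while the repeated header is removed. The threshold would need tuning for real documents.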
The langdetect library misidentifies uppercase text as non-Turkish, which could be considered a bug. However, this behavior can be useful for filtering out author names and journal titles, at the cost of also dropping section names.