opendatalab / MinerU

A one-stop, open-source, high-quality data extraction tool for converting PDF to Markdown and JSON.
https://opendatalab.com/OpenSourceTools?tool=extract
GNU Affero General Public License v3.0
18.18k stars 1.3k forks

Request for Bengali Language Support in OCR #1032

Open raselmeya94 opened 3 days ago

raselmeya94 commented 3 days ago

Hello MinerU maintainers,

First of all, thank you for creating and maintaining this amazing OCR project! It is truly impressive and has a lot of potential for multilingual applications.

I noticed that the project currently does not support Bengali, one of the world's top 10 most spoken languages. Bengali is spoken by over 230 million people globally, and its inclusion would greatly expand the reach and usability of this tool.

As a native Bengali user and developer, I believe this feature would benefit a vast community, including researchers, students, and professionals who require OCR solutions in Bengali.

I am looking forward to hearing your thoughts!

Best regards, Rasel Meya