Open myjob opened 3 days ago
Description of the bug
As reported in issue #708, detection of umlauts / vowel mutations (äöüÄÜÖ) in German OCR isn't working well. Furthermore, French accents (éèÀ) are not identified correctly. See the attachments miner-u-lang-euro-ocr-test_origin.pdf and miner-u-lang-euro-ocr-test.md.
How to reproduce the bug
magic-pdf -p miner-u-lang-euro-test.pdf -o ./out -m ocr -l german

or

magic-pdf -p miner-u-lang-euro-test.pdf -o ./out -m ocr -l french
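
To make "isn't working well" measurable, here is a minimal sketch (not part of magic-pdf) that compares a hand-typed reference text against the OCR result and reports which accented characters went missing. The function names `diacritic_counts` and `missing_diacritics` are hypothetical helpers of my own, and the sample strings are made up for illustration; in practice you would paste in the text from the test PDF and the text from the generated Markdown.

```python
import unicodedata
from collections import Counter


def diacritic_counts(text: str) -> Counter:
    """Count characters that carry a combining mark (e.g. ä, é, À)."""
    # Normalize to NFC so that "a" + combining diaeresis becomes a single "ä".
    text = unicodedata.normalize("NFC", text)
    counts = Counter()
    for ch in text:
        # Decompose the character; any combining mark means it is accented.
        decomposed = unicodedata.normalize("NFD", ch)
        if any(unicodedata.combining(c) for c in decomposed):
            counts[ch] += 1
    return counts


def missing_diacritics(expected: str, ocr_output: str) -> dict:
    """Return {char: shortfall} for accented chars under-represented in the OCR text."""
    want = diacritic_counts(expected)
    got = diacritic_counts(ocr_output)
    return {ch: n - got[ch] for ch, n in want.items() if got[ch] < n}


if __name__ == "__main__":
    # Illustrative strings only, not taken from the attached test files.
    reference = "Äpfel über Öl, café à la crème"
    ocr_text = "Apfel uber Öl, cafe à la crème"
    print(missing_diacritics(reference, ocr_text))
```

This only counts characters, so it will not notice an accent that moved to a different word, but it is enough to show which letters are dropped per language setting.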
Operating system
Linux
Python version
3.10
Software version (magic-pdf --version)
0.9.x
Device mode
cpu