Closed: goldyaphets closed this issue 5 months ago
It says hocr has an issue with the language you used. Have you tried not using it? It says it's for the Roman alphabet only.
I've chosen the sandwich option, but it still doesn't work.
Error Internal Server Error: Command process failed with exit code 8. Error message:
DEBUG ocrmypdf - ocrmypdf 15.4.2
DEBUG ocrmypdf.subprocess - Running: ['tesseract', '--version']
DEBUG ocrmypdf.subprocess - Found tesseract 5.3.4
DEBUG ocrmypdf.subprocess - Running: ['tesseract', '--version']
DEBUG ocrmypdf.subprocess - Running: ['gs', '--version']
DEBUG ocrmypdf.subprocess - Found gs 10.2.1
DEBUG ocrmypdf.subprocess - Running: ['gs', '--version']
DEBUG ocrmypdf.subprocess - Running: ['tesseract', '--list-langs']
DEBUG ocrmypdf.subprocess.tesseract - stdout/stderr = [DS] Profile read from file (tesseract_opencl_profile_devices.dat).
[DS] Device[1] 0:(null) score is 0.197839
[DS] Selected Device[1]: "(null)" (Native)
List of available languages in "/usr/share/tessdata/" (166):
Arabic Armenian Bengali Canadian_Aboriginal Cherokee Cyrillic Devanagari Ethiopic Fraktur Georgian Greek Gujarati Gurmukhi HanS HanS_vert HanT HanT_vert Hangul Hangul_vert Hebrew Japanese Japanese_vert Kannada Khmer Lao Latin Malayalam Myanmar Oriya Sinhala Syriac Tamil Telugu Thaana Thai Tibetan Vietnamese afr amh ara asm aze aze_cyrl bel ben bod bos bre bul cat ceb ces chi_sim chi_sim_vert chi_tra chi_tra_vert chr cos cym dan dan_frak deu deu_frak div dzo ell eng enm epo equ est eus fao fas fil fin fra frk frm fry gla gle glg grc guj hat heb hin hrv hun hye iku ind isl ita ita_old jav jpn jpn_vert kan kat kat_old kaz khm kir kmr kor kor_vert lao lat lav lit ltz mal mar mkd mlt mon mri msa mya nep nld nor oci ori osd pan pol por pus que ron rus san sin slk slk_frak slv snd spa spa_old sqi srp srp_latn sun swa swe syr tam tat tel tgk tgl tha tir ton tur uig ukr urd uzb uzb_cyrl vie yid yor
DEBUG ocrmypdf.helpers - pikepdf mmap enabled
DEBUG ocrmypdf.helpers - os.symlink(/tmp/input_7407754083048820571.pdf, /tmp/ocrmypdf.io.i5ffpvhz/origin)
DEBUG ocrmypdf.helpers - os.symlink(/tmp/ocrmypdf.io.i5ffpvhz/origin, /tmp/ocrmypdf.io.i5ffpvhz/origin.pdf)
ERROR ocrmypdf._pipelines._common - ExitCodeException
Traceback (most recent call last):
  File "/usr/lib/python3.11/site-packages/ocrmypdf/_pipelines/_common.py", line 249, in cli_exception_handler
    return fn(options, plugin_manager)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/ocrmypdf/_pipelines/ocr.py", line 176, in _run_pipeline
    pdfinfo = get_pdfinfo(
              ^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/ocrmypdf/_pipeline.py", line 173, in get_pdfinfo
    return PdfInfo(
           ^^^^^^^^
  File "/usr/lib/python3.11/site-packages/ocrmypdf/pdfinfo/info.py", line 1102, in __init__
    raise EncryptedPdfError() # Triggered by encryption with empty passwd
    ^^^^^^^^^^^^^^^^^^^^^^^^^
ocrmypdf.exceptions.EncryptedPdfError: Input PDF is encrypted. The encryption must be removed to perform OCR.
For information about this PDF's security use
    qpdf --show-encryption infilename
You can remove the encryption using
    qpdf --decrypt [--password=[password]] infilename
The encryption must be removed to perform OCR.
The PDF is encrypted and/or password-protected.
If it has no password, use the remove password function with the password field left empty to decrypt it.
If it is password-protected, you must remove the password itself.
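For anyone doing this outside the web UI, the qpdf command quoted in the error message can be wrapped in a few lines of Python. This is only a rough sketch: `build_decrypt_command` and `decrypt_pdf` are made-up helper names, the file names are placeholders, and it assumes qpdf is installed and on PATH. Note that qpdf needs an output file name in addition to the input file.

```python
import shutil
import subprocess


def build_decrypt_command(infile, outfile, password=""):
    """Return the qpdf argument list that writes a decrypted copy of infile.

    Hypothetical helper: an empty password covers the "encrypted with an
    empty password" case reported by ocrmypdf's EncryptedPdfError.
    """
    cmd = ["qpdf", "--decrypt"]
    if password:
        # Only pass --password when the PDF really has one.
        cmd.append(f"--password={password}")
    cmd += [infile, outfile]
    return cmd


def decrypt_pdf(infile, outfile, password=""):
    """Run qpdf to remove encryption; raises if qpdf is missing or fails."""
    if shutil.which("qpdf") is None:
        raise RuntimeError("qpdf is not installed or not on PATH")
    subprocess.run(build_decrypt_command(infile, outfile, password), check=True)


if __name__ == "__main__":
    # Placeholder file names; prints the command without running it.
    print(build_decrypt_command("input.pdf", "decrypted.pdf"))
```

After that, run OCR on the decrypted copy instead of the original file.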
The problem has been solved. It works flawlessly now. Thank you so much!
java.io.IOException: Command process failed with exit code 8. Error message:
DEBUG ocrmypdf - ocrmypdf 15.4.2
WARNING ocrmypdf._validation - The 'hocr' PDF renderer is known to cause problems with one or more of the languages in your document. Use --pdf-renderer auto (the default) to avoid this issue.
DEBUG ocrmypdf.subprocess - Running: ['tesseract', '--version']
DEBUG ocrmypdf.subprocess - Found tesseract 5.3.4
DEBUG ocrmypdf.subprocess - Running: ['tesseract', '--version']
DEBUG ocrmypdf.subprocess - Running: ['gs', '--version']
DEBUG ocrmypdf.subprocess - Found gs 10.2.1
DEBUG ocrmypdf.subprocess - Running: ['gs', '--version']
DEBUG ocrmypdf.subprocess - Running: ['tesseract', '--list-langs']
DEBUG ocrmypdf.subprocess.tesseract - stdout/stderr = [DS] Profile read from file (tesseract_opencl_profile_devices.dat).
[DS] Device[1] 0:(null) score is 0.197839
List of available languages in "/usr/share/tessdata/" (166):
Arabic Armenian Bengali Canadian_Aboriginal Cherokee Cyrillic Devanagari Ethiopic Fraktur Georgian Greek Gujarati Gurmukhi HanS HanS_vert HanT HanT_vert Hangul Hangul_vert Hebrew Japanese Japanese_vert Kannada Khmer Lao Latin Malayalam Myanmar Oriya Sinhala Syriac Tamil Telugu Thaana Thai Tibetan Vietnamese afr amh ara asm aze aze_cyrl bel ben bod bos bre bul cat ceb ces chi_sim chi_sim_vert chi_tra chi_tra_vert chr cos cym dan dan_frak deu deu_frak div dzo ell eng enm epo equ est eus fao fas fil fin fra frk frm fry gla gle glg grc guj hat heb hin hrv hun hye iku ind isl ita ita_old jav jpn jpn_vert kan kat kat_old kaz khm kir kmr kor kor_vert lao lat lav lit ltz mal mar mkd mlt mon mri msa mya nep nld nor oci ori osd pan pol por pus que ron rus san sin slk slk_frak slv snd spa spa_old sqi srp srp_latn sun swa swe syr tam tat tel tgk tgl tha tir ton tur uig ukr urd uzb uzb_cyrl vie yid yor
DEBUG ocrmypdf.helpers - pikepdf mmap enabled
DEBUG ocrmypdf.helpers - os.symlink(/tmp/input_12232272785051626322.pdf, /tmp/ocrmypdf.io.j6bc0lib/origin)
DEBUG ocrmypdf.helpers - os.symlink(/tmp/ocrmypdf.io.j6bc0lib/origin, /tmp/ocrmypdf.io.j6bc0lib/origin.pdf)
ERROR ocrmypdf._pipelines._common - ExitCodeException
Traceback (most recent call last):
  File "/usr/lib/python3.11/site-packages/ocrmypdf/_pipelines/_common.py", line 249, in cli_exception_handler
    return fn(options, plugin_manager)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/ocrmypdf/_pipelines/ocr.py", line 176, in _run_pipeline
    pdfinfo = get_pdfinfo(
              ^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/ocrmypdf/_pipeline.py", line 173, in get_pdfinfo
    return PdfInfo(
           ^^^^^^^^
  File "/usr/lib/python3.11/site-packages/ocrmypdf/pdfinfo/info.py", line 1102, in __init__
    raise EncryptedPdfError() # Triggered by encryption with empty passwd
    ^^^^^^^^^^^^^^^^^^^^^^^^^
ocrmypdf.exceptions.EncryptedPdfError: Input PDF is encrypted. The encryption must be removed to perform OCR.
For information about this PDF's security use
    qpdf --show-encryption infilename
You can remove the encryption using
    qpdf --decrypt [--password=[password]] infilename