ocrmypdf / OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
http://ocrmypdf.readthedocs.io/
Mozilla Public License 2.0

ocrmypdf works via terminal, but not via php(apache) #960

Closed xinandri8 closed 2 years ago

xinandri8 commented 2 years ago

Describe the bug

I can run the ocrmypdf command from the terminal, but it fails when I run it from PHP (Apache). I have already given that user root permissions and it still does not work. I have tried both shell_exec and exec. This is the output I get after running: $var = shell_exec('ocrmypdf -l eng --pdf-renderer hocr name.pdf name.pdf --force-ocr 2>&1');

string(4230) "Traceback (most recent call last):
  File "/usr/local/bin/ocrmypdf", line 6, in <module>
    from ocrmypdf.__main__ import run
  File "/usr/local/lib/python3.7/site-packages/ocrmypdf/__init__.py", line 14, in <module>
    from ocrmypdf.api import Verbosity, configure_logging, ocr
  File "/usr/local/lib/python3.7/site-packages/ocrmypdf/api.py", line 20, in <module>
    from ocrmypdf._sync import run_pipeline
  File "/usr/local/lib/python3.7/site-packages/ocrmypdf/_sync.py", line 50, in <module>
    from ocrmypdf._validation import (
  File "/usr/local/lib/python3.7/site-packages/ocrmypdf/_validation.py", line 43, in <module>
    verify_python3_env()
  File "/usr/local/lib/python3.7/site-packages/ocrmypdf/_unicodefun.py", line 117, in verify_python3_env
    'environment.' + extra
RuntimeError: ocrmypdf will abort further execution because Python 3 was configured to use ASCII as encoding for the environment. This system lists a couple of UTF-8 supporting locales that you can pick from. The following suitable locales were discovered: aa_DJ.utf8, aa_ER.utf8, aa_ET.utf8, af_ZA.utf8, am_ET.utf8, an_ES.utf8, ar_AE.utf8, ar_BH.utf8, ar_DZ.utf8, ar_EG.utf8, ar_IN.utf8, ar_IQ.utf8, ar_JO.utf8, ar_KW.utf8, ar_LB.utf8, ar_LY.utf8, ar_MA.utf8, ar_OM.utf8, ar_QA.utf8, ar_SA.utf8, ar_SD.utf8, ar_SY.utf8, ar_TN.utf8, ar_YE.utf8, as_IN.utf8, ast_ES.utf8, ayc_PE.utf8, az_AZ.utf8, be_BY.utf8, bem_ZM.utf8, ber_DZ.utf8, ber_MA.utf8, bg_BG.utf8, bho_IN.utf8, bn_BD.utf8, bn_IN.utf8, bo_CN.utf8, bo_IN.utf8, br_FR.utf8, brx_IN.utf8, bs_BA.utf8, byn_ER.utf8, ca_AD.utf8, ca_ES.utf8, ca_FR.utf8, ca_IT.utf8, crh_UA.utf8, cs_CZ.utf8, csb_PL.utf8, cv_RU.utf8, cy_GB.utf8, da_DK.utf8, de_AT.utf8, de_BE.utf8, de_CH.utf8, de_DE.utf8, de_LU.utf8, doi_IN.utf8, dv_MV.utf8, dz_BT.utf8, el_CY.utf8, el_GR.utf8, en_AG.utf8, en_AU.utf8, en_BW.utf8, en_CA.utf8, en_DK.utf8, en_GB.utf8, en_HK.utf8, en_IE.utf8, en_IN.utf8, en_NG.utf8, en_NZ.utf8, en_PH.utf8, en_SG.utf8, en_US.utf8, en_ZA.utf8, en_ZM.utf8, en_ZW.utf8, es_AR.utf8, es_BO.utf8, es_CL.utf8, es_CO.utf8, es_CR.utf8,
es_CU.utf8, es_DO.utf8, es_EC.utf8, es_ES.utf8, es_GT.utf8, es_HN.utf8, es_MX.utf8, es_NI.utf8, es_PA.utf8, es_PE.utf8, es_PR.utf8, es_PY.utf8, es_SV.utf8, es_US.utf8, es_UY.utf8, es_VE.utf8, et_EE.utf8, eu_ES.utf8, fa_IR.utf8, ff_SN.utf8, fi_FI.utf8, fil_PH.utf8, fo_FO.utf8, fr_BE.utf8, fr_CA.utf8, fr_CH.utf8, fr_FR.utf8, fr_LU.utf8, fur_IT.utf8, fy_DE.utf8, fy_NL.utf8, ga_IE.utf8, gd_GB.utf8, gez_ER.utf8, gez_ET.utf8, gl_ES.utf8, gu_IN.utf8, gv_GB.utf8, ha_NG.utf8, he_IL.utf8, hi_IN.utf8, hne_IN.utf8, hr_HR.utf8, hsb_DE.utf8, ht_HT.utf8, hu_HU.utf8, hy_AM.utf8, ia_FR.utf8, id_ID.utf8, ig_NG.utf8, ik_CA.utf8, is_IS.utf8, it_CH.utf8, it_IT.utf8, iu_CA.utf8, iw_IL.utf8, ja_JP.utf8, ka_GE.utf8, kk_KZ.utf8, kl_GL.utf8, km_KH.utf8, kn_IN.utf8, ko_KR.utf8, kok_IN.utf8, ks_IN.utf8, ku_TR.utf8, kw_GB.utf8, ky_KG.utf8, lb_LU.utf8, lg_UG.utf8, li_BE.utf8, li_NL.utf8, lij_IT.utf8, lo_LA.utf8, lt_LT.utf8, lv_LV.utf8, mag_IN.utf8, mai_IN.utf8, mg_MG.utf8, mhr_RU.utf8, mi_NZ.utf8, mk_MK.utf8, ml_IN.utf8, mn_MN.utf8, mni_IN.utf8, mr_IN.utf8, ms_MY.utf8, mt_MT.utf8, my_MM.utf8, nb_NO.utf8, nds_DE.utf8, nds_NL.utf8, ne_NP.utf8, nhn_MX.utf8, niu_NU.utf8, niu_NZ.utf8, nl_AW.utf8, nl_BE.utf8, nl_NL.utf8, nn_NO.utf8, nr_ZA.utf8, nso_ZA.utf8, oc_FR.utf8, om_ET.utf8, om_KE.utf8, or_IN.utf8, os_RU.utf8, pa_IN.utf8, pa_PK.utf8, pap_AN.utf8, pl_PL.utf8, ps_AF.utf8, pt_BR.utf8, pt_PT.utf8, ro_RO.utf8, ru_RU.utf8, ru_UA.utf8, rw_RW.utf8, sa_IN.utf8, sat_IN.utf8, sc_IT.utf8, sd_IN.utf8, se_NO.utf8, shs_CA.utf8, si_LK.utf8, sid_ET.utf8, sk_SK.utf8, sl_SI.utf8, so_DJ.utf8, so_ET.utf8, so_KE.utf8, so_SO.utf8, sq_AL.utf8, sq_MK.utf8, sr_ME.utf8, sr_RS.utf8, ss_ZA.utf8, st_ZA.utf8, sv_FI.utf8, sv_SE.utf8, sw_KE.utf8, sw_TZ.utf8, szl_PL.utf8, ta_IN.utf8, ta_LK.utf8, te_IN.utf8, tg_TJ.utf8, th_TH.utf8, ti_ER.utf8, ti_ET.utf8, tig_ER.utf8, tk_TM.utf8, tl_PH.utf8, tn_ZA.utf8, tr_CY.utf8, tr_TR.utf8, ts_ZA.utf8, tt_RU.utf8, ug_CN.utf8, uk_UA.utf8, unm_US.utf8, ur_IN.utf8, ur_PK.utf8, ve_ZA.utf8, 
vi_VN.utf8, wa_BE.utf8, wae_CH.utf8, wal_ET.utf8, wo_SN.utf8, xh_ZA.utf8, yi_US.utf8, yo_NG.utf8, yue_HK.utf8, zh_CN.utf8, zh_HK.utf8, zh_SG.utf8, zh_TW.utf8, zu_ZA.utf8 "

System (please complete the following information):

Installation: pip

jbarlow83 commented 2 years ago

This is an old version of ocrmypdf on a very old version of Linux, and it appears that the system locale is misconfigured. Please use a Docker container or a newer operating system than CentOS 7.
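
The RuntimeError in the traceback is about the environment's encoding, not about file permissions: the process started by Apache apparently does not inherit the UTF-8 locale that the interactive terminal has. As a stopgap on the existing system, one could try passing a UTF-8 locale explicitly to the command. The following is a minimal sketch, not a confirmed fix from this thread. It assumes en_US.utf8 (one of the locales listed in the error output) is acceptable, and with_utf8_locale is a hypothetical helper name introduced here for illustration.

```shell
# Hypothetical helper: run any command with a UTF-8 locale in its environment.
# en_US.utf8 is an assumption, picked from the locale list in the error output.
with_utf8_locale() {
    env LC_ALL=en_US.utf8 LANG=en_US.utf8 "$@"
}

# Usage (commented out so the snippet can be sourced without OCRmyPDF installed):
# with_utf8_locale ocrmypdf -l eng --pdf-renderer hocr --force-ocr name.pdf name.pdf 2>&1
```

From PHP, the same idea would amount to prefixing the string passed to shell_exec with the two assignments, for example 'LC_ALL=en_US.utf8 LANG=en_US.utf8 ocrmypdf ...'. Whether this is enough on CentOS 7 with this ocrmypdf version is untested here; the maintainer's recommendation of Docker or a newer operating system remains the supported route.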