Open bit-man opened 7 months ago
Tried to follow the code in convertCore.php
and it seems the failing code is at if (!in_array(strtolower($oldExtension), $pdf1array)) . This condition evaluates to false, so no conversion is attempted, which makes no sense to me because it's supposed to be the Code to convert a PDF to a document, as stated by the comment on the previous line.
Stripped off the negation and a file is downloaded, but it is empty :cry: . Still not working. The log output follows:
Op-Act, May 1, 2024, 8:43 pm, 856ca1146d63/1029442e5485: Initiating Converter.
Op-Act, May 1, 2024, 8:43 pm, 856ca1146d63/1029442e5485: User selected to perform OCR on file m1m2.pdf.
Op-Act, May 1, 2024, 8:43 pm, 856ca1146d63/1029442e5485: Copying file m1m2.pdf to /var/www/html/HRProprietary/HRConvert2/DATA/856ca1146d63/1029442e5485/m1m2.pdf.
Op-Act, May 1, 2024, 8:43 pm, 856ca1146d63/1029442e5485: Copied file m1m2.pdf.
Op-Act, May 1, 2024, 8:43 pm, 856ca1146d63/1029442e5485: Verified file /DATA/HRConvert2/856ca1146d63/1029442e5485/m1m2.txt.
Op-Act, May 1, 2024, 8:43 pm, 856ca1146d63/1029442e5485: Performing OCR intermediate operation using method 0.
Op-Act, May 1, 2024, 8:43 pm, 856ca1146d63/1029442e5485: Converted file /var/www/html/HRProprietary/HRConvert2/DATA/856ca1146d63/1029442e5485/m1m2.jpg to /var/www/html/HRProprietary/HRConvert2/DATA/856ca1146d63/1029442e5485/m1m2.txt.
Op-Act, May 1, 2024, 8:43 pm, 856ca1146d63/1029442e5485: Performing OCR final using method 0.
Op-Act, May 1, 2024, 8:43 pm, 856ca1146d63/1029442e5485: Renamed file /var/www/html/HRProprietary/HRConvert2/DATA/856ca1146d63/1029442e5485/m1m2.pdf to /var/www/html/HRProprietary/HRConvert2/DATA/856ca1146d63/1029442e5485/m1m2.txt.
Op-Act, May 1, 2024, 8:43 pm, 856ca1146d63/1029442e5485: Created a file at /DATA/HRConvert2/856ca1146d63/1029442e5485/m1m2.txt.
No time today to do a follow-up. Will try over the weekend or later. Happy if anyone else can continue from here. Added this change to https://github.com/bit-man/HRConvert2 in case anyone wants to try a fix.
Sorry for the delayed response. Can you try the following.....
sudo leafpad /etc/ImageMagick-6/policy.xml
Find and edit the following line.....
<policy domain="coder" rights="none" pattern="PDF" />
.....To.....
<policy domain="coder" rights="read|write" pattern="PDF" />
And let me know the result.
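For anyone who prefers to script the edit above instead of opening an editor (handy inside a Docker container with no GUI), here is a minimal sketch. The function name enable_pdf_policy is hypothetical and not part of HRConvert2, and it assumes the PDF line in policy.xml looks like the one quoted above. It leaves a .bak copy next to the file it edits.

```shell
# Hypothetical helper (not part of HRConvert2): allow ImageMagick to read and write PDFs.
# $1 is the policy file to edit; a .bak copy is left next to it.
enable_pdf_policy() {
    policy_file="$1"
    # Only touch the line whose pattern is PDF; other coders keep their rights.
    sed -i.bak '/pattern="PDF"/ s/rights="none"/rights="read|write"/' "$policy_file"
    # Print the resulting PDF policy line so the change can be checked by eye.
    grep 'pattern="PDF"' "$policy_file"
}
```

Running it as root with /etc/ImageMagick-6/policy.xml as the argument should apply the same change as the manual edit. If the printed line still shows rights="none", the line in your policy file probably differs from the one quoted here and needs editing by hand.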
I am not satisfied myself with OCR performance of PDF files lately. I've known for some time that the functions for OCR need to be refactored. This is mentioned in CHANGELOG.txt several times, I'm sure of it.
Look for a refactor of the OCR related functions hopefully before v3.4 comes out. This is some of the oldest code left in the codebase today. Most of it pre-dates the v2.7 Valkyre -> Diablo re-write.
Uploading a PDF file and trying to OCR it (method: simple, format: txt) by pressing the Convert into Document button opens a new tab with the error Not Found, and no file is downloaded.
At the Docker console the error shown is
Doing a tail of the txt log in the Logs folder shows