Open dev-code-davis opened 7 years ago
Update: I managed to pass the location, which in this case is "/usr/bin/convert". However, a change is needed in PDFToImage.php in the steve-ferrero/php-pdf-to-image repo: the hardcoded magick string is used instead of the magickPath property. My quick fix (lines 35-36): https://gist.github.com/Gugols/a9baa7e97c225c03725f2dea8e27b54f
However, I'm now not sure of the best way to pass the ImageMagick path in from the TesseractPHP object. Ideas?
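One way to approach this is to let the outer object accept the path and forward it to the converter, which then builds its command from the property rather than a literal. The sketch below is only an illustration: `PdfConverterSketch`, `OcrWrapperSketch`, `buildCommand` and `convertCommand` are hypothetical names, not the real classes or methods of either package, and it assumes PHP 7.4 or newer for typed properties.

```php
<?php
// Hypothetical stand-in for PDFToImage: NOT the library's actual class.
final class PdfConverterSketch
{
    private string $magickPath;

    public function __construct(string $magickPath = 'magick')
    {
        $this->magickPath = $magickPath;
    }

    public function setMagickPath(string $magickPath): void
    {
        $this->magickPath = $magickPath;
    }

    // Builds the shell command from the property instead of a hardcoded
    // "magick" string. It returns the command rather than running it, so
    // the result can be inspected before it is handed to exec().
    public function buildCommand(string $pdfFile, string $outputFile): string
    {
        return escapeshellarg($this->magickPath)
            . ' ' . escapeshellarg($pdfFile)
            . ' ' . escapeshellarg($outputFile)
            . ' 2>&1';
    }
}

// Hypothetical stand-in for the TesseractPHP object: it takes the
// ImageMagick path as a constructor argument and forwards it.
final class OcrWrapperSketch
{
    private PdfConverterSketch $converter;

    public function __construct(string $magickPath = 'magick')
    {
        $this->converter = new PdfConverterSketch($magickPath);
    }

    public function convertCommand(string $pdfFile, string $outputFile): string
    {
        return $this->converter->buildCommand($pdfFile, $outputFile);
    }
}
```

With something like this, `new OcrWrapperSketch('/usr/bin/convert')` would be the only place the caller has to know where the binary lives.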
Hmm, does this really work for multi-page PDFs? When I pass one in, it only tries to unlink a single file:
string(60) "/usr/bin/convert test.pdf temp/59f0ccb5d2ea3.png 2>&1"
array(0) { }
Warning: unlink(temp/59f0ccb5d2ea3.png): No such file or directory in /var/www/drupalvm/drupal/ocr_test1/vendor/web-atrio/tesseract-php/TesseractPHP.php on line 113
array(2) { [0]=> string(53) "ERROR: Can not open input file temp/59f0ccb5d2ea3.png" [1]=> string(24) "Error during processing." }
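A likely explanation for the missing file: when ImageMagick writes a multi-page PDF to a single-image format such as PNG, it produces one file per page with an index appended (for example `name-0.png`, `name-1.png`) instead of `name.png`. The sketch below collects whichever files were actually written. `collectPageFiles` is a hypothetical helper, not part of either library, and it assumes that default naming scheme.

```php
<?php
// Hypothetical helper: returns the image files produced for $outputFile,
// whether ImageMagick wrote a single file or one numbered file per page.
function collectPageFiles(string $outputFile): array
{
    // Single-page input: the file exists under the requested name.
    if (is_file($outputFile)) {
        return [$outputFile];
    }

    $info = pathinfo($outputFile);
    $dir = $info['dirname'];
    if (!is_dir($dir)) {
        return [];
    }

    // Multi-page input: look for "<name>-<index>.<ext>" in the same directory.
    $pattern = '/^' . preg_quote($info['filename'], '/')
        . '-(\d+)\.' . preg_quote($info['extension'] ?? '', '/') . '$/';

    $pages = [];
    foreach (scandir($dir) ?: [] as $entry) {
        if (preg_match($pattern, $entry, $m)) {
            $pages[(int) $m[1]] = $dir . DIRECTORY_SEPARATOR . $entry;
        }
    }

    // Sort by page index so that page 10 comes after page 2.
    ksort($pages);
    return array_values($pages);
}
```

The OCR and cleanup steps could then loop over the returned list rather than assuming a single file, which would avoid calling unlink() on a path that was never created.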
Also, by default ImageMagick converts the PDF into a relatively small PNG, so Tesseract is unable to read it. I tried changing the call to: exec($this->magickPath . " -density 300 " . $this->pdfFile . " " . $outputFile . " 2>&1", $output); The conversion is rather slow at this density; a lower value may work as well, but I haven't tested that yet. The density should probably be defined as a property.
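A minimal sketch of making the density configurable, so that lower values can be tried without editing the command string each time. `buildConvertCommand` is a hypothetical function, not something the library provides; the default of 300 simply mirrors the value tried above and is not a tested recommendation. The `-density` setting is placed before the input file because it affects how the PDF is rasterized when it is read.

```php
<?php
// Hypothetical helper: builds the convert command with a configurable density.
function buildConvertCommand(
    string $magickPath,
    string $pdfFile,
    string $outputFile,
    int $density = 300
): string {
    if ($density <= 0) {
        throw new InvalidArgumentException('Density must be a positive integer.');
    }

    // -density comes before the input file so it applies while reading the PDF.
    return sprintf(
        '%s -density %d %s %s 2>&1',
        escapeshellarg($magickPath),
        $density,
        escapeshellarg($pdfFile),
        escapeshellarg($outputFile)
    );
}
```

For example, `exec(buildConvertCommand('/usr/bin/convert', 'test.pdf', 'temp/out.png', 150), $output);` would let you check whether a lower density is still readable by Tesseract while converting faster.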
Hi,
First of all, thanks for creating this tool. However, I'm having a problem where the magick binary is not found. I have installed ImageMagick on Ubuntu 16.04, but when I try to run the package, I get the error:
It seems that the magick location is set in /web-atrio/php-pdf-to-image/PDFToImage.php: function setMagickPath($magickPath) { $this->magickPath = $magickPath; }
So the question is: which of the ImageMagick executables (there are many) should be called, since 'magick' doesn't seem to work?
My complete code:
https://gist.github.com/Gugols/82183d51ee68a0fd36c46d7d1ef369ae
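On the question of which executable to call: the single `magick` command was introduced with ImageMagick 7, while ImageMagick 6 installs separate tools such as `convert`, and distribution packages from that period typically ship version 6. The sketch below searches the PATH for the first candidate that exists. `findMagickBinary` is a hypothetical helper, and it assumes a Unix-like system (on Windows, `convert` can refer to an unrelated system utility).

```php
<?php
// Hypothetical helper: returns the full path of the first candidate binary
// found in the PATH directories, or null if none is present.
function findMagickBinary(
    array $candidates = ['magick', 'convert'],
    ?string $pathEnv = null
): ?string {
    // Fall back to the process PATH when no explicit search path is given.
    $pathEnv = $pathEnv ?? (getenv('PATH') ?: '');
    $dirs = array_filter(
        explode(PATH_SEPARATOR, $pathEnv),
        fn (string $dir): bool => $dir !== ''
    );

    // Candidates are tried in order, so 'magick' wins when both exist.
    foreach ($candidates as $name) {
        foreach ($dirs as $dir) {
            $full = rtrim($dir, '/') . '/' . $name;
            if (is_file($full) && is_executable($full)) {
                return $full;
            }
        }
    }

    return null;
}
```

The result could be passed straight to `setMagickPath()`, with a null return treated as "ImageMagick is not installed or not on the PATH".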