thiagoalessio / tesseract-ocr-for-php

A wrapper to work with Tesseract OCR inside PHP.
https://packagist.org/packages/thiagoalessio/tesseract_ocr
MIT License
2.86k stars 551 forks

setOutputFile does not support Multi Files #187

Open paulb057 opened 4 years ago

paulb057 commented 4 years ago

When I run "tesseract 10002ucw.tif 10002ucw-new pdf txt", two files are saved to disk: one text file and one PDF.

When I use tesseract-ocr-for-php, it creates both files in the temp folder but renames only one of them, so the other is lost.

echo (new TesseractOCR("test/10002ucw.tif"))
->configFile('pdf txt')
->setOutputFile("test/10002ucw-new2")
->lang("eng")
->command;

"tesseract" "test/10002ucw.tif" "/tmp/ocr0BXIOX" -l eng pdf txt

I quickly put this hack into run() in TesseractOCR.php to fix it. I put txt last in my config list and call it as follows:

echo (new TesseractOCR("test/10002ucw.tif"))
->configFile('pdf txt')
->setOutputFile("test/10002ucw-new2")
->lang("eng")
->run();

    if ($this->command->useFileAsOutput) {
        // Temp file Tesseract wrote the text to (assumed to end in ".txt").
        $tempTxt = $this->command->getOutputFile();
        $text = file_get_contents($tempTxt);
        if ($this->outputFile !== null) {
            if (strpos($this->command->configFile, ' ') !== false) {
                // Several configs were passed (e.g. 'pdf txt'): keep both outputs.
                rename($tempTxt, $this->outputFile.".txt");
                rename(str_replace(".txt", ".pdf", $tempTxt), $this->outputFile.".pdf");
            } else {
                // Single output: original behaviour.
                rename($tempTxt, $this->outputFile);
            }
        }
        $this->cleanTempFiles();
    }
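The hack above hard-codes the .txt and .pdf extensions. A more general workaround could be sketched as follows, assuming Tesseract writes every output as "<base>.<ext>" next to the temporary base name. The function name moveTesseractOutputs and its parameters are hypothetical and not part of the library; the temp base is also assumed to contain no glob metacharacters such as * or [.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of tesseract-ocr-for-php): move every file
 * Tesseract wrote for one temporary base name to a target base name,
 * keeping each file's extension.
 *
 * @return string[] destination paths of the files that were moved
 */
function moveTesseractOutputs(string $tempBase, string $targetBase): array
{
    $moved = [];
    // Assumption: outputs are named "<base>.<ext>", e.g. ocr0BXIOX.txt and ocr0BXIOX.pdf.
    // glob() returns false on error, so fall back to an empty list.
    foreach (glob($tempBase . '.*') ?: [] as $source) {
        $extension = pathinfo($source, PATHINFO_EXTENSION);
        $destination = $targetBase . '.' . $extension;
        if (!rename($source, $destination)) {
            throw new RuntimeException("Could not move $source to $destination");
        }
        $moved[] = $destination;
    }
    return $moved;
}
```

With the command shown earlier, a call such as moveTesseractOutputs('/tmp/ocr0BXIOX', 'test/10002ucw-new2') would be expected to leave test/10002ucw-new2.txt and test/10002ucw-new2.pdf in place, regardless of the order of the configs.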


kaisirerp2all commented 3 years ago

Dear brother, can you please help me find out how to run Tesseract on cPanel for a Laravel project?