smalot / pdfparser

PdfParser, a standalone PHP library, provides various tools to extract data from a PDF file.
GNU Lesser General Public License v3.0
2.42k stars 538 forks

Fatal error: escapeshellarg(): Input string contains NULL bytes #187

Open huzjakd opened 6 years ago

huzjakd commented 6 years ago

This is maybe more of a PHP question than a pdfparser one, but here it goes anyway. The important part of my code:

$parser = new \Smalot\PdfParser\Parser();
$pdf = $parser->parseFile($file_path);
$pages  = $pdf->getPages();
foreach ($pages as $page) {
  $text = $page->getText();
  $stringWithoutNewLine = str_replace("\n"," ",escapeshellarg($text));
  //some more code
}

This produces the error in the title. Sure enough, I think some of the pages in my .pdf contain only images and no text. I would like to get an empty string if there is no text on the page.

huzjakd commented 6 years ago

This can be closed. I found a solution: I removed all NULL characters from the string and serialized the string to JSON.
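
The poster did not share the final code, so the following is only a minimal sketch of what such a fix might look like. The helper name `cleanPageText` and the sample string are assumptions made for illustration and are not part of pdfparser; in the original loop the input would come from `$page->getText()`. The sketch strips NUL bytes with `str_replace`, flattens newlines as in the original snippet, and uses `json_encode` in place of `escapeshellarg`.

```php
<?php
// Hypothetical helper (not part of pdfparser): remove NUL bytes,
// then replace newlines with spaces as the original snippet did.
function cleanPageText(string $text): string
{
    $text = str_replace("\0", '', $text);   // "\0" is the NUL byte
    return str_replace("\n", ' ', $text);
}

// Sample string standing in for the result of $page->getText().
$raw = "Hello\0 world\nsecond line";
$clean = cleanPageText($raw);

// JSON_INVALID_UTF8_SUBSTITUTE (PHP 7.2+) replaces invalid UTF-8
// sequences instead of making json_encode() return false.
$json = json_encode($clean, JSON_INVALID_UTF8_SUBSTITUTE);
if ($json === false) {
    throw new RuntimeException('json_encode failed: ' . json_last_error_msg());
}

echo $json, PHP_EOL;
```

A page with no extractable text passes through unchanged as an empty string, which `json_encode` serializes as `""`. Note that JSON encoding is not shell escaping: if the text is later passed to a shell command, it would still need `escapeshellarg` applied after the NUL bytes are removed.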