Closed: pud-micha closed this 3 months ago
There is no text in the sample document. Looks like the page was saved as an image; OCR would be required to read this.
I had not realized that PDF printers differ in whether they write the HTML text into the file as text. Thank you :) I now have a solution: use "Print as PDF" instead of the MS PDF printer.
Description:
Any website printed to PDF with the MS PDF Printer / "save as pdf" produces a PDF 1.7 file like the attached test.pdf.
For such a file, $parser->parseFile('website.pdf')->getText() returns an empty string.
PDF input
Print any website; the content does not matter. Windows 10, Chrome 123.0.6312.86.
Expected output & actual output
Expected: some text. Actual: an empty string.
Code
$parser->parseFile('website.pdf')->getText()
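For anyone hitting the same empty result, here is a minimal sketch of how a caller might detect a PDF without a text layer before deciding to fall back to OCR. It assumes the parser in this report is smalot/pdfparser loaded through Composer (the report itself does not name the library), and the helper names looksImageOnly and extractTextOrNull are hypothetical, not part of any library.

```php
<?php
// Assumption: the parser used in this report is smalot/pdfparser and the
// caller has already loaded Composer's vendor/autoload.php.

/**
 * Hypothetical helper: true when extraction produced nothing but whitespace,
 * which is what an image-only PDF typically yields.
 */
function looksImageOnly(string $text): bool
{
    return trim($text) === '';
}

/**
 * Hypothetical helper: returns the extracted text, or null when the PDF
 * appears to have no text layer and would need OCR instead.
 */
function extractTextOrNull(string $path): ?string
{
    $parser = new \Smalot\PdfParser\Parser();
    $text = $parser->parseFile($path)->getText();

    return looksImageOnly($text) ? null : $text;
}
```

Calling extractTextOrNull('website.pdf') on a file like the one attached here would be expected to return null, signalling that the page content was stored as an image rather than as text.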