Open: ishowshao opened 11 months ago
try {
    $parser = new \Smalot\PdfParser\Parser();
    $pdf = $parser->parseFile($path);
    $text = $pdf->getText();
    // Strip all whitespace from the extracted text.
    return preg_replace('/\s+/', '', $text);
} catch (Exception $e) {
    // Log the failure with the file path and fall back to an empty string.
    $logger = self::getLogger('pdf2text');
    $logger->warning($e->getMessage(), ['path' => $path]);
    return '';
}
Please try again with 2.8.0-RC2 and get back to us.
I ran into the same problem when reading Chinese text. With PDF version 1.4 files, the Chinese characters are read correctly.
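Since the reports here contrast PDF 1.4 with PDF 1.6, it may help to confirm which version a file declares before comparing results. The following is a minimal sketch, assuming a local file path; `pdfHeaderVersion` is a hypothetical helper written for illustration and is not part of smalot/pdfparser. It only reads the `%PDF-x.y` header, so it reports the declared header version and nothing else about the file.

```php
<?php
// Hypothetical helper (not part of smalot/pdfparser): returns the version
// declared in a PDF file's header, e.g. "%PDF-1.6" yields "1.6".
// Returns null if the file cannot be read or no header is found.
function pdfHeaderVersion(string $path): ?string
{
    // Assumption for this sketch: the header sits within the first 1024 bytes.
    $head = @file_get_contents($path, false, null, 0, 1024);
    if ($head === false) {
        return null;
    }

    // Capture the "major.minor" part of the "%PDF-x.y" marker.
    if (preg_match('/%PDF-(\d+\.\d+)/', $head, $m) === 1) {
        return $m[1];
    }

    return null;
}
```

Calling `pdfHeaderVersion($path)` on the file that fails and on one that works would show whether the two really differ in declared version.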
PDF version: 1.6 (Acrobat 7.x)
Description:
PDF input
https://cdn.yinyuezhushou.com/static/7d38770d31c3cd66219eaa1b7959e2dd.pdf
Expected output & actual output
Expected output: the text in the file
Code