Open deepakkumar365 opened 9 years ago
You can use TCPDF for that. The PdfParser is using TCPDF but only taking text nodes.
Thanks aik099. Yes, that's right, the PDF parser only returns text. Could you give an example of how to use TCPDF to read images?
I have no idea. The TCPDF documentation is probably the best place to look for this.
a bit late to the party but for future reference:
$parser = new PdfParser\Parser();
$pdf = $parser->parseFile('/your/pdf/file');
$pdf->getObjectsByType('XObject', 'Image');
foreach($images as $image) {
/** @var \Smalot\PdfParser\Object $image */
$content = $image->getContent();
}
As I understand it, we should at least save the result of the getObjectsByType method, because it returns an array of objects or something like that. That may be why the $images variable was undefined in the foreach loop.
If we write $images = $pdf->getObjectsByType('XObject', 'Image'); will that be correct? I thought that on each iteration of the foreach loop $image would then hold one image from the PDF file and we could output it somehow, but I don't know exactly how, because $content contains a lot of symbols that can't be turned into an image with imagecreatefromstring() or a similar function.
@chikaldirick @smalot @andreiciobotar have you found any way to decode the $content symbols so we can display the image? I've tried many decoders, filters, and functions, but they all failed. Thanks
This is how I did it :
$parser = new \Smalot\PdfParser\Parser();
$pdf = $parser->parseFile('/your/pdf/file');
$images = $pdf->getObjectsByType('XObject', 'Image');
foreach( $images as $image ) {
echo '<img src="data:image/jpeg;base64,'. base64_encode($image->getContent()) .'" />';
}
@jetonr thank you, it works, but some images fail to display. I guess $image->getContent() returns an incorrect result for them.
Hi, I need some help extracting images from a PDF file. Parsing the text works great, and hats off to you for the job. Can you help me parse the images from the PDF file? Thanks in advance.
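A possible explanation for the failures, offered as a sketch rather than a confirmed diagnosis: a PDF image stream is a complete JPEG file only when it is stored with the DCTDecode filter. Images stored with other filters, such as FlateDecode, hold raw pixel samples, which a browser cannot render from a data URI and imagecreatefromstring() cannot read. The snippet below checks for the JPEG signature before building the data URI. The helper names isJpegStream and jpegDataUri are hypothetical and not part of PdfParser, and the commented loop assumes $images comes from getObjectsByType as in the code above.

```php
<?php
// Hypothetical helper: true when the bytes start with the JPEG
// start-of-image signature (FF D8 FF).
function isJpegStream(string $bytes): bool
{
    return strncmp($bytes, "\xFF\xD8\xFF", 3) === 0;
}

// Hypothetical helper: build a data URI for JPEG bytes, or return
// null when the bytes do not look like a JPEG file.
function jpegDataUri(string $bytes): ?string
{
    if (!isJpegStream($bytes)) {
        return null;
    }
    return 'data:image/jpeg;base64,' . base64_encode($bytes);
}

// Usage inside the loop (assumes $images was filled as shown earlier):
// foreach ($images as $image) {
//     $uri = jpegDataUri($image->getContent());
//     if ($uri === null) {
//         continue; // not a JPEG stream; needs pixel decoding instead
//     }
//     echo '<img src="' . $uri . '" />';
// }
```

Images that are skipped here would need their width, height, and color space read from the image object so the raw samples can be rebuilt into a picture, which this sketch does not attempt.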