Open bakkan opened 5 years ago
PHPWord uses the getimagesize() function to get image info, but getimagesize() doesn't support the EMF format. 😂😂
I am using phpword dev-master and see this error:
[Mon, 08 Apr 2019 09:26:50 +0700] [127.0.0.1] [Error(1): Uncaught exception 'PhpOffice\PhpWord\Exception\InvalidImageException' with message 'Invalid image: zip:///opt/lampp/temp/php4WlqvI#word/media/image1.emf' in /media/hongoctrien/DATA/MyHost/NukeViet/module-nvonlinetest-01.vn/vendor/phpoffice/phpword/src/PhpWord/Element/Image.php:418
Stack trace:
#0 /media/hongoctrien/DATA/MyHost/NukeViet/module-nvonlinetest-01.vn/vendor/phpoffice/phpword/src/PhpWord/Element/Image.php(149): PhpOffice\PhpWord\Element\Image->checkImage()
#1 [internal function]: PhpOffice\PhpWord\Element\Image->__construct('zip:///opt/lamp...')
#2 /media/hongoctrien/DATA/MyHost/NukeViet/module-nvonlinetest-01.vn/vendor/phpoffice/phpword/src/PhpWord/Element/AbstractContainer.php(145): ReflectionClass->newInstanceArgs(Array)
#3 [internal function]: PhpOffice\PhpWord\Element\AbstractContainer->addElement('Image', 'zip:///opt/lamp...')
#4 /media/hongoctrien/DATA/MyHost/NukeViet/module-nvonlinetest-01.vn/vendor/phpoffice/phpword/src/PhpWord/Element/AbstractContainer.php(112): call_user_func_array(Array] [FILE: /vendor/phpoffice/phpword/src/PhpWord/Element/Image.php] [LINE: 418]
Any news on this issue? Will this be addressed sooner or later?
I encountered this error just now. I guess EMF format is becoming more commonly used in modern docx files
The same problem for me today. Any news about this issue ?
There isn't any support for .emf file but there is a workaround
Workaround by code : PHPWord includes template processing for this.
include 'vendor/autoload.php';
$templateProcessor = new \PhpOffice\PhpWord\TemplateProcessor('test2.docx');
$templateProcessor->setValue('name', 'myvar');
$templateProcessor->saveAs('./xx.docx');
https://phpword.readthedocs.io/en/latest/templates-processing.html https://stackoverflow.com/a/53039632/4693790
You may write a prepareDocxReplaceEMF($docxPath) function that performs all of the following actions on a docx file before you work with it in PHPWord. Renaming the docx to zip is not needed.
Use PHP ZipArchive to extract "YOURDOC.docx\word\_rels\document.xml.rels" https://www.php.net/manual/en/ziparchive.extractto.php
Replace EMF references in file https://stackoverflow.com/a/69155428/4693790
Use PHP ZipArchive to zip document.xml.rels back https://www.php.net/manual/en/ziparchive.addfile.php
Use PHP ZipArchive to extract emf file https://www.php.net/manual/en/ziparchive.extractto.php
Use ImageMagick to convert the EMF FILE https://imagemagick.org/script/formats.php https://www.php.net/manual/fr/book.imagick.php
Use PHP ZipArchive to zip jpeg file back https://www.php.net/manual/en/ziparchive.addfile.php
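The steps above could be sketched roughly as follows. This is only an illustration under several assumptions: rewriteEmfTargets() is a hypothetical helper name, the regex assumes double-quoted Target attributes (as Word normally writes them), image targets are assumed to be relative to word/, and the Imagick build is assumed to have a working EMF delegate, which is not the case on every Linux install.

```php
<?php
/**
 * Hypothetical helper: rewrite every Target="....emf" in a .rels XML string.
 * Returns [new XML, map of old target => new target].
 */
function rewriteEmfTargets(string $relsXml, string $newExtension = 'jpg'): array
{
    $renamed = [];
    $newXml = preg_replace_callback(
        '/\bTarget="([^"]*)(\.emf)"/i',
        function (array $m) use (&$renamed, $newExtension): string {
            $newTarget = $m[1] . '.' . $newExtension;
            $renamed[$m[1] . $m[2]] = $newTarget;
            return 'Target="' . $newTarget . '"';
        },
        $relsXml
    );

    return [$newXml, $renamed];
}

/**
 * Sketch of prepareDocxReplaceEMF(): converts EMF images inside the docx
 * to JPEG and updates the relationships file. Modifies the file in place.
 */
function prepareDocxReplaceEMF(string $docxPath): void
{
    $relsPath = 'word/_rels/document.xml.rels';
    $zip = new ZipArchive();
    if ($zip->open($docxPath) !== true) {
        throw new RuntimeException("Cannot open $docxPath");
    }
    $relsXml = $zip->getFromName($relsPath);
    if ($relsXml === false) {
        $zip->close();
        throw new RuntimeException("Missing $relsPath");
    }
    [$newXml, $renamed] = rewriteEmfTargets($relsXml, 'jpg');

    // Phase 1: convert everything before touching the archive.
    $converted = [];
    foreach ($renamed as $oldTarget => $newTarget) {
        $emf = $zip->getFromName('word/' . $oldTarget);
        if ($emf === false) {
            $zip->close(); // nothing has been modified yet
            throw new RuntimeException("Missing word/$oldTarget");
        }
        $image = new Imagick();
        $image->readImageBlob($emf, 'image.emf'); // needs an EMF-capable delegate
        $image->setImageFormat('jpeg');
        $converted[$oldTarget] = $image->getImageBlob();
    }

    // Phase 2: swap the files and write the updated relationships.
    foreach ($renamed as $oldTarget => $newTarget) {
        $zip->deleteName('word/' . $oldTarget);
        $zip->addFromString('word/' . $newTarget, $converted[$oldTarget]);
    }
    $zip->addFromString($relsPath, $newXml);
    $zip->close();
}
```

If the resulting file must also open in Word, [Content_Types].xml needs a Default entry for the new extension; the sketch does not add one.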
Workaround that worked for me
private function removeImageReferences($zip, $placeholderImagePath)
{
    $relsPath = 'word/_rels/document.xml.rels';
    // Defined before the loop so it is also set when no image is found
    $placeholderImageTarget = 'media/placeholder.png';
    $relsContent = $zip->getFromName($relsPath);
    $relsXml = new SimpleXMLElement($relsContent);
    $imagePaths = [];
    // Note: this replaces every image relationship, not only EMF ones
    foreach ($relsXml->Relationship as $relationship) {
        if (strpos((string) $relationship['Type'], 'image') !== false) {
            // Store the original image path
            $imagePaths[] = 'word/' . $relationship['Target'];
            // Replace the image target with a placeholder image reference
            $relationship['Target'] = $placeholderImageTarget;
        }
    }
    // Update the relationships file
    $zip->deleteName($relsPath);
    $zip->addFromString($relsPath, $relsXml->asXML());
    // Delete the original image files
    foreach ($imagePaths as $imagePath) {
        $zip->deleteName($imagePath);
    }
    // Add the placeholder image to the zip archive
    $zip->addFile($placeholderImagePath, 'word/' . $placeholderImageTarget);
}
private function getPlaceholderImage()
{
    $placeholderImagePath = 'placeholder.png';
    if (!Storage::disk('local')->exists($placeholderImagePath)) {
        $width = 1;
        $height = 1;
        $color = [255, 255, 255]; // RGB value for white color
        $image = imagecreatetruecolor($width, $height);
        $color = imagecolorallocate($image, $color[0], $color[1], $color[2]);
        imagefilledrectangle($image, 0, 0, $width - 1, $height - 1, $color);
        ob_start();
        imagepng($image);
        $imageData = ob_get_contents();
        ob_end_clean();
        Storage::disk('local')->put($placeholderImagePath, $imageData);
    }
    return storage_path('app/' . $placeholderImagePath);
}
Then, before loading the document with PHPWord:
$tempFilePath = tempnam(sys_get_temp_dir(), 'doc');
file_put_contents($tempFilePath, $response->getBody()->getContents());
$zip = new ZipArchive();
$placeholderImagePath = $this->getPlaceholderImage();
$zip->open($tempFilePath);
$this->removeImageReferences($zip, $placeholderImagePath);
$zip->close();
$phpWord = IOFactory::load($tempFilePath);
Since this is unlikely to be fixed any time soon, given what seems to be poor support for EMF images in PHP, is it worth catching this error and replacing the image with a placeholder "image can't be found" image or message?
Then at least the library could be used for documents that contain an EMF image.
So, PHP's getimagesize and getimagesizefromstring accept the following formats: https://www.php.net/manual/fr/image.constants.php
EMF is not among them (nor is SVG...).
So this could be a PHP feature request, but in the meantime we could try to implement it in a "PHP-like" way in PHPWord.
In the PHP source code:
PHP_FUNCTION(getimagesize)
{
php_getimagesize_from_any(INTERNAL_FUNCTION_PARAM_PASSTHRU, FROM_PATH);
}
/* }}} */
/* {{{ Get the size of an image as 4-element array */
PHP_FUNCTION(getimagesizefromstring)
{
php_getimagesize_from_any(INTERNAL_FUNCTION_PARAM_PASSTHRU, FROM_DATA);
}
It then gets the stream and calls php_getimagesize_from_stream.
To find out which kind of file it is, that function then calls php_getimagetype.
For each defined type, it reads a specific number of bytes and compares them with the corresponding signature.
For example, for JPEG, the first 3 bytes should be PHPAPI const char php_sig_jpg[3] = {(char) 0xff, (char) 0xd8, (char) 0xff};
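To illustrate the signature check in PHP terms, here is a minimal sketch. The JPEG bytes are the php_sig_jpg values quoted above; the PNG and GIF entries and the detectImageType() name are added only for illustration and are not part of PHPWord or of PHP itself.

```php
<?php
// Leading-byte signatures per image type (illustrative subset).
const IMAGE_SIGNATURES = [
    'jpeg' => "\xFF\xD8\xFF",
    'png'  => "\x89PNG\r\n\x1A\n",
    'gif'  => 'GIF',
];

/**
 * Hypothetical helper: return the type whose signature matches the
 * start of $header, or null if none matches.
 */
function detectImageType(string $header): ?string
{
    foreach (IMAGE_SIGNATURES as $type => $signature) {
        if (strncmp($header, $signature, strlen($signature)) === 0) {
            return $type;
        }
    }

    return null;
}
```

An EMF entry would not fit this table as-is, because its " EMF" signature sits at offset 40 of the header rather than at the start of the file.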
It then applies an image-type-specific function to get the image size. For example, for the PSD image type:
"static struct gfxinfo *php_handle_psd (php_stream * stream)
{
struct gfxinfo *result = NULL;
unsigned char dim[8];
if (php_stream_seek(stream, 11, SEEK_CUR))
return NULL;
if (php_stream_read(stream, (char*)dim, sizeof(dim)) != sizeof(dim))
return NULL;
result = (struct gfxinfo *) ecalloc(1, sizeof(struct gfxinfo));
result->height = (((unsigned int)dim[0]) << 24) + (((unsigned int)dim[1]) << 16) + (((unsigned int)dim[2]) << 8) + ((unsigned int)dim[3]);
result->width = (((unsigned int)dim[4]) << 24) + (((unsigned int)dim[5]) << 16) + (((unsigned int)dim[6]) << 8) + ((unsigned int)dim[7]);
return result;
}"
Or for a BMP file:
"static struct gfxinfo *php_handle_bmp (php_stream * stream)
{
struct gfxinfo *result = NULL;
unsigned char dim[16];
int size;
if (php_stream_seek(stream, 11, SEEK_CUR))
return NULL;
if (php_stream_read(stream, (char*)dim, sizeof(dim)) != sizeof(dim))
return NULL;
size = (((unsigned int)dim[ 3]) << 24) + (((unsigned int)dim[ 2]) << 16) + (((unsigned int)dim[ 1]) << 8) + ((unsigned int) dim[ 0]);
if (size == 12) {
result = (struct gfxinfo *) ecalloc (1, sizeof(struct gfxinfo));
result->width = (((unsigned int)dim[ 5]) << 8) + ((unsigned int) dim[ 4]);
result->height = (((unsigned int)dim[ 7]) << 8) + ((unsigned int) dim[ 6]);
result->bits = ((unsigned int)dim[11]);
} else if (size > 12 && (size <= 64 || size == 108 || size == 124)) {
result = (struct gfxinfo *) ecalloc (1, sizeof(struct gfxinfo));
result->width = (((unsigned int)dim[ 7]) << 24) + (((unsigned int)dim[ 6]) << 16) + (((unsigned int)dim[ 5]) << 8) + ((unsigned int) dim[ 4]);
result->height = (((unsigned int)dim[11]) << 24) + (((unsigned int)dim[10]) << 16) + (((unsigned int)dim[ 9]) << 8) + ((unsigned int) dim[ 8]);
result->height = abs((int32_t)result->height);
result->bits = (((unsigned int)dim[15]) << 8) + ((unsigned int)dim[14]);
} else {
return NULL;
}
return result;
}"
So we could implement some glue code that recognises an EMF file either by its file extension (.emf) or by its leading bytes, and then reads the image dimensions from the header as described in the specification.
More precisely "1.3.1 Metafile Structure An EMF metafile begins with a EMR_HEADER record (section 2.3.4.2), which includes the metafile version, its size, the resolution of the device on which the picture was created, and it ends with an EMR_EOF record (section 2.3.4.1). Between them are records that specify the rendering of the image."
And then "2.3.4.2 EMR_HEADER Record Types
The EMR_HEADER record is the starting point of an EMF metafile. It specifies properties of the device on which the image in the metafile was recorded; this information in the header record makes it possible for EMF metafiles to be independent of any specific output device.
The following are the EMR_HEADER record types (Name, Section, Description):
EmfMetafileHeader, 2.3.4.2.1: The original EMF header record.
EmfMetafileHeaderExtension1, 2.3.4.2.2: The header record defined in the first extension to EMF, which added support for OpenGL records and an optional internal pixel format descriptor.<62>
EmfMetafileHeaderExtension2, 2.3.4.2.3: The header record defined in the second extension to EMF, which added the capability of measuring display dimensions in micrometers.<63>
EMF metafiles SHOULD be created with an EmfMetafileHeaderExtension2 header record.
The generic structure of EMR_HEADER records is specified as follows. ...
Type (4 bytes): An unsigned integer that identifies this record type as EMR_HEADER. This value is 0x00000001 ...
The value of the Size field can be used to distinguish between the different EMR_HEADER record types listed earlier in this section. There are three possible headers:
The EmfMetafileHeader record. The fixed-size part of this header is 88 bytes, and it contains a Header object (section 2.2.9).
The EmfMetafileHeaderExtension1 record. The fixed-size part of this header is 100 bytes, and it contains a Header object and a HeaderExtension1 object (section 2.2.10).
The EmfMetafileHeaderExtension2 record. The fixed-size part of this header is 108 bytes, and it contains a Header object, a HeaderExtension1 object, and a HeaderExtension2 object (section 2.2.11)."
Then in 2.2.9 "Bounds (16 bytes): A RectL object ([MS-WMF] section 2.2.2.19) that specifies the rectangular inclusive-inclusive bounds in logical units of the smallest rectangle that can be drawn around the image stored in the metafile."
Which leads us to https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-wmf/4813e7fd-52d0-4f42-965f-228c8b7488d2 section 2.2.2.19:
"2.2.2.19 RectL Object
The RectL Object defines a rectangle. ...
Left (4 bytes): A 32-bit signed integer that defines the x coordinate, in logical coordinates, of the upper-left corner of the rectangle.
Top (4 bytes): A 32-bit signed integer that defines the y coordinate, in logical coordinates, of the upper-left corner of the rectangle.
Right (4 bytes): A 32-bit signed integer that defines the x coordinate, in logical coordinates, of the lower-right corner of the rectangle.
Bottom (4 bytes): A 32-bit signed integer that defines y coordinate, in logical coordinates, of the lower-right corner of the rectangle.
A rectangle defined with a RectL Object is filled up to— but not including—the right column and bottom row of pixels"
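Putting the quoted sections together, a getimagesize-like reader for EMF could look like the sketch below. It is not the code that was proposed for PHPWord, only an illustration. The function names are hypothetical, and the offsets assume the EMR_HEADER layout of Type (4 bytes), Size (4 bytes), then Bounds (16 bytes) and Frame (16 bytes), followed by the " EMF" record signature at offset 40. The +1 follows the "inclusive-inclusive" wording of section 2.2.9; the result is in logical units, which may not be what a caller wants as pixel dimensions.

```php
<?php
/** Read a signed 32-bit little-endian integer (assumes 64-bit PHP). */
function readInt32Le(string $data, int $offset): int
{
    $value = unpack('V', substr($data, $offset, 4))[1];

    return $value >= 0x80000000 ? $value - 0x100000000 : $value;
}

/**
 * Hypothetical helper: return [width, height] from the Bounds rectangle
 * of an EMF header, or null if $data does not look like an EMF file.
 */
function emfImageSize(string $data): ?array
{
    // The smallest header variant (EmfMetafileHeader) is 88 bytes.
    if (strlen($data) < 88) {
        return null;
    }
    $type = unpack('V', substr($data, 0, 4))[1]; // EMR_HEADER = 0x00000001
    $signature = substr($data, 40, 4);            // 0x464D4520, i.e. " EMF"
    if ($type !== 1 || $signature !== ' EMF') {
        return null;
    }
    $left   = readInt32Le($data, 8);
    $top    = readInt32Le($data, 12);
    $right  = readInt32Le($data, 16);
    $bottom = readInt32Le($data, 20);

    // Bounds are inclusive-inclusive, hence the +1.
    return [$right - $left + 1, $bottom - $top + 1];
}
```

Usage would be along the lines of emfImageSize(file_get_contents($path)); only the first 88 bytes are actually needed.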
Hi Progi1984,
I haven't had time to set up the whole environment to test against the project's standards, but I wrote some glue code for getimagesize that works in my environment.
As the specification is a bit painful to read, I am copying the function below, hoping it helps you handle this ticket.
"/**
But this only solves the checkImage problem.
There is also another problem in parseImage in PhpWord/Shared/Html.php on line 960.
My bad, the image type should also be modified.
I got around this a year ago and it has never bothered me since. I prepare every docx using method 2, which I listed here: https://github.com/PHPOffice/PHPWord/issues/1480#issuecomment-1278708204
Well, EMF to JPEG is not a lossless conversion.
That's why I updated PHPWord to handle EMF images. But you're right: if you don't mind about image quality, your solution is a good workaround.
Does anyone have a document containing an EMF/WMF image, please?
I have one, but it belongs to my customer, so it can't be shared as is.
So I used the trial version of the Metafile Companion software to produce a random image, which I inserted into a random docx file. Docx with Emf Image for Test.docx
Hope it helps
This is:
Expected Behavior
Support EMF image.
Failure Information
Throws PhpOffice\PhpWord\Exception\InvalidImageException exception. Exception message : Invalid image: zip:///Users/xxx/Downloads/xxxx.docx#word/media/image.emf
0 /works/shared/laravel/vendor/phpoffice/phpword/src/PhpWord/Element/Image.php(149): PhpOffice\PhpWord\Element\Image->checkImage()
1 [internal function]: PhpOffice\PhpWord\Element\Image->__construct('zip:///Users/hu...', NULL, false, 'Picture 18')
How to Reproduce
The document file contains EMF format images. Googling "emf" led me to this page: https://fileinfo.com/extension/emf
Context