Frameright / php-image-metadata-parser

PHP image metadata parsing library (XMP, IPTC, Exif)
https://docs.frameright.io/php
MIT License

Failing to read image regions from image with additional metadata #25

Open klaari opened 5 months ago

klaari commented 5 months ago

I'm not able to read image regions from this specific image: [removed]

Seems like a conflict with metadata from other tools.

If I delete all metadata (with exiftool -all= -overwrite_original) and create the regions again with the Frameright app, everything works fine. Reading the original image with your TypeScript library also works fine.

How to reproduce:

$jpeg = JPEG::fromFile(
    __DIR__ . '/../Fixtures/metadata_failing.jpg');

$xmp = $jpeg->getXmp();
var_dump($xmp->getImageRegions());
// returns an empty array instead of the image regions

exiftool shows that the regions exist in the metadata:

ExifTool Version Number         : 11.88
File Name                       : metadata_failing.jpg
Directory                       : tests/Fixtures
File Size                       : 5.0 MB
File Modification Date/Time     : 2024:04:11 08:37:27+03:00
File Access Date/Time           : 2024:04:11 09:59:47+03:00
File Inode Change Date/Time     : 2024:04:11 09:59:30+03:00
File Permissions                : rw-rw-r--
File Type                       : JPEG
File Type Extension             : jpg
MIME Type                       : image/jpeg
JFIF Version                    : 1.01
Comment                         : PARIS, FRANCE - MARCH 05: Christine Centenera wears a blue with white embroidered pattern cap from Balenciaga, black sunglasses, a white t-shirt, a beige long / belted with shoulder-pads coat, black shiny leather pointed / heels knees boots / high boots, a gold ring, outside Vivienne Westwood , during Paris Fashion Week - Womenswear F/W 2022-2023, on March 05, 2022 in Paris, France. (Photo by Edward Berthelot/Getty Images)
X Resolution                    : 300
Displayed Units X               : inches
Y Resolution                    : 300
Displayed Units Y               : inches
Current IPTC Digest             : 427f4f1de07eaf1ec0da65d7181c8707
Coded Character Set             : UTF8
Envelope Record Version         : 4
By-line                         : Edward Berthelot
By-line Title                   : Contributor
Caption-Abstract                : PARIS, FRANCE - MARCH 05: Christine Centenera wears a blue with white embroidered pattern cap from Balenciaga, black sunglasses, a white t-shirt, a beige long / belted with shoulder-pads coat, black shiny leather pointed / heels knees boots / high boots, a gold ring, outside Vivienne Westwood , during Paris Fashion Week - Womenswear F/W 2022-2023, on March 05, 2022 in Paris, France. (Photo by Edward Berthelot/Getty Images)
Writer-Editor                   : XX / XX
Copyright Notice                : 2022 Edward Berthelot
Country-Primary Location Name   : France
Country-Primary Location Code   : FRA
Special Instructions            : Not Released (NR)
Keywords                        : apple, burgundy lipstick, elegant, fashion blogger, fashion outfit, ginger hair, long hiar, outfit, paris, ready-to-wear, rtw, smartphone, spring outfit, style, woman
Original Transmission Reference : 775778623
Object Name                     : 1381031758
Application Record Version      : 4
Exif Byte Order                 : Little-endian (Intel, II)
Image Description               : PARIS, FRANCE - MARCH 05: Christine Centenera wears a blue with white embroidered pattern cap from Balenciaga, black sunglasses, a white t-shirt, a beige long / belted with shoulder-pads coat, black shiny leather pointed / heels knees boots / high boots, a gold ring, outside Vivienne Westwood , during Paris Fashion Week - Womenswear F/W 2022-2023, on March 05, 2022 in Paris, France. (Photo by Edward Berthelot/Getty Images)
Resolution Unit                 : inches
Software                        : Adobe Photoshop Lightroom Classic 11.2 (Macintosh)
Modify Date                     : 2022:03:06 02:00:17
Exif Version                    : 0232
Date/Time Original              : 2022:03:05 12:53:47
Create Date                     : 2022:03:05 12:53:47
Offset Time                     : +01:00
Offset Time Original            : +01:00
Offset Time Digitized           : +01:00
Sub Sec Time Original           : 864
Sub Sec Time Digitized          : 864
Color Space                     : Uncalibrated
Profile CMM Type                : Linotronic
Profile Version                 : 2.1.0
Profile Class                   : Display Device Profile
Color Space Data                : RGB
Profile Connection Space        : XYZ
Profile Date Time               : 1998:02:09 06:49:00
Profile File Signature          : acsp
Primary Platform                : Microsoft Corporation
CMM Flags                       : Not Embedded, Independent
Device Manufacturer             : Hewlett-Packard
Device Model                    : sRGB
Device Attributes               : Reflective, Glossy, Positive, Color
Rendering Intent                : Perceptual
Connection Space Illuminant     : 0.9642 1 0.82491
Profile Creator                 : Hewlett-Packard
Profile ID                      : 0
Profile Copyright               : Copyright (c) 1998 Hewlett-Packard Company
Profile Description             : sRGB IEC61966-2.1
Media White Point               : 0.95045 1 1.08905
Media Black Point               : 0 0 0
Red Matrix Column               : 0.43607 0.22249 0.01392
Green Matrix Column             : 0.38515 0.71687 0.09708
Blue Matrix Column              : 0.14307 0.06061 0.7141
Device Mfg Desc                 : IEC http://www.iec.ch
Device Model Desc               : IEC 61966-2.1 Default RGB colour space - sRGB
Viewing Cond Desc               : Reference Viewing Condition in IEC61966-2.1
Viewing Cond Illuminant         : 19.6445 20.3718 16.8089
Viewing Cond Surround           : 3.92889 4.07439 3.36179
Viewing Cond Illuminant Type    : D50
Luminance                       : 76.03647 80 87.12462
Measurement Observer            : CIE 1931
Measurement Backing             : 0 0 0
Measurement Geometry            : Unknown
Measurement Flare               : 0.999%
Measurement Illuminant          : D65
Technology                      : Cathode Ray Tube Display
Red Tone Reproduction Curve     : (Binary data 2060 bytes, use -b option to extract)
Green Tone Reproduction Curve   : (Binary data 2060 bytes, use -b option to extract)
Blue Tone Reproduction Curve    : (Binary data 2060 bytes, use -b option to extract)
XMP Toolkit                     : Frameright XMP Toolkit 2.0.0
Asset ID                        : 1381031758
Dlref                           : 95yWeO3K9FRUOZUV93Ok9g==
Image Rank                      : 2
Metadata Date Xmlns             : http://ns.adobe.com/xap/1.0/
Metadata Date                   : 2024:04:09 10:47:51.852Z
Modify Date Xmlns               : http://ns.adobe.com/xap/1.0/
Country Code                    : FRA
Person In Image                 : Christine Centenera
Google Vision                   : 
Creator                         : Edward Berthelot
Description                     : PARIS, FRANCE - MARCH 05: Christine Centenera wears a blue with white embroidered pattern cap from Balenciaga, black sunglasses, a white t-shirt, a beige long / belted with shoulder-pads coat, black shiny leather pointed / heels knees boots / high boots, a gold ring, outside Vivienne Westwood , during Paris Fashion Week - Womenswear F/W 2022-2023, on March 05, 2022 in Paris, France. (Photo by Edward Berthelot/Getty Images)
Format                          : image/jpeg
Rights                          : 2022 Edward Berthelot
Subject                         : apple, burgundy lipstick, elegant, fashion blogger, fashion outfit, ginger hair, long hiar, outfit, paris, ready-to-wear, rtw, smartphone, spring outfit, style, woman
Title                           : 1381031758
Elvis ID                        : 8Mw0Nija4T38Sf6nY6lt-a
Author                          : Edward Berthelot
Copyright                       : 2022 Edward Berthelot
Authors Position                : Contributor
Caption Writer                  : XX / XX
Category                        : E
City                            : Paris
Copyright Flag                  : true
Country                         : France
Credit                          : Getty Images
Date Created                    : 2022:03:05 00:00:00+00:00
Headline                        : Street Style : Day Six - Paris Fashion Week - Womenswear F/W 2022-2023
Instructions                    : Not Released (NR)
Source                          : Getty Images Europe
Supplemental Categories         : CEL
Supplemental Category           : CEL
Transmission Reference          : 775778623
Url                             : https://www.gettyimages.com
Urgency                         : 2
Licensor URL                    : https://www.gettyimages.com/eula?utm_medium=organic&utm_source=google&utm_campaign=iptcurl
Terms And Conditions URL        : https://www.gettyimages.com/eula?utm_medium=organic&utm_source=google&utm_campaign=iptcurl
Keyword                         : apple, burgundy lipstick, elegant, fashion blogger, fashion outfit, ginger hair, long hiar, outfit, paris, ready-to-wear, rtw, smartphone, spring outfit, style, woman
Credit Line                     : Getty Images
Creator Tool                    : Edward Berthelot
Rating                          : 0
Document ID                     : xmp.did:e1c022c1-dfca-47cf-8c93-1f80457ddb53
Instance ID                     : xmp.iid:e1c022c1-dfca-47cf-8c93-1f80457ddb53
Original Document ID            : xmp.did:e1c022c1-dfca-47cf-8c93-1f80457ddb53
Web Statement                   : https://www.gettyimages.com/eula?utm_medium=organic&utm_source=google&utm_campaign=iptcurl
Image Region Boundary Shape     : Rectangle, Rectangle, Rectangle, Rectangle, Rectangle, Rectangle
Image Region Boundary Unit      : Relative, Relative, Relative, Relative, Relative, Relative
Image Region Boundary W         : 0.7179294823705926, 0.8162040510127532, 0.8447111777944486, 0.6882970742685671, 0.9167291822955739, 0.822205551387847
Image Region Boundary H         : 0.4785, 0.306, 0.42225, 0.81575, 0.8145, 0.287
Image Region Boundary X         : 0.11890472618154539, 0.0967741935483871, 0.08289572393098274, 0.1436609152288072, 0.041635408852213056, 0.09152288072018004
Image Region Boundary Y         : 0.13225, 0.16025, 0.1405, 0.11375, 0.11375, 0.15675
Image Region ID                 : crop-0fd40a5b-ad5f-4b29-9ab2-43afc21b44a6, crop-4c732f66-2884-4fb2-86ae-2fcc332349ae, crop-96109636-028d-455c-ac90-c0c9d1287965, crop-6198987b-009b-499d-97fb-8cbad064a4e4, crop-7802d940-6da5-424a-909b-148421ebaf40, crop-b9385090-7c1e-4125-81d0-7837af4eb4cd
Image Region Role Identifier    : http://cv.iptc.org/newscodes/imageregionrole/cropping, http://cv.iptc.org/newscodes/imageregionrole/cropping, http://cv.iptc.org/newscodes/imageregionrole/cropping, http://cv.iptc.org/newscodes/imageregionrole/cropping, http://cv.iptc.org/newscodes/imageregionrole/cropping, http://cv.iptc.org/newscodes/imageregionrole/cropping
Image Region Region Definition Id: definition-0dae7c70-f936-49ad-80d2-a9f0f6c0fcdb
Image Region Region Name        : 1:1 Square (Common sizes)
Image Width                     : 2666
Image Height                    : 4000
Encoding Process                : Baseline DCT, Huffman coding
Bits Per Sample                 : 8
Color Components                : 3
Y Cb Cr Sub Sampling            : YCbCr4:2:0 (2 2)
Image Size                      : 2666x4000
Megapixels                      : 10.7
Create Date                     : 2022:03:05 12:53:47.864+01:00
Date/Time Original              : 2022:03:05 12:53:47.864+01:00
Modify Date                     : 2022:03:06 02:00:17+01:00

lourot commented 5 months ago

Hi @klaari, sorry about that and thanks for the detailed issue. Let me have a look later today and get back to you.

lourot commented 5 months ago

I can reproduce:

$image = Image::fromFile('reproducer.jpg');
$xmp_metadata = $image->getXmp(); // I can see all the image regions, but...
$regions = $xmp_metadata->getImageRegions();
print_r($regions); // prints an empty array

Investigating...

lourot commented 5 months ago

This image contains two occurrences of <rdf:Description xmlns:Iptc4xmpExt ... and the regions are in the second one. The current PHP metadata parser doesn't expect several rdf:Description elements and looks only at the first one.

This is quite easy to fix: extend the parser so that it no longer assumes a single rdf:Description and instead iterates over all of them when looking for Iptc4xmpExt:* data. I'm writing the fix now.
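
To illustrate the idea of iterating over every rdf:Description, here is a minimal sketch using PHP's built-in DOM extension. It is not the library's actual code: collectRegionIds() and $samplePacket are hypothetical names made up for this example. It also assumes each region is serialised as an rdf:li with rdf:parseType="Resource" and that Iptc4xmpExt:rId is written as a child element rather than an attribute.

```php
<?php
declare(strict_types=1);

const RDF_NS      = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#';
const IPTC_EXT_NS = 'http://iptc.org/std/Iptc4xmpExt/2008-02-29/';

/**
 * Hypothetical helper (not part of the library's API): returns the rId of
 * every image region found in any rdf:Description of an XMP packet.
 *
 * @return string[]
 */
function collectRegionIds(string $xmpPacket): array
{
    $dom = new DOMDocument();
    if (!$dom->loadXML($xmpPacket)) {
        throw new InvalidArgumentException('Invalid XMP packet');
    }

    $xpath = new DOMXPath($dom);
    $xpath->registerNamespace('rdf', RDF_NS);
    $xpath->registerNamespace('Iptc4xmpExt', IPTC_EXT_NS);

    $ids = [];
    // Visit every rdf:Description instead of stopping at the first one.
    foreach ($xpath->query('//rdf:Description') as $description) {
        // Look for regions relative to the current rdf:Description only.
        $regions = $xpath->query(
            'Iptc4xmpExt:ImageRegion/rdf:Bag/rdf:li',
            $description
        );
        foreach ($regions as $region) {
            $id = $xpath->query('Iptc4xmpExt:rId', $region)->item(0);
            if ($id !== null) {
                $ids[] = trim($id->textContent);
            }
        }
    }

    return $ids;
}

// Made-up packet: the regions live in the second rdf:Description.
$samplePacket = <<<'XML'
<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:Iptc4xmpExt="http://iptc.org/std/Iptc4xmpExt/2008-02-29/">
   <Iptc4xmpExt:PersonInImage>
    <rdf:Bag><rdf:li>Someone</rdf:li></rdf:Bag>
   </Iptc4xmpExt:PersonInImage>
  </rdf:Description>
  <rdf:Description rdf:about=""
    xmlns:Iptc4xmpExt="http://iptc.org/std/Iptc4xmpExt/2008-02-29/">
   <Iptc4xmpExt:ImageRegion>
    <rdf:Bag>
     <rdf:li rdf:parseType="Resource">
      <Iptc4xmpExt:rId>crop-1</Iptc4xmpExt:rId>
     </rdf:li>
     <rdf:li rdf:parseType="Resource">
      <Iptc4xmpExt:rId>crop-2</Iptc4xmpExt:rId>
     </rdf:li>
    </rdf:Bag>
   </Iptc4xmpExt:ImageRegion>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>
XML;

$ids = collectRegionIds($samplePacket);
print_r($ids);
```

A parser that only inspected the first rdf:Description would find no regions in this packet, which matches the empty array seen above.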

@klaari can I use your image as a test fixture in order to write a unit test covering images with several rdf:Description elements? I would mention here that you own this image.

lourot commented 5 months ago

FYI the fix and the test are ready, pushed in a temporary branch: https://github.com/Frameright/php-image-metadata-parser/commit/7acf46cc54

As soon as I get the green light from you about using your image as a test fixture, I'll push this to the main branch and release a new version. Thanks a lot for the detailed issue, which made it very easy for us to reproduce and investigate :pray:

klaari commented 5 months ago

Thanks for the quick reply and fix @lourot !

I'm sorry, but I do not own the image, and it appears that the license does not allow its use :/

ilu commented 5 months ago

@klaari No worries, thank you for your patience and apologies for any inconvenience :)

@lourot Great work! Let me see if during the weekend I could recreate an image file with a similar metadata setup, which we could use as a fixture instead.

lourot commented 5 months ago

Alright, no problem. How about I comment out the test for now, remove the image, and publish a new version right away to unblock @klaari? Then we can re-enable the test whenever we have such an image.

ilu commented 5 months ago

@lourot Sounds good, let's do so.

lourot commented 5 months ago

1.1.2 published:

Let us know if that works for you or if you're still bumping into issues @klaari :pray:

Keeping this issue open until we have re-enabled the test.

klaari commented 5 months ago

Thank you @lourot for the super fast delivery!