james-see / iptcinfo3

iptcinfo working for Python 3, finally. Do `pip3 install iptcinfo3`.

Keyword / Tags issue? #31

Closed bigbluemonster closed 11 months ago

bigbluemonster commented 3 years ago

When using:

```python
from iptcinfo3 import IPTCInfo

fn = 'Getty-Original.jpg'
info = IPTCInfo(fn)
print(info['keywords'])
```

I am getting the following output:

[b'Capital Cities,Travel,Tourism,Building Exterior,Cloudscape,City ']

However, right clicking the image, going into properties and the details tab, the tags are listed as:

Capital Cities; Travel; Tourism; Building Exterior; Cloudscape; City Life; Central Berlin; Berlin Cathedral; Alexanderplatz; Spree River; Downtown District; Nikolaiviertel; Television Tower - Berlin; Dusk; Twilight; Skyscraper; Blue; German Culture; Cultures; Famous Place; Architecture; Travel Destinations; Vacations; Urban Scene; Panoramic; Aerial View; Berlin; Germany; Europe; Tree; Sunlight; Sunset; Light - Natural Phenomenon; Summer; Cloud - Sky; Sun; Sky; River; Water; Cathedral; Church; Street; Bridge - Man Made Structure; Tower; Built Structure; Urban Skyline; Cityscape; City; Town; Nautical Vessel; Sunny; Capital Cities,Travel,Tourism,Building Exterior,Cloudscape,City;

Is there a reason why this is occurring and why each tag is not being read? If I remove the line:

Capital Cities,Travel,Tourism,Building Exterior,Cloudscape,City;

Then each tag shows. If I remove that line and add in:

Just,Testing,This,One,Tag,Line;

Then my output is:

[b'Capital Cities', b'Travel', b'Tourism', b'Building Exterior', b'Cloudscape', b'City Life', b'Central Berlin', b'Berlin Cathedral', b'Alexanderplatz', b'Spree River', b'Downtown District', b'Nikolaiviertel', b'Television Tower - Berlin', b'Dusk', b'Twilight', b'Skyscraper', b'Blue', b'German Culture', b'Cultures', b'Famous Place', b'Architecture', b'Travel Destinations', b'Vacations', b'Urban Scene', b'Panoramic', b'Aerial View', b'Berlin', b'Germany', b'Europe', b'Tree', b'Sunlight', b'Sunset', b'Light - Natural Phenomenon', b'Summer', b'Cloud - Sky', b'Sun', b'Sky', b'River', b'Water', b'Cathedral', b'Church', b'Street', b'Bridge - Man Made Structure', b'Tower', b'Built Structure', b'Urban Skyline', b'Cityscape', b'City', b'Town', b'Nautical Vessel', b'Sunny', b'Just,Testing,This,One,Tag,Line']

I am wondering if there is something wrong with the tags in the original image, and if so is there a way to detect this? I have to run this over a few hundred images to pull out the tags.
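For a batch of a few hundred files, one option is to flag keywords that look like a flattened or truncated list rather than a single tag. The sketch below is only a heuristic under stated assumptions: `suspicious_keywords` and `scan_folder` are hypothetical helper names, not part of iptcinfo3, and the 64-byte threshold comes from the IIM specification, which caps each Keywords entry (dataset 2:25) at 64 bytes. The single value returned above is exactly 64 bytes long, which may point to a longer comma-separated string that was cut off at that limit.

```python
from pathlib import Path

# The IIM spec limits each Keywords entry (dataset 2:25) to 64 bytes.
IIM_KEYWORD_MAX_BYTES = 64


def suspicious_keywords(keywords):
    """Return the keywords that look like a flattened or truncated list.

    Heuristic only: a keyword is flagged if it contains a list separator
    (comma or semicolon) or if it fills the whole 64-byte IIM field.
    """
    flagged = []
    for kw in keywords:
        raw = kw if isinstance(kw, bytes) else str(kw).encode("utf-8")
        if b"," in raw or b";" in raw or len(raw) >= IIM_KEYWORD_MAX_BYTES:
            flagged.append(kw)
    return flagged


def scan_folder(folder):
    """Map each *.jpg path in `folder` to its flagged keywords, if any."""
    # Imported lazily so suspicious_keywords() can be used without iptcinfo3.
    from iptcinfo3 import IPTCInfo

    report = {}
    for path in sorted(Path(folder).glob("*.jpg")):
        info = IPTCInfo(str(path))
        flagged = suspicious_keywords(info["keywords"])
        if flagged:
            report[str(path)] = flagged
    return report
```

A legitimate tag that happens to contain a comma (such as `Just,Testing,This,One,Tag,Line`) is flagged too, so the report is a list of files to inspect by hand, not a list of broken files.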

The image I am using is this:

Getty-Image

james-see commented 2 years ago

Yeah, I am wondering if there is a carriage return or other character throwing it off. I can test adding an extra function that does some more cleaning, but I don't want to end up ruining the original tags' intent.
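A conservative version of such a cleaning step might look like the sketch below. It is an illustration, not code from iptcinfo3: `clean_keywords` is a hypothetical name, and UTF-8 is an assumed encoding. It only decodes the bytes, drops NUL characters, and collapses whitespace (including carriage returns and newlines). It deliberately does not split on commas, so a tag that contains commas on purpose stays intact.

```python
def clean_keywords(keywords, encoding="utf-8"):
    """Return keywords as stripped strings without splitting any of them.

    Undecodable bytes are replaced rather than raising, NUL characters are
    removed, and any run of whitespace (spaces, tabs, CR, LF) becomes one space.
    """
    cleaned = []
    for kw in keywords:
        if isinstance(kw, bytes):
            text = kw.decode(encoding, errors="replace")
        else:
            text = str(kw)
        text = text.replace("\x00", "")
        text = " ".join(text.split())
        if text:  # skip entries that were only whitespace
            cleaned.append(text)
    return cleaned
```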

james-see commented 2 years ago

@bigbluemonster I was able to reproduce the issue myself. I tested the image at https://getpmd.iptc.org/upload and I believe I see what the issue is. What iptcinfo3 returns are the IIM (IPTC Photo Metadata) fields, whereas the full list is stored in the XMP Photo Metadata fields. I believe the IIM fields are stricter, while XMP comes from Adobe and can be a lot more arbitrary. So iptcinfo3 is actually showing you the correct thing.
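If the full list is needed, it can be read from the XMP packet instead of the IIM block. The following is a minimal standard-library sketch, not an iptcinfo3 feature: `xmp_keywords` is a hypothetical helper, and it assumes the keywords sit in `dc:subject` inside a single, uncompressed XMP packet embedded in the file (extended XMP split across several segments is not handled).

```python
import re
import xml.etree.ElementTree as ET

# Clark-notation names for the Dublin Core subject list and RDF list items.
DC_SUBJECT = "{http://purl.org/dc/elements/1.1/}subject"
RDF_LI = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}li"


def xmp_keywords(data):
    """Return the dc:subject entries of the first XMP packet in `data` (bytes)."""
    match = re.search(rb"<x:xmpmeta.*?</x:xmpmeta>", data, re.DOTALL)
    if match is None:
        return []  # no XMP packet found
    root = ET.fromstring(match.group(0))
    keywords = []
    for subject in root.iter(DC_SUBJECT):
        for item in subject.iter(RDF_LI):
            if item.text and item.text.strip():
                keywords.append(item.text.strip())
    return keywords
```

Usage would be along the lines of `xmp_keywords(open('Getty-Original.jpg', 'rb').read())`. Comparing that result with `info['keywords']` from iptcinfo3 shows where the two metadata blocks disagree.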

james-see commented 11 months ago

Closing this issue.