strukturag / libheif

libheif is an HEIF and AVIF file format decoder and encoder.

Iptc metadata #124

Open Arthur111 opened 5 years ago

Arthur111 commented 5 years ago

I saw heif_context_add_exif_metadata and heif_context_add_XMP_metadata but nothing about IPTC.

I know it has been replaced by XMP, but there are still pictures with this metadata inside.

heif_image_handle_get_metadata_content_type gives "application/rdf+xml"

How can I add this XML?

see : http://metadatadeluxe.pbworks.com/w/page/20792260/Photoshop%20Panels%20-%20IPTC%20and%20ARTstor

Thank you

Arthur111 commented 5 years ago

I mean, how can I convert this picture ( https://www.photograpix.fr/photograpix/wp-content/uploads/2012/08/BoxeThai-600x400.jpg ) from JPEG to HEIF using libheif while keeping ALL metadata?

Thank you

avibrazil commented 5 years ago

I'm also interested in this.

farindk commented 5 years ago

I am not aware of any way to store IPTC natively in HEIF images. The preferred way to do this seems to be to convert the data to XMP and then embed that into the HEIF.

farindk commented 5 years ago

BTW: the image BoxeThai-600x400.jpg does not seem to contain any metadata...

Arthur111 commented 5 years ago

Here is the image link: https://www.photograpix.fr/photograpix/wp-content/uploads/2012/08/BoxeThai.jpg You can download it and see the IPTC metadata.

farindk commented 5 years ago

Yes, thanks. The new link contains the metadata.

cgilles commented 5 years ago

Hi, I also vote for IPTC chunk support. Converting IPTC to XMP is never completely safe, as some tags have no equivalent in XMP. For example, IPTC has a preview tag that stores a reduced version of the image as JPEG, which does not exist in XMP.

ImageMagick does the same with PNG, which does not support Exif and IPTC in the standard. When you convert JPEG to PNG, the Exif, IPTC, and XMP byte arrays are copied into separate chunks to preserve all metadata. Converting back from PNG to JPEG restores all metadata. To do this, ImageMagick creates custom chunks in the PNG (the file format permits this).

It would be great if libheif could do the same for the IPTC byte array.

Best

Gilles Caulier, digiKam maintainer, who has just finalized HEIF support: https://www.reddit.com/r/kde/comments/dc7zt5/digikam_640_will_support_heic_image_format_with/?ref=share&ref_source=link

cgilles commented 5 years ago

In the Nokia HEIF specifications, it's clear:

Extensible to other metadata formats = Yes

https://nokiatech.github.io/heif/technical.html#download

So there is no reason not to support an IPTC chunk.

Gilles Caulier

farindk commented 5 years ago

> In Nokia HEIF specifications, it's clear: Extensible to other metadata formats = Yes

Yes, technically, it's not a problem. However, we should avoid a situation where every piece of software defines its own proprietary way to store IPTC data, so this should first be coordinated with other software authors. Does anyone know whom to contact in the IPTC group about this?

> For example, IPTC has a preview tag that stores a reduced version of the image as JPEG, which does not exist in XMP.

For a preview image, you should probably use the HEIC thumbnail instead, so that example does not convince me yet :-) Is there any other example of why the XMP version of IPTC is inferior?

cgilles commented 5 years ago

Well, the multi-page feature is not the only way to support a preview. IPTC has provided a standard way for 20 years, and we have plenty of photos carrying this metadata in formats such as TIFF, JPEG, PNG, and JPEG 2000.

You said to use the multi-page feature to add a preview of the image instead of IPTC. Well, Exif is supported in HEIF even though it embeds its own thumbnail in its chunk, separate from the HEIF thumbnail image.

XMP is not universal. Even though the XMP SDK from Adobe is published as an open-source-like tarball, XMP support is still optional, because some distros officially drop this package as the license is not clear. I am talking about the Exiv2 library, which is used to parse and edit Exif, IPTC, and XMP metadata.

So, not supporting IPTC is problematic for me.

Gilles Caulier

cgilles commented 5 years ago

Another point: if the libheif team does not want to add IPTC support itself, how can I do it with the libheif API, as an extra chunk hosted in the HEIF container? Something like heif_context_add_extra_metadata() with a custom chunk data argument and a string identifier for the chunk. The inverse method would also help.

fancycode commented 5 years ago

As @farindk mentioned above, there is no standardized way to store IPTC metadata in HEIF files. We surely could add something proprietary but that will not work with other apps supporting HEIF files. Instead it should be coordinated with the IPTC group to have a defined way to store this metadata that then can also be used by other apps.

farindk commented 5 years ago

We could indeed add an API to store and read custom boxes. It's not that easy because of handling the cross-references between boxes (e.g. indicating to which image some metadata belongs), but it can be done and would also allow us to handle other cases (e.g. the Samsung S10 images from Jos).

Then, @cgilles is free to save anything in his proprietary way and cause trouble for his users :-)

The IPTC group is very closed, unfortunately. One has to be a paying member (4740 EUR/y) before they even start listening. So chances are low that we can just define THE standard way to save IPTC without a lot of bureaucracy.

fancycode commented 5 years ago

> The IPTC group is very closed, unfortunately.

I'm assuming the ISO that defined the HEIF standard is not much better...

cgilles commented 5 years ago

Yes, the IPTC group is not an open-source-like team. Welcome to the real world...

I know that ImageMagick has added IPTC support to PNG by creating a non-standard chunk that is well identified and well supported by all low-level metadata frameworks, such as Exiv2 or ExifTool.

There is also another way to host IPTC in HEIF: Exif. Standard Exif has a tag to host the IPTC byte array as well. This will grow the Exif chunk, of course, and make the Exif data incompatible with other file formats on conversion if no rule is written to guard the Exif chunk size. JPEG, for example, only supports chunks of up to 65535 bytes. I have seen code in ImageMagick that limits the chunk written to HEIF to this size to prevent compatibility issues.

Q: what exactly is the size limit of a HEIF chunk?

Gilles Caulier

farindk commented 5 years ago

There is no practical limit on the HEIF box size (64bit length). I know that JPEG APP chunks are <64k, but data can be distributed over several consecutive chunks.
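To make the two size limits concrete, here is a minimal Python sketch, not libheif code. It assumes the generic ISO base media file format box header (a 32-bit size that switches to a 64-bit "largesize" when the size field is 1) and the JPEG rule that a segment's 16-bit length field counts its own two bytes. The function names are made up for illustration, and real JPEG APP segments also carry an identifier header that reduces the usable payload further.

```python
import struct
from typing import List

# A JPEG segment length field is 16 bits and includes its own 2 bytes.
JPEG_MAX_SEGMENT_PAYLOAD = 0xFFFF - 2


def split_for_jpeg_segments(payload: bytes,
                            max_payload: int = JPEG_MAX_SEGMENT_PAYLOAD) -> List[bytes]:
    """Cut a payload into pieces that each fit into one JPEG segment."""
    return [payload[i:i + max_payload] for i in range(0, len(payload), max_payload)]


def box_header(box_type: bytes, payload_size: int) -> bytes:
    """Build a box header, using the 64-bit 'largesize' form only when required."""
    if len(box_type) != 4:
        raise ValueError("box type must be exactly 4 bytes")
    total = 8 + payload_size  # compact header: 4-byte size + 4-byte type
    if total <= 0xFFFFFFFF:
        return struct.pack(">I4s", total, box_type)
    # size == 1 signals that a 64-bit largesize follows the type field,
    # so the header itself grows to 16 bytes.
    return struct.pack(">I4sQ", 1, box_type, 16 + payload_size)
```

With a payload of 259696 bytes (the IPTC size reported later in this thread), `split_for_jpeg_segments` produces four pieces, whereas a single box header can describe the whole payload.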

@cgilles: as you probably have more experience with IPTC than me, could you make a proposal for what such a chunk could look like? Preferably, we should be as close as possible to the PNG chunk format. However, I did a quick search for any spec for IPTC in PNG and also could not find anything.

We have a partner company that is using HEIF and IPTC on a large scale. I'll also ask them about their opinion.

farindk commented 5 years ago

I did some more searching about IPTC (IIM) vs. XMP, and it seems that XMP superseded the IIM stream format (e.g. see here https://en.wikipedia.org/wiki/IPTC_Information_Interchange_Model).

It appears that for other formats, software had to make sure to keep IIM and XMP in sync. However, with a new format like HEIF, I do not think it is wise to keep the old IIM stream simply because it is easy to just copy it over from JPEG. Instead it should be converted to XMP. If there is any information that would be lost during this conversion, we have to think about it, but so far, I did not see a convincing case.

There seems to be a guideline document about how different sources of metadata (EXIF, XMP, IPTC) should be handled, but the server is currently down: http://www.metadataworkinggroup.org/pdf/mwg_guidance.pdf

Here is the IIM standard and the mapping to XMP: https://iptc.org/standards/iim/

fancycode commented 5 years ago

So the solution would be to support XMP metadata (which we already do afaik) and have a small wrapper that can convert between IPTC and XMP? That wrapper wouldn't even have to be part of libheif and maybe there is existing code.

cgilles commented 5 years ago

IPTC is very simple to host in a file: it's just a byte array, like Exif. IPTC is, like Exif, an old-school binary format. So to put it into or get it from HEIF, just do the same as for Exif, using a special ID.
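As an illustration of what that byte array looks like, the following Python sketch walks an IIM stream. It assumes the standard dataset layout (a 0x1C marker, a record number, a dataset number, and a 2-byte big-endian length) and deliberately does not handle extended-length datasets. The helper names and the sample values are made up for the example.

```python
import struct
from typing import Iterator, Tuple

IIM_TAG_MARKER = 0x1C  # every IIM dataset starts with this byte


def make_dataset(record: int, dataset: int, value: bytes) -> bytes:
    """Serialize one standard (short-length) IIM dataset."""
    return struct.pack(">BBBH", IIM_TAG_MARKER, record, dataset, len(value)) + value


def iter_iim_datasets(stream: bytes) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (record, dataset, value) for each standard dataset in the stream."""
    pos = 0
    while pos + 5 <= len(stream):
        marker, record, dataset, length = struct.unpack_from(">BBBH", stream, pos)
        if marker != IIM_TAG_MARKER:
            break  # not a dataset header; stop rather than guess
        if length & 0x8000:
            raise NotImplementedError("extended-length datasets are not handled here")
        pos += 5
        value = stream[pos:pos + length]
        if len(value) != length:
            raise ValueError("truncated dataset")
        yield record, dataset, value
        pos += length


# Sample stream: 2:05 is the Object Name dataset, 2:25 is Keywords.
sample = make_dataset(2, 5, b"Sample title") + make_dataset(2, 25, b"sport")
datasets = list(iter_iim_datasets(sample))
```

Because the stream is self-describing in this way, a container only needs to carry the bytes unchanged; interpreting them can be left to a metadata library.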

Converting IPTC to XMP is delegated to metadata libraries such as Exiv2 (which we use in digiKam). This is already partially implemented in Exiv2, but to be honest, it's a complex job with multiple cases depending on how proprietary software uses the tags. So IPTC-to-XMP conversion must always be treated as an unsafe operation, and from experience it is not recommended without keeping a backup in the target file. This is essentially the original request in this report. In digiKam, when we convert IPTC to XMP, we always store the original IPTC as well, just in case.

And don't forget that XMP is not always supported, as it's optional due to the Adobe XMP SDK licence, which is not always accepted by distro packagers. In Exiv2 (and digiKam), XMP is optional at compilation time. So in this kind of situation, supporting IPTC is the only way to convert a file to HEIF without losing metadata.

Gilles Caulier

farindk commented 5 years ago

Please have a look at the 'iptc' branch that I added. There is a new API function 'add_generic_metadata()' which you could use to attach IPTC streams as another chunk for metadata. I suggest that you use the item_type="iptc" and content_type=NULL.

Reading the data back works via enumerating the metadata blocks and checking heif_image_handle_get_metadata_type() against "iptc".

cgilles commented 5 years ago

Hi,

I backported your IPTC branch to the digiKam HEIF image loader, and:

1/ Writing seems to work:

digikam.general: startSavingAs called
digikam.general: Writing file to QUrl("file:///mnt/data/photos/TESTS/METADATA/LR/LR.heic")
digikam.widgets: Trying to discover format based on filename ' "LR.heic" ', fallback = 0
digikam.widgets: Discovered format: 6
digikam.widgets: Format selected: "LR.heic"
digikam.widgets: Trying to discover format based on filename ' "LR.heic" ', fallback = 0
digikam.widgets: Discovered format: 6
digikam.general: Trying to find a saving format from targetUrl = QUrl("file:///mnt/data/photos/TESTS/METADATA/LR/LR.heic")
digikam.general: Qt Offered types: ".bmp .bw .cur .eps .epsf .epsi .icns .ico .pbm .pcx .pgm .pic .png .ppm .rgb .rgba .sgi .tga .wbmp .webp .xbm .xpm .tiff .tif .jpg .jpeg .jpe .jp2 .j2k .jpx .pgx .pgf .heic .heif"
digikam.general: Writable formats: ("bmp", "bw", "cur", "eps", "epsf", "epsi", "icns", "ico", "pbm", "pcx", "pgm", "pic", "png", "ppm", "rgb", "rgba", "sgi", "tga", "wbmp", "webp", "xbm", "xpm", "tiff", "tif", "jpg", "jpeg", "jpe", "jp2", "j2k", "jpx", "pgx", "pgf", "heic", "heif")
digikam.general: Possible format from local file: "heic"
digikam.general: Using format from target url "heic"
digikam.geoiface: ----
digikam.general: Saving to : "/mnt/data/photos/TESTS/METADATA/LR/EditorWindow-YldHVH.digikamtempfile.heic" ( "heic" )
digikam.general: Saving file "/mnt/data/photos/TESTS/METADATA/LR/EditorWindow-YldHVH.digikamtempfile.heic" at -1
digikam.dimg: Prepare Metadata to save for "/mnt/data/photos/TESTS/METADATA/LR/LR.heic"
digikam.metaengine: JPEG image preview size: ( 1280 x 851 ) pixels - 259558 bytes
digikam.dimg: Saving to "/mnt/data/photos/TESTS/METADATA/LR/EditorWindow-YldHVH.digikamtempfile.heic" with format: "heic"
Check HEVC encoder for 16 bits encoding...
Check HEVC encoder for 14 bits encoding...
Check HEVC encoder for 12 bits encoding...
Check HEVC encoder for 10 bits encoding...
Check HEVC encoder for 8 bits encoding...
HEVC encoder max bits depth: 8
HEVC encoder setup...
HEIF set color profile...
Stored HEIF color profile size: 3144
HEIF setup data plane...
HEIF data container: 0x7f8824232010
HEIF bytes per line: 9024
HEIF output bytes per color: 3
HEIF 16 to 8 bits coeff. : 8
HEIF 8 to 16 bits coeff. : 0
HEIF master image encoding...
HEIF preview storage in thumbnail chunk...
HEIF metadata storage...
Stored HEIF Exif data size: 6998
Stored HEIF Iptc data size: 259696
Stored HEIF Xmp data size: 354154
HEIF flush to file...

2/ Loading does not work: only 2 chunks are found (Exif and XMP):

digikam.dimg: "/mnt/data/photos/TESTS/METADATA/LR/LR.heic" : "HEIF" file identified
Found 2 HEIF metadata chunck
Parsing HEIF metadata chunck: Exif
HEIF exif container found with size: 6998
Parsing HEIF metadata chunck: mime
HEIF xmp container found with size: 354154
HEIF color profile found with size: 3144
HEIF image size: ( 3008 x 2000 )
Decoded HEIF image properties: size( 3008 x 2000 ), Alpha: false , Color depth : 8
HEIF data container: 0x7fd245385010
HEIC bytes per line: 9024
Color bytes depth: 8
Color multiplier: 1

The HEIF file generated from the JPEG file is available here:

https://drive.google.com/open?id=1GPI7RGlrL6uIusEfyDbmxxJeOL-9YLRE

(upload in progress)

Gilles Caulier

cgilles commented 5 years ago

More info:

Code to load metadata:

https://invent.kde.org/kde/digikam/blob/master/core/dplugins/dimg/heif/dimgheifloader_load.cpp#L174

Code to write metadata:

https://invent.kde.org/kde/digikam/blob/master/core/dplugins/dimg/heif/dimgheifloader_save.cpp#L457

Gilles Caulier

farindk commented 5 years ago

Yes, right, I forgot something. Please get the latest code from the 'iptc' branch and run it again.

cgilles commented 5 years ago

Excellent. It works perfectly now. Thank you very much!

https://imgur.com/WkiZqPP

Gilles Caulier

Arthur111 commented 5 years ago

According to Mr Caulier, this issue seems to be solved. Thanks to him. So this modification can now be merged into master, and I will close this issue.

cgilles commented 5 years ago

Yes, I confirm this report can be closed once the code has been merged from the iptc branch into master.

Gilles Caulier

farindk commented 2 years ago

There is currently some discussion about adding IPTC officially to the HEIF specification: https://github.com/MPEGGroup/FileFormat/issues/54

@cgilles: could you please share your experience with this from your implementation in digiKam? Is the IPTC metadata that you store in HEIF files actually used? How disruptive would it be if the ID of the IPTC metadata in HEIF files changed?

PS: I have quickly looked into the DigiKam source code. I see the code for writing IPTC (digikam/core/dplugins/dimg/heif/dimgheifloader_save.cpp:570), but I could not find the code that is reading it back.

cgilles commented 2 years ago

Hi,

Reading IPTC (like Exif and XMP) is performed by the Exiv2 library, which does not yet support write operations on HEIF.

It's not a problem if a standard way is provided to store IPTC in HEIF. The metadata saved in HEIF by digiKam is for backup purposes. So don't hesitate to use a standard way for IPTC; we will adjust the code in digiKam accordingly.

Currently, we embed libheif in the digiKam core, and we need to switch to an up-to-date external dependency instead.

Best

Gilles Caulier

farindk commented 2 years ago

Thanks, @cgilles. Yes, it also seems fine to me to still replace the iptc ID with an official one. Especially since Exiv2 has not merged heif support yet into their master branch. I would probably provide an API function to find the iptc data which will accept both IDs for backwards compatibility.
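The backwards-compatible lookup described here can be sketched in a few lines of Python. This is only a model of the idea, not libheif code: the metadata blocks are represented as plain (item type, payload) pairs, the function name is made up, and "official-id-placeholder" stands in for an official identifier that has not been defined in this thread.

```python
from typing import Iterable, Optional, Sequence, Tuple

MetadataBlock = Tuple[str, bytes]  # (item type, raw payload)

LEGACY_IPTC_TYPE = "iptc"  # item type suggested earlier in this thread
# Placeholder only: no official identifier is given in this discussion.
HYPOTHETICAL_OFFICIAL_TYPE = "official-id-placeholder"


def find_iptc_block(
    blocks: Iterable[MetadataBlock],
    accepted_types: Sequence[str] = (HYPOTHETICAL_OFFICIAL_TYPE, LEGACY_IPTC_TYPE),
) -> Optional[bytes]:
    """Return the payload of the first block whose type is one of the accepted IDs."""
    for item_type, payload in blocks:
        if item_type in accepted_types:
            return payload
    return None


# Blocks as they might be enumerated from a file written with the 'iptc' branch.
blocks = [("Exif", b"exif-bytes"), ("mime", b"xmp-bytes"), ("iptc", b"iim-bytes")]
legacy_payload = find_iptc_block(blocks)
```

Files written with the older ID keep working because the lookup checks every accepted ID instead of a single fixed string.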

avibrazil commented 2 years ago

Meanwhile exiftool has full HEIF support for reading and writing all types of tags.

farindk commented 2 years ago

@avibrazil According to https://exiftool.org/#supported, Exiftool does not support IPTC in HEIF.

avibrazil commented 2 years ago

Hmm, maybe you are right. I need to check my tagged HEIFs to confirm what I've achieved. But I can assure you that exiftool now supports XMP writing to HEIF. XMP is a broader and more modern standard than IPTC, and as far as I can remember, all IPTC tags are also supported by XMP.

cgilles commented 2 years ago

No, not all IPTC tags are supported by XMP, even if 85% are. This is why IPTC is still important to support for interoperability. Of course, XMP should be used instead where possible...