Beep6581 / RawTherapee

A powerful cross-platform raw photo processing program
https://rawtherapee.com
GNU General Public License v3.0

Exif comment not saved to compressed TIFF files #2017

Closed Beep6581 closed 6 years ago

Beep6581 commented 9 years ago

-- UPDATE 2017-12-28 In summary, it seems that currently RT treats compressed TIFF files as if tunnel metadata was always enabled. - Morgan

Originally reported on Google Code with ID 2033

If I add an Exif comment field in the Metadata tab and then
process the RAW image into a jpeg, the comment gets passed
on to the jpeg.  I can then use it in the generation of web
galleries etc.

But if I save to 16 bit tiff (which is my usual procedure
for many reasons), the comment is dropped.  It doesn't
matter if the Metadata checkbox in Preferences|Image
Processing is on or not.  It also doesn't make a difference
if I do this from the GUI or from command line.

RawTherapee 4.0.11, Linux x86_64

Reported by nobrowser on 2013-11-07 07:41:14

Beep6581 commented 9 years ago
4.0.11.117 works fine. Tested for JPEG, TIFF 8- and 16-bit using queue and direct save.

for x in amsterdam*; do echo "$x"; exiv2 -pv "$x" 2>/dev/null | grep "Exif "; done
amsterdam direct immediately jpeg.jpg
0x010e Image        ImageDescription            Ascii      22  Exif ImageDescription
0x013b Image        Artist                      Ascii      12  Exif Artist
0x8298 Image        Copyright                   Ascii      15  Exif Copyright
0x9286 Photo        UserComment                 Ascii      22  Exif Exif.UserComment
amsterdam direct immediately tiff16.tif
0x010e Image        ImageDescription            Ascii      22  Exif ImageDescription
0x013b Image        Artist                      Ascii      12  Exif Artist
0x8298 Image        Copyright                   Ascii      15  Exif Copyright
0x9286 Photo        UserComment                 Ascii      22  Exif Exif.UserComment
amsterdam direct immediately tiff8.tif
0x010e Image        ImageDescription            Ascii      22  Exif ImageDescription
0x013b Image        Artist                      Ascii      12  Exif Artist
0x8298 Image        Copyright                   Ascii      15  Exif Copyright
0x9286 Photo        UserComment                 Ascii      22  Exif Exif.UserComment
amsterdam queue jpeg.jpg
0x010e Image        ImageDescription            Ascii      22  Exif ImageDescription
0x013b Image        Artist                      Ascii      12  Exif Artist
0x8298 Image        Copyright                   Ascii      15  Exif Copyright
0x9286 Photo        UserComment                 Ascii      22  Exif Exif.UserComment
amsterdam queue  notforce tiff16.tif
0x010e Image        ImageDescription            Ascii      22  Exif ImageDescription
0x013b Image        Artist                      Ascii      12  Exif Artist
0x8298 Image        Copyright                   Ascii      15  Exif Copyright
0x9286 Photo        UserComment                 Ascii      22  Exif Exif.UserComment
amsterdam queue tiff16.tif
0x010e Image        ImageDescription            Ascii      22  Exif ImageDescription
0x013b Image        Artist                      Ascii      12  Exif Artist
0x8298 Image        Copyright                   Ascii      15  Exif Copyright
0x9286 Photo        UserComment                 Ascii      22  Exif Exif.UserComment
amsterdam queue tiff8.tif
0x010e Image        ImageDescription            Ascii      22  Exif ImageDescription
0x013b Image        Artist                      Ascii      12  Exif Artist
0x8298 Image        Copyright                   Ascii      15  Exif Copyright
0x9286 Photo        UserComment                 Ascii      22  Exif Exif.UserComment

Branch: default
Version: 4.0.11.117
Changeset: cea68f5e27b4
Compiler: cc 4.7.3
Processor: undefined
System: Linux
Bit depth: 64 bits
Gtkmm: V2.24.2
Build type: Release
Build flags:  -Wno-unused-result -march=native -fopenmp -O3 -DNDEBUG
Link flags:   -march=native
OpenMP support: ON
MMAP support: ON

Reported by entertheyoni on 2013-11-11 00:30:35

Beep6581 commented 9 years ago
However!

When I open these saved images in RT again, the UserComment field is empty. True for
JPEG, TIFF 8- and 16-bit.

Why when clicking Add/Edit Tag is user comment the only one with "Exif." prepended?

Reported by entertheyoni on 2013-11-11 00:34:30

Beep6581 commented 9 years ago
Hi entertheyoni :-)  Original poster here.

Have you tried this test while actually showing the *values* of the tags
not just the labels?  The reason I am asking is, I also get an
Exif.Photo.UserComment in the tiff file - but it is a long string of NUL
characters :-(

Reported by nobrowser on 2013-11-27 03:42:55

Beep6581 commented 9 years ago
Here are my results.  bug.tif and bug.jpg are the results of saving (from the GUI) the
same raw file.

 [12+1]Success_At_Tahoe$ exiv2 -pv bug.jpg | grep Comment
0x9286 Photo        UserComment                 Undefined  50  charset="Ascii" An unnamed
body of water below Dicks Peak
 [13+1]Success_At_Tahoe$ exiv2 -pv bug.tif | grep Comment
0x9286 Photo        UserComment                 Undefined 264  (Binary value suppressed)

Reported by nobrowser on 2013-11-27 04:22:50

Beep6581 commented 9 years ago

Reported by entertheyoni on 2013-12-05 02:03:51

Beep6581 commented 9 years ago

Reported by natureh.510 on 2013-12-05 16:52:46

Beep6581 commented 9 years ago
Re #3: nobrowser, what is the language, or more precisely the character set (asian,
latin) that you used to set this comment?

This tag is special because it can handle several encodings: ASCII, JIS or UNICODE.
RT only handles ASCII here, for reading and writing, which may be the reason for the null
characters, as RT converts the string to a "C" string, i.e. 7-bit Latin chars, when
writing the value.
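
As a rough illustration of the layout being discussed: in the Exif specification a UserComment value starts with an 8-byte character code (`ASCII`, `JIS` or `UNICODE`, NUL-padded, or eight NULs for "undefined"), followed by the comment itself. The sketch below splits such a payload. The function name `parse_user_comment`, the `big_endian` parameter and the Latin-1 fallback for JIS/undefined bodies are assumptions made for this example, not RawTherapee code.

```python
# Sketch only: split a raw Exif UserComment payload into (charset, text).
# parse_user_comment and its fallback behaviour are hypothetical.

CHARSET_IDS = {
    b"ASCII\x00\x00\x00": "ascii",
    b"JIS\x00\x00\x00\x00\x00": "jis",
    b"UNICODE\x00": "unicode",
    b"\x00" * 8: "undefined",
}


def parse_user_comment(payload: bytes, big_endian: bool = True):
    """Return (charset_name, text) for a raw UserComment value."""
    if len(payload) < 8:
        return ("undefined", "")
    charset = CHARSET_IDS.get(payload[:8], "undefined")
    body = payload[8:]
    if charset == "ascii":
        text = body.decode("ascii", errors="replace")
    elif charset == "unicode":
        # Drop a dangling odd byte so the 2-byte decoder does not fail.
        body = body[: len(body) - (len(body) % 2)]
        codec = "utf-16-be" if big_endian else "utf-16-le"
        text = body.decode(codec, errors="replace")
    else:
        # Placeholder: JIS and undefined bodies are not really decoded here.
        text = body.decode("latin-1")
    # Trailing NULs and spaces are padding, not content.
    return (charset, text.rstrip("\x00 "))
```

With this sketch, a payload consisting only of NUL bytes (like the 264-byte value reported above) comes back as `("undefined", "")`, which matches the symptom of an "empty" comment.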

I'm looking at handling the UNICODE encoding as well, not sure about JIS.

I'm also converting all the std::string to Glib::ustring in rtexif, is anyone against
this change?

Reported by natureh.510 on 2013-12-06 03:07:10

Beep6581 commented 9 years ago
natureh, I am afraid all the comments in question were pure ASCII (as
can also be seen from my comment on Nov. 26).  So as much as I'd like to
blame Unicode as always I cannot ;)

By the way, is there any way for me to participate here with a non-Gmail
address?  I stopped using it for almost all real email some time ago,
but to log in here I have to go through the agony of Google UI again
:(

Reported by nobrowser on 2013-12-10 07:38:59

Beep6581 commented 9 years ago
Re #8: You can participate with a non-Gmail address, as you can see at my address ;-)

Reported by heckflosse@i-weyrich.de on 2013-12-10 11:16:46

Beep6581 commented 9 years ago
Ingo, I am curious, how were you able to maintain your email address as google login?
I used to have that and at some point they forced-switched me to the google version.

Reported by michaelezra000 on 2013-12-10 11:47:33

Beep6581 commented 9 years ago
They also tried with me. But I always closed that page and then it worked as before.
Look at torger's address. He's also a non-Gmail address.

Reported by heckflosse@i-weyrich.de on 2013-12-10 14:19:40

Beep6581 commented 9 years ago
I should have resisted more too :-/ But now that I have an Android phone, getting a
gmail account is mandatory! >:(

Re #8:

I'll take your comment into account. I haven't had time to work on this yet, and don't
know when I'll be able to. So I'm removing myself as the owner, in case someone else
wants to fix it. Otherwise we could remove this issue from the blocking list of issue
2016: since it's not a critical bug (i.e. a crash), I wouldn't delay the 4.1 release
because of it.

Reported by natureh.510 on 2013-12-10 15:46:13

Beep6581 commented 9 years ago
Hombre, I'll have a look and already made some tests:

1.) when saving uncompressed TIFF from RAW, the usercomment is saved
2.) when saving a compressed TIF from an uncompressed TIF, the usercomment is also
saved
3.) when saving a compressed TIFF from RAW the usercomment is dropped.

Ingo

Reported by heckflosse@i-weyrich.de on 2013-12-10 19:02:39

Beep6581 commented 9 years ago

Reported by heckflosse@i-weyrich.de on 2013-12-10 23:09:51

Beep6581 commented 9 years ago
The symptoms here look similar to issue 2291.

Reported by nadvornik@suse.cz on 2014-04-02 16:27:11

Beep6581 commented 9 years ago
re #15: Did you try whether it works when you save as uncompressed TIFF?

Reported by heckflosse@i-weyrich.de on 2014-04-02 16:33:22

Beep6581 commented 9 years ago
I tested it with these results:

GPSVersionID:
uncompressed tiff -> saved broken, value is all zeros, can be fixed by the workaround
in issue 2291
compressed tiff -> missing completely

UserComment edited in RT GUI:
uncompressed tiff -> saved as all zeros
compressed tiff -> saved as all zeros

UserComment written to CR2 file by exiftool before processing it in RT:
uncompressed tiff -> saved OK
compressed tiff -> saved OK

So these bugs seem to be independent.

JFYI, I checked the saved exif with
exiftool -htmlDump <file>

Reported by nadvornik@suse.cz on 2014-04-05 20:39:22

Beep6581 commented 9 years ago

Reported by entertheyoni on 2014-05-22 12:22:03

Beep6581 commented 9 years ago
Possibly related?

In RT 4.0.12.60, if I 'remove' EXIF fields, they still appear when I save the image
to JPEG.

Reported by barryjgould on 2014-08-02 05:40:36

Beep6581 commented 8 years ago

I tested by setting two XMP, two IPTC and two Exif tags in a raw file:

#!/usr/bin/env bash
exiv2 -M'set Xmp.dc.description XMP_description.' "$1"
exiv2 -M'set Xmp.dc.title XMP_title.' "$1"
exiv2 -M'set Iptc.Application2.Program String IPTC_program.' "$1"
exiv2 -M'set Iptc.Application2.Caption String IPTC_caption.' "$1"
exiv2 -M'set Exif.Photo.UserComment Exif_usercomment.' "$1"
exiv2 -M'set Exif.Image.ImageDescription Exif_imagedescription.' "$1"

Then I saved all possible combinations: JPEG, PNG 8-bit and 16-bit, TIFF 8-bit and 16-bit compressed and uncompressed. One round with TunnelMetaData=true, once more with TunnelMetaData=false. 14 saved files and one source raw.

Then I check each file for the tags:

for f in out_*; do
  echo "$f"; exiv2 --print all "$f" 2>/dev/null | grep -E "XMP_title|XMP_description|IPTC_program|IPTC_caption|Exif_imagedescription|Exif_usercomment" ; echo;
done

Only uncompressed TIFF saved with TunnelMetaData=true retained all of the metadata!

out_false_jpg.jpg
Exif.Image.ImageDescription                  Ascii      23  Exif_imagedescription.
Exif.Photo.UserComment                       Undefined  25  Exif_usercomment.

out_false_png16.png

out_false_png8.png

out_false_tif16_compressed.tif
Exif.Photo.UserComment                       Undefined  25  Exif_usercomment.

out_false_tif16_uncompressed.tif
Exif.Image.ImageDescription                  Ascii      23  Exif_imagedescription.
Exif.Photo.UserComment                       Undefined  25  Exif_usercomment.

out_false_tif8_compressed.tif
Exif.Photo.UserComment                       Undefined  25  Exif_usercomment.

out_false_tif8_uncompressed.tif
Exif.Image.ImageDescription                  Ascii      23  Exif_imagedescription.
Exif.Photo.UserComment                       Undefined  25  Exif_usercomment.

out_true_jpg.jpg
Exif.Image.ImageDescription                  Ascii      23  Exif_imagedescription.
Exif.Photo.UserComment                       Undefined  25  Exif_usercomment.

out_true_png16.png

out_true_png8.png

out_true_tif16_compressed.tif
Exif.Photo.UserComment                       Undefined  25  Exif_usercomment.

out_true_tif16_uncompressed.tif
Exif.Image.ImageDescription                  Ascii      23  Exif_imagedescription.
Exif.Photo.UserComment                       Undefined  25  Exif_usercomment.
Iptc.Application2.Program                    String     13  IPTC_program.
Iptc.Application2.Caption                    String     13  IPTC_caption.
Xmp.dc.description                           LangAlt     1  lang="x-default" XMP_description.
Xmp.dc.title                                 LangAlt     1  lang="x-default" XMP_title.

out_true_tif8_compressed.tif
Exif.Photo.UserComment                       Undefined  25  Exif_usercomment.

out_true_tif8_uncompressed.tif
Exif.Image.ImageDescription                  Ascii      23  Exif_imagedescription.
Exif.Photo.UserComment                       Undefined  25  Exif_usercomment.
Iptc.Application2.Program                    String     13  IPTC_program.
Iptc.Application2.Caption                    String     13  IPTC_caption.
Xmp.dc.description                           LangAlt     1  lang="x-default" XMP_description.
Xmp.dc.title                                 LangAlt     1  lang="x-default" XMP_title.

heckflosse commented 8 years ago

I'm currently working on exif support for compressed TIFF. Though not so easy with libtiff.....

Hombre57 commented 6 years ago

@Beep6581 @heckflosse @agriggio I managed to write Exif.UserComment into compressed TIFF files (I've removed one of the two ways of writing TIFF files btw). However I'm struggling to support this tag in RT, because we return std::string instead of Glib::ustring (see here; we actually support ASCII strings, but I could add UNICODE support as well). So I'll need advice on how to return and use a Glib::ustring here.

Hombre57 commented 6 years ago

... and I'll probably extend this to other string type fields, if possible.

Floessie commented 6 years ago

@Hombre57

However I'm struggling to support this tag in RT, because we return std::string instead of Glib::ustring.

Technically that's no problem if you convert UCS-2 to UTF-8 internally as long as you don't use it for some fancy (e.g. case insensitive) comparison. Though having a Glib::ustring return type could be cleaner.

... and I'll probably extend this to other string type fields, if possible.

A common string get-and-decode function would be nice.

Just my 2¢, Flössie

Hombre57 commented 6 years ago

@Floessie The EXIF standard allows only ASCII chars, except for UserComment. However, I'm not sure that this is respected, and we may find non-ASCII chars in strings. I'd like to look for chars >127 and treat them as UTF-8 at load time (or perhaps only when filling the GUI). But I don't know whether it's reasonable to write them back the same way (i.e. using invalid chars).
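
The load-time idea above (look for bytes >127 and treat them as UTF-8) could be sketched roughly as follows. The helper name `decode_exif_ascii` and the Latin-1 last resort are assumptions for illustration; the thread does not say what RT should do when the bytes are not valid UTF-8 either.

```python
# Sketch only: tolerant decoding of an ASCII-typed Exif value.
# decode_exif_ascii is a hypothetical helper, not an existing RT or exiv2 function.

def decode_exif_ascii(raw: bytes) -> str:
    """Decode an ASCII-typed Exif value, tolerating UTF-8 stored in it."""
    raw = raw.rstrip(b"\x00")  # the stored count includes the NUL terminator
    if all(b < 128 for b in raw):
        return raw.decode("ascii")  # spec-conforming case
    try:
        return raw.decode("utf-8")  # "ASCII"-tagged UTF-8 seen in the wild
    except UnicodeDecodeError:
        # Assumption: fall back to Latin-1 so no byte is silently dropped.
        return raw.decode("latin-1")
```

For example, `decode_exif_ascii("på svenska åöä".encode("utf-8") + b"\x00")` returns the original string, while a pure-ASCII value passes through unchanged.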

Hombre57 commented 6 years ago

The problem exists for IPTC strings too, btw, which can handle several charsets globally. I'd like to handle UTF-8 (already done locally), but then we'd have UTF-8 strings in the metadata, since IPTC is part of the metadata.

Floessie commented 6 years ago

@Hombre57 Searching for this topic reveals this exiv2 issue. So yes, there are "ASCII" tagged UTF-8 strings out in the wild, but as I understand it, it should be either-or. So your idea about detecting UTF-8 when reading is right, but when writing it should be encoded as UCS-2 if there are characters outside the ASCII plane (if I understand it right, of course).
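
A minimal sketch of the "either-or" writing rule suggested here: keep the `ASCII` character code when the text is pure ASCII, and switch to the `UNICODE` character code otherwise. The name `build_user_comment` and the `big_endian` parameter are assumptions for this example. Python's UTF-16 codec is used as a stand-in for UCS-2, which is only equivalent for characters inside the Basic Multilingual Plane.

```python
# Sketch only: build an Exif UserComment payload (8-byte character code + text).
# build_user_comment is a hypothetical helper, not RawTherapee code.

def build_user_comment(text: str, big_endian: bool = True) -> bytes:
    """Use the ASCII character code when possible, UNICODE otherwise."""
    try:
        return b"ASCII\x00\x00\x00" + text.encode("ascii")
    except UnicodeEncodeError:
        pass
    # Assumption: the 2-byte text follows the byte order of the Exif block.
    # Note: characters outside the BMP would become surrogate pairs here,
    # which strict UCS-2 readers may not understand.
    codec = "utf-16-be" if big_endian else "utf-16-le"
    return b"UNICODE\x00" + text.encode(codec)
```

So `build_user_comment("New Exif comment")` keeps the plain ASCII form, and only a comment such as `"åöä"` triggers the 2-byte encoding.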

Beep6581 commented 6 years ago

Perhaps we could use @clanmills's advice here.

clanmills commented 6 years ago

I'm reluctant to recommend anything concerning Unicode text strings in Exif metadata. The Exif Specification says almost nothing about Unicode: http://www.cipa.jp/std/documents/e/DC-008-2012_E.pdf

Although it's possible to encode and write tags in different Unicode formats (UTF-8, UCS-2 etc), you can't assume they will interoperate with other applications.

I feel this is a matter that should be addressed by a standards body.

Beep6581 commented 6 years ago

While testing #3352 I found something I thought I'd mention. Ping @Hombre57 I appended the Swedish characters åöä to every field in the IPTC tab and to an Exif ImageDescription tag. Here are the results after saving:

-PNG-TextualData-EXIF_Profile-IFD0:ImageDescription=Exif ImageDescription åöä
-PNG-TextualData-EXIF_Profile-IFD0-IPTC:ObjectName=title åöä
-PNG-TextualData-EXIF_Profile-IFD0-IPTC:SupplementalCategories=, category åöä
-PNG-TextualData-EXIF_Profile-IFD0-IPTC:Keywords=keywords åöä
-PNG-TextualData-EXIF_Profile-IFD0-IPTC:SpecialInstructions=instructions åöä
-PNG-TextualData-EXIF_Profile-IFD0-IPTC:DateCreated=cråöä
-PNG-TextualData-EXIF_Profile-IFD0-IPTC:By-line=Morgan Hardwood åöä
-PNG-TextualData-EXIF_Profile-IFD0-IPTC:By-lineTitle=creator's job title åöä
-PNG-TextualData-EXIF_Profile-IFD0-IPTC:City=city åöä
-PNG-TextualData-EXIF_Profile-IFD0-IPTC:Province-State=province åöä
-PNG-TextualData-EXIF_Profile-IFD0-IPTC:Country-PrimaryLocationName=country åöä
-PNG-TextualData-EXIF_Profile-IFD0-IPTC:OriginalTransmissionReference=jobidåöä
-PNG-TextualData-EXIF_Profile-IFD0-IPTC:Headline=headline åöä
-PNG-TextualData-EXIF_Profile-IFD0-IPTC:Credit=credit line åöä
-PNG-TextualData-EXIF_Profile-IFD0-IPTC:Source=Morgan åöä
-PNG-TextualData-EXIF_Profile-IFD0-IPTC:CopyrightNotice=Morgan, www.londonlight.org åöä
-PNG-TextualData-EXIF_Profile-IFD0-IPTC:Caption-Abstract=description på svenska åöä
-PNG-TextualData-EXIF_Profile-IFD0-IPTC:Writer-Editor=description writer åöä

DateCreated was the only IPTC tag to show it correctly, I guess it stores the contents differently than the other IPTC tags.

Hombre57 commented 6 years ago

@Beep6581

Although it's possible to encode and write tags in different Unicode formats (UTF-8, UCS-2 etc), you can't assume they will interoperate with other applications.

I agree with this, so should we let the user specify the string format in Preferences ? The spec says that it can use UCS-2 at most, which is quite limiting. Some software may be able to read Byte Order Marks, which means that we could potentially export to UTF-8, UCS-2LE or UCS-2BE. The default behavior would still be to export to UCS-2 w/o BOM, in the metadata's byte order. However, converting a Glib::ustring to a UCS-2 string may not work flawlessly in all cases, I suppose.
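
To make the export options above concrete, here is a small sketch that produces the three candidate byte layouts, with or without a BOM, and shows one way the UCS-2 conversion can fail: characters outside the Basic Multilingual Plane have no 2-byte representation. The function name `encode_comment_text` and the format labels are invented for this example.

```python
import codecs

# Sketch only: the export variants discussed above.
# encode_comment_text and the fmt labels are hypothetical.

def encode_comment_text(text: str, fmt: str = "ucs2-be", with_bom: bool = False) -> bytes:
    """Encode text as UTF-8, UCS-2LE or UCS-2BE, optionally prefixed by a BOM."""
    if fmt == "utf-8":
        bom, data = codecs.BOM_UTF8, text.encode("utf-8")
    elif fmt in ("ucs2-le", "ucs2-be"):
        # UCS-2 is limited to one 16-bit unit per character.
        if any(ord(ch) > 0xFFFF for ch in text):
            raise ValueError("UCS-2 cannot represent characters outside the BMP")
        little = fmt == "ucs2-le"
        bom = codecs.BOM_UTF16_LE if little else codecs.BOM_UTF16_BE
        data = text.encode("utf-16-le" if little else "utf-16-be")
    else:
        raise ValueError(f"unknown format: {fmt}")
    return (bom + data) if with_bom else data
```

An emoji such as "😄" raises `ValueError` in either UCS-2 variant but encodes without trouble as UTF-8, which is the trade-off behind the Preferences question.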

Beep6581 commented 6 years ago

Exif tag encoding is unstandardized wild west territory. As such, RawTherapee should support the most permissive encoding correctly handled by our environment - libexiv2, ExifTool, GIMP, digiKam, darktable, etc.

@Hombre57 is it not better to leave the encoding at ASCII or whatever it is now, and put the effort into #3801 instead?

Hombre57 commented 6 years ago

@Beep6581 I agree that using exiv2 or exiftool would be better in the long term (we still haven't chosen one solution btw), however it won't be for tomorrow (unless agriggio does it 😄 ). But the question of how this loosy exif parameter is handled in the GUI (text input + display in the metadata tab) vs how it's encoded in the file still remains. Does exiv2 provide an interface so that we can get/set an UTF-8 string here ?

I'd like to spend 2-3 days more trying to solve this. If unsuccessful, I'll drop the patch.

Beep6581 commented 6 years ago

@Hombre57 regarding the choice, lensfun uses lens names from exiv2 as does dt, so we'd have fewer compatibility issues if we used it too.

Does exiv2 provide an interface so that we can get/set an UTF-8 string here ?

I don't know.

clanmills commented 6 years ago

Exiv2 implements the Exif Standard: http://www.cipa.jp/std/documents/e/DC-008-2012_E.pdf

All data is defined by a tag (integer), and a homogeneous array of data (p25). There are 8 types of data:

1 = BYTE
2 = ASCII
3 = SHORT
4 = LONG
5 = RATIONAL
7 = UNDEFINED
9 = SLONG
10 = SRATIONAL

All data is a homogeneous ARRAY and therefore has an integer count (a 4-byte field in the IFD entry). You can display the values in an image with the option -pv (print values):

602 rmills@rmillsmbp:~ $ curl -O --silent http://clanmills.com/Stonehenge.jpg
603 rmills@rmillsmbp:~ $ exiv2 -pv --grep date/i ~/Stonehenge.jpg
0x0132 Image        DateTime                    Ascii      20  2015:07:16 20:25:28
0x9003 Photo        DateTimeOriginal            Ascii      20  2015:07:16 15:38:54
0x9004 Photo        DateTimeDigitized           Ascii      20  2015:07:16 15:38:54
0x0003 NikonWt      DateDisplayFormat           Byte        1  0
0x001d GPSInfo      GPSDateStamp                Ascii      11  2015:07:16
604 rmills@rmillsmbp:~/clanmills $ 

As you can see, Image.DateTime is an Ascii tag (0x0132) of 20 bytes as documented on page 83. The count of 20 includes the ascii nul trailer.
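
For readers who want to see the tag/type/count structure in bytes: in the TIFF layout that Exif builds on, each directory entry is 12 bytes (2-byte tag, 2-byte type, 4-byte count, 4-byte value or offset). The sketch below unpacks one such entry. `parse_ifd_entry` is a name made up for this example, and the offset used in the usage note is arbitrary.

```python
import struct

# Sketch only: decode one 12-byte TIFF/Exif IFD entry.
# parse_ifd_entry is a hypothetical helper, not part of Exiv2.

EXIF_TYPES = {
    1: "BYTE", 2: "ASCII", 3: "SHORT", 4: "LONG",
    5: "RATIONAL", 7: "UNDEFINED", 9: "SLONG", 10: "SRATIONAL",
}


def parse_ifd_entry(entry: bytes, big_endian: bool = True) -> dict:
    """Unpack tag (2 bytes), type (2), count (4) and value/offset (4)."""
    if len(entry) != 12:
        raise ValueError("an IFD entry is exactly 12 bytes")
    fmt = (">" if big_endian else "<") + "HHI4s"
    tag, type_id, count, value = struct.unpack(fmt, entry)
    return {
        "tag": tag,
        "type": EXIF_TYPES.get(type_id, "unknown"),
        "count": count,
        # Values longer than 4 bytes live elsewhere; this field is then an offset.
        "value_or_offset": value,
    }
```

An entry built with `struct.pack(">HHI4s", 0x0132, 2, 20, b"\x00\x00\x00\x9e")` decodes to tag 0x0132, type ASCII and count 20, the same shape as the Image.DateTime line in the listing above.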

I believe there are two differences between BYTE and ASCII. BYTE is binary (all bytes 0-255 are valid). BYTE is not required to have an ascii nul trailer.

Although Exiv2 implements the standard, it doesn't enforce it. It's not a policeman. You can change the value to be a different type. And you can store binary bytes (128-255) in an ASCII field.

If I were encoding "unicode", I would use UTF-8 without a BOM marker. However, you cannot assume that other applications will understand this. Why should they understand something that violates the spec? It's probably going to work "most of the time" - especially on *nix platforms and probably harmless on Windows. None the less, this is a violation of the standard.

I don't think the Exiv2 library will prevent you from using a multi-byte encoding with BOM markers in an ASCII tag, however it's unlikely to be understood by other applications. I feel unicode support has to be discussed with the managers of the Exif standard. For any application (or library) to say "We're going to support unicode in ASCII tags using UCS-2 with BOM markers" feels to me like a very incorrect violation of the standard.

Another matter to consider is how to test this. Confirming that it works in darktable (which uses Exiv2) or Photoshop (which does not use Exiv2) is only a minimum test. There are hundreds of common image manipulation applications in everyday use. How can you know that a "unicode" image modified by RT is compatible with somebody's favourite tool?

Hombre57 commented 6 years ago

@clanmills @Floessie @Beep6581 I think there's a misunderstanding here. I was referring to the Exif.UserComment tag (possibly not clearly enough), and understand now that @Floessie was talking about all ASCII type tags. To clear things up, I'm only working on Exif.UserComment. Given that UCS-2 (2 byte encoding) is a subset of UTF-16 and can't represent every character, am I right to assume that some software could break that rule and use UTF-8 here ? Connected question: should we allow users to use UTF-8 for Exif.UserComment if edited in RT (otherwise we'll leave the value as is) ? I think so.

Beep6581 commented 6 years ago

Related, ctrl+f for "charset" here: http://www.exiv2.org/manpage.html

The format of Exif Comment values includes an optional charset specification at the beginning: [charset=Ascii|Jis|Unicode|Undefined ]comment

Undefined is used by default if the value doesn't start with a charset definition.

exiv2 -M"set Exif.Photo.UserComment charset=Ascii New Exif comment" image.jpg
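
As an illustration of the comment format quoted above, the following sketch separates the optional `charset=` prefix from the comment text. It accepts both the unquoted form used on input and the quoted `charset="Ascii"` form that appears in the exiv2 output earlier in this thread. `split_charset_prefix` is a hypothetical helper and only mirrors the documented syntax; it does not call Exiv2.

```python
# Sketch only: parse "[charset=Ascii|Jis|Unicode|Undefined ]comment".
# split_charset_prefix is a hypothetical helper, not an Exiv2 API.

KNOWN_CHARSETS = {"Ascii", "Jis", "Unicode", "Undefined"}


def split_charset_prefix(value: str):
    """Return (charset, comment); charset defaults to 'Undefined'."""
    if value.startswith("charset="):
        head, _, rest = value.partition(" ")
        name = head[len("charset="):].strip('"')
        if name in KNOWN_CHARSETS:
            return name, rest
    # No recognised prefix: the whole value is the comment.
    return "Undefined", value
```

For instance, `split_charset_prefix("charset=Ascii New Exif comment")` gives `("Ascii", "New Exif comment")`, while a value without a prefix is returned unchanged with the `Undefined` charset.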

clanmills commented 6 years ago

Ah yes, there is a misunderstanding. Remarkably, in 10 years of working on Exiv2, I think this is the first time that I've ever got into a discussion about the -n (--encode) feature. It appears to be writing an "UNDEFINED" array of bytes (type 7). I think the first 8 bytes declare the encoding.

For sure, I'd want to study the code concerning this feature in more detail before passing judgment about compatibility with other applications. However the feature has been there for more than 10 years without mention, so its safety has stood the test of time.

$ exiv2 -pR Stonehenge.jpg | grep -i Comment
         428 | 0x9286 UserComment                  | UNDEFINED |       44 |       820 | ASCII...                         ...
$ exiv2 -M"set Exif.Photo.UserComment a" --encode UTF-16 Stonehenge.jpg 
$ exiv2 -pR Stonehenge.jpg | grep -i Comment
         428 | 0x9286 UserComment                  | UNDEFINED |        9 |       820 | ........a
$ exiv2 -M"set Exif.Photo.UserComment b" --encode UTF-8 Stonehenge.jpg 
$ exiv2 -pR Stonehenge.jpg | grep -i Comment
         428 | 0x9286 UserComment                  | UNDEFINED |        9 |       820 | ........b
$ 

Hombre57 commented 6 years ago

So here are the results of my small test:

Case | File's Exif byte order | Platform byte order | Using BOM | UCS-2 string byte order                | Read by IrfanView     | Read by Windows's "Properties"
1    | Big endian             | Little endian       | No        | Little endian (violates the rules)     | Correctly             | Correctly
2    | Big endian             | Little endian       | No        | Big endian (complies with the rules)   | Correctly             | Correctly
3    | Big endian             | Little endian       | Yes       | Big endian (complies with the rules)   | BOM appears in string | Correctly

I'll add BOM detection, and auto-detect the byte order if no BOM is set, depending on the position of the null byte in each char.
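
The detection described above might look roughly like this: honour a BOM if one is present, otherwise count where the NUL bytes fall, since mostly-Latin text has its zero byte first in big-endian UCS-2 and second in little-endian. This is only a sketch of the heuristic, not the patch itself; `detect_ucs2_byte_order` and its return values are assumptions.

```python
import codecs

# Sketch only: guess the byte order of a UCS-2 string.
# detect_ucs2_byte_order is a hypothetical helper, not the actual RT patch.

def detect_ucs2_byte_order(data: bytes) -> str:
    """Return 'big', 'little' or 'unknown'."""
    if data.startswith(codecs.BOM_UTF16_BE):
        return "big"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "little"
    even_nuls = data[0::2].count(0)  # high bytes if the text is big-endian
    odd_nuls = data[1::2].count(0)   # high bytes if the text is little-endian
    if even_nuls > odd_nuls:
        return "big"
    if odd_nuls > even_nuls:
        return "little"
    # No evidence either way (e.g. text without Latin characters).
    return "unknown"
```

When the result is `"unknown"`, a caller would presumably fall back to the byte order of the surrounding metadata, as in the default behaviour discussed earlier.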

The files can be found here.

Hombre57 commented 6 years ago

My patch has introduced a bug : IPTC is not written (correctly) to the output file. I'm trying to fix this.

Hombre57 commented 6 years ago

Correction : it works fine for tiff and jpg if you don't forget to set Metadata copy mode to Apply changes 🙄

Now I have to find out why it doesn't work for png.

Hombre57 commented 6 years ago

@Beep6581 @agriggio Could you test this branch for PNG files please ? Exiftool finds the IPTC and Exif.UserComment fields, but IrfanView, Windows and RT don't. So I don't really understand what's going on.

That aside, this branch should be tested for a broad range of use cases. Can I reopen or put a call for testing in the closed issues related to IPTC and UserComment encoding ?

I'd like to merge in 5 days if no answer.

agriggio commented 6 years ago

@Hombre57 I'll try. my understanding (from quick searches, so might be completely off) is that there's no real standard for storing metadata in PNGs, only accepted conventions. I copied the code from dt for exif, and used the same for iptc, but without really checking that the iptc part was working I admit...

clanmills commented 6 years ago

Gentlemen:

I'm not sure what you are discussing in DT and RT. Here's some background information about PNG metadata. I hope this is useful and doesn't confuse you!

Here's the document that Tuan wrote about this: http://dev.exiv2.org/projects/exiv2/wiki/The_Metadata_in_PNG_files

Tuan was a GSoC student working with me in 2013 and asked the question "Where is the metadata in the file?" to which my answer at that time was "I have no idea!". So, being a smart and hard-working young man, Tuan did some research and documented PNG and JPEG. He also added the incredibly useful command option -pS (print structure) to the exiv2 command-line program. I've built on his work by documenting TIFF and adding option -pR (print structure recursively).

In the last few months, a couple of users have discussed another PNG metadata format and have started a project to support PNG iXTt/zTXt metadata http://dev.exiv2.org/issues/1312

Incidentally, Tuan got married in Ho Chi Minh City in November and Alison and I attended the wedding.
(attached photo: dsc_6944)

agriggio commented 6 years ago

@clanmills Robin, thanks for the link! As I wrote above, I don't know much about metadata, and was just referring to the following comment in darktable/src/imageio/format/png.c:

/* Write EXIF data to PNG file.
 * Code copied from DigiKam's libs/dimg/loaders/pngloader.cpp.
 * The EXIF embedding is defined by ImageMagicK.
 * It is documented in the ExifTool page:
 * http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/PNG.html
 *
 * ..and in turn copied from ufraw. thanks to udi and colleagues
 * for making useful code much more readable and discoverable ;)
 */

Now I've just started reading the wiki page you mentioned above, and it seems consistent with my understanding, in that it says:

There are no standard for Exif, IPTC data. In Exiv2, when Exif, IPTC are added, they are stored in zTXt text chunks and save as ASCII.

Which is what the RT code (taken from Darktable) does -- or that's the intention, at least :-)

Hombre57 commented 6 years ago

For what it's worth, here is the Wikipedia page for PNG, and one of its sources.

So it looks like the PNG 1.5 specs define an eXIf ancillary chunk. I don't know when this revision was created though.

I was worried about the fact that RT was able to read the Exif/IPTC tags, but it's not the case anymore with this branch. Could you confirm @Beep6581 ?

@clanmills You all look very pretty and happy ;)

Hombre57 commented 6 years ago

1.5.0 is the libpng version that extends the V1.2 official spec, sorry for the confusion.

clanmills commented 6 years ago

Everything in PNG is in a "chunk".

There are at least two PNG metadata "conventions". Exif metadata is written as a little TIFF file and embedded in a "chunk". IPTC is written as a IPTC data block and written as a chunk. Exiv2 supports both of those (and XMP and ICC profiles).
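
To show what "everything is in a chunk" means at the byte level, here is a small sketch that walks the chunks of a PNG stream: an 8-byte signature, then for each chunk a 4-byte big-endian length, a 4-byte type, the data and a 4-byte CRC. The helper names `make_chunk` and `iter_chunks` are invented for this example, and the sample stream in the usage note is structurally valid but not a displayable image.

```python
import struct
import zlib

# Sketch only: list the chunks in a PNG byte stream.
# make_chunk and iter_chunks are hypothetical helpers.

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(ctype: bytes, data: bytes) -> bytes:
    """Build one chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(ctype + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)


def iter_chunks(png: bytes):
    """Yield (type, data) for every chunk up to and including IEND."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG stream")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(png):
        length, ctype = struct.unpack(">I4s", png[pos:pos + 8])
        data = png[pos + 8:pos + 8 + length]
        yield ctype.decode("ascii"), data
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
        if ctype == b"IEND":
            break
```

Usage: `sample = PNG_SIGNATURE + make_chunk(b"IHDR", bytes(13)) + make_chunk(b"tEXt", b"Title\x00demo") + make_chunk(b"IEND", b"")`, then `[t for t, _ in iter_chunks(sample)]` lists the chunk types in order. A metadata chunk such as `tEXt` or `zTXt` is simply one more entry in that sequence.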

There's another PNG format (which I believe is used by Image Magick) in which key/value pairs are written as chunks. There are about 10 common keys for "Title", "Copyright" etc: https://github.com/Exiv2/exiv2/issues/147#issuecomment-346161992

I don't know what "UserComment" is! It could be the Exif tag "Exif.Photo.UserComment", or IPTC "Iptc.Application2.Caption", or IPTC "Iptc.Application2.ObjectName", or "Title" in PNG in "11.3.4.2 Keywords and text strings" here: http://www.libpng.org/pub/png/spec/iso/index-object.html

We're dealing here with "russian dolls".
code: Exiv2/digiKam/ufraw/RT/dt
standards: Exif/IPTC/PNG
encoding: Unicode/ascii

So many balls in the air!

There was a lengthy discussion 3+ years ago about how "Title" is to be stored in an image? Conclusion: Different Software Applications use different conventions. http://dev.exiv2.org/issues/985 Adobe applications consistently use Iptc.Application2.ObjectName.

I think the RT aim is to submit something as quickly as possible and move on. I suspect metadata is on the periphery of most image processing applications. The project manager says "just put the title somewhere and let's get on with ...". Perhaps that's best. Perhaps that's the only thing that can be done.

Hombre57 commented 6 years ago

@clanmills So in one word: a mess 🙄 Supporting XMP (through exiv2) is the only way to go anyway, for professionals at least.

The job is done for TIFF and JPG. If everyone is fine with how it's done in PNG now, let's test TIFF and JPG metadata a little bit and merge asap, I have other things to do too.

clanmills commented 6 years ago

Good summary - an application mess! Let's test and move on.