Exiv2 / exiv2

Image metadata library and tools
http://www.exiv2.org/

Error: Directory Samsung2 with 21313 entries considered invalid; not read. #1524

Open jiemdev opened 3 years ago

jiemdev commented 3 years ago

Hello,

When I read Exif data from DNG files from a Samsung GX20 camera, I get the following warning: "Error: Directory Samsung2 with 21313 entries considered invalid; not read." and then very few tags are printed (only standard Exif tags).

The problem is that I cannot read any Exif.Samsung2.xxxxx tags with exiv2 (for example the lens type contained in Exif.Samsung2.LensType, which some software needs in order to apply automatic lens correction). The strange thing is that exiftool is able to print data from the Samsung2 tags.

Is it possible to fix this bug, please? I can help if needed, but I have no idea which file to edit.

Thanks

clanmills commented 3 years ago

@jiemdev Thank you for your report.

Lots of work has been completed in the last few months on the 0.27-maintenance branch concerning DNG support. This will ship on 2021-04-30 as Exiv2 v0.27.4. Can you build and test that branch with your files?

Can you attach a file from your Samsung GX20 and I will investigate. It would help if you identify relevant tags successfully reported by ExifTool. Perhaps you can look around the ExifTool website and find documentation that relates to your tags of interest.

jiemdev commented 3 years ago

Hi, sure, here is a file: SG204732.DNG.zip

I just tried with v0.27.4-RC1 and I get the same error.

For example, one important piece of data is the lens information. It does not appear with exiv2 (even with -p a), whereas exiftool -g is able to detect it:

---- MakerNotes ----
[...]
Lens Type : Tamron AF 17-50mm F2.8 XR Di-II LD (Model A16)
[...]
---- Composite ----
[...]
Lens ID : Tamron AF 17-50mm F2.8 XR Di-II LD (Model A16)

Note one strange thing: although this is a Samsung body, there are Pentax tags (the GX20 body is a cooperation between the two brands; the same DSLR exists as the Pentax K20D):

---- MakerNotes ----
Pentax Version : 4.3.0.0
Pentax Model Type : 0

I hope it helps; I don't know all of this very well.

clanmills commented 3 years ago

Thanks. This will have to wait for attention. I will quickly look at this today, however I doubt that very much can be done immediately. We're focused on Exiv2 v0.27.4 at the moment and hoping to release RC2 today. We're aiming for GM on 2021-04-30 (ahead of the schedule which is 2021-05-22).

clanmills commented 3 years ago

I've looked at this and have made progress.

I've encountered RICOH files which use Pentax MakerNotes. SAMSUNG must be doing something similar.

I am writing a book "Image Metadata and Exiv2 Architecture". https://clanmills.com/exiv2/book/ The book's code is called tvisitor.cpp and is a simplified version of Exiv2. tvisitor reads your file without reporting the mysterious message "Error: Directory Samsung2 with 21313 entries considered invalid; not read."

I've looked at the excellent ExifTool documentation: https://exiftool.org/TagNames/Pentax.html#LensType

The lens tag is: 0x003f | LensRec | - | --> Pentax LensRec Tags

And is successfully read by tvisitor as:

rmills@rmillsm1:~/gnu/exiv2/team/book/build $ tvisitor -pRU ~/Downloads/SG204732.DNG | grep LensRec
      1282 | 0x003f Exif.Pentax.LensRec              |     UBYTE |        3 |           | 7 230 0

And (according to ExifTool's documentation):

'7 230' | = Tamron AF 17-50mm F2.8 XR Di-II LD (Model A16)
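The lookup from the LensRec bytes to a lens name can be sketched as a small table. This is only an illustration: the function name `lensNameFromLensRec` is hypothetical, the table holds just the one entry quoted above from ExifTool's documentation, and it assumes that the first two of the three bytes form the key.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of Exiv2 or tvisitor.
// Maps the first two bytes of a LensRec value (e.g. "7 230 0") to a lens name.
std::string lensNameFromLensRec(const std::vector<std::uint8_t>& rec) {
    // The only entry here is the one quoted in this thread.
    static const std::map<std::pair<int, int>, std::string> table = {
        {{7, 230}, "Tamron AF 17-50mm F2.8 XR Di-II LD (Model A16)"},
    };
    assert(rec.size() >= 2);  // need at least the two key bytes
    const std::pair<int, int> key(rec[0], rec[1]);
    const auto it = table.find(key);
    return it == table.end() ? std::string("Unknown lens") : it->second;
}
```

Calling `lensNameFromLensRec({7, 230, 0})` with the bytes reported by tvisitor returns the Tamron name; any other pair falls through to "Unknown lens".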

Regretfully, I don't have time at the moment to dig into the Exiv2 code involved. Exiv2 v1.00 is scheduled for 2021-12-15. I hope the code will be modified appropriately in the next few months.

If you believe your C++ skills are strong, I am happy to mentor you to solve this after v0.27.4 has shipped.

jiemdev commented 3 years ago

Hello,

I have some knowledge of C++; I can fork the project and try. I will wait for v0.27.4.

clanmills commented 3 years ago

Thanks for getting back to me. This is not an easy change. There is support for Samsung branded Pentax cameras in the Exiv2 code-base. The existing code expects the makernote (called DNGPrivateData in DNG) to begin with the string "AOC\0". In your file, that tag is:

$ exiv2 -g DNGPrivate ~/Downloads/SG204732.DNG
Exif.Image.DNGPrivateData     Byte 102400  83 65 77 83 85 78 71 0 77 77 0 99 0 0 0 1 0 0 0 4 4 3 0 0 0 1 0 3 0 0 0 1 0 0 0 0 0 2 0 3 0 0 0 2 
                                            S  A  M  S  U  N  G\0  M  M \0 *  <long><tag><##><long><long>
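One possible reading of the number 21313 in the error message, offered as an observation on the bytes above rather than a confirmed trace through Exiv2: if the first two bytes of the data are taken as a big-endian directory entry count, 'S' 'A' (83 65) gives exactly 21313. The sketch below does that arithmetic; `kPrivateDataPrefix` and `entryCountAt` are names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// First 12 bytes of Exif.Image.DNGPrivateData, copied from the dump above.
const std::vector<std::uint8_t> kPrivateDataPrefix = {
    83, 65, 77, 83, 85, 78, 71, 0, 77, 77, 0, 99};

// Interpret the two bytes at `offset` as a big-endian ("MM") 16-bit value,
// which is how a TIFF-style directory stores its number of entries.
std::uint16_t entryCountAt(const std::vector<std::uint8_t>& bytes, std::size_t offset) {
    assert(offset + 2 <= bytes.size());
    return static_cast<std::uint16_t>((bytes[offset] << 8) | bytes[offset + 1]);
}
```

`entryCountAt(kPrivateDataPrefix, 0)` is 83 * 256 + 65 = 21313, matching the message. `entryCountAt(kPrivateDataPrefix, 10)`, which skips the ten bytes "SAMSUNG\0MM", is 99, a far more plausible entry count if the directory starts right after that prefix.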

I didn't write Exiv2. So there are gaps in my knowledge of the code. I wrote tvisitor.cpp for several reasons:

  1. To understand how exactly Exiv2 works.
  2. To illustrate the book with a one-file 4000 line program that decodes metadata.
  3. tvisitor handles some metadata that Exiv2 does not support. For example: bmff, BigTiff, PNG text, JPEG>64k.

The tvisitor bmff code was ported to Exiv2 v0.27.4 and is the Primary Feature of v0.27.4.

There are mysterious internal classes involved with the Samsung/Pentax puzzle:

src/makernote_int.hpp:    class PentaxDngMnHeader : public MnHeader {
src/makernote_int.hpp:    }; // class PentaxDngMnHeader
src/makernote_int.hpp:    class PentaxMnHeader : public MnHeader {
src/makernote_int.hpp:    }; // class PentaxMnHeader
src/pentaxmn_int.hpp:    class PentaxMakerNote {
src/pentaxmn_int.hpp:    }; // class PentaxMakerNote

The "AOC\0" is a constant named signature_ in that complex. It's likely that you only need to redefine it as "SAMSUNG\0" and your file will be read.

If that works you'll have to:

  1. Restore the existing code which is there for a good reason.
  2. Implement new classes such as class PentaxDng2MnHeader with the new signature.
  3. Add additional tests to the test suite.

Not a trivial task. That's why I can't do it at the moment.

clanmills commented 3 years ago

My diagnosis is 100% correct. I have modified the "AOC\0" header to be "SAMSUNG\0" with almost complete success:

rmills@rmillsm1:~/gnu/github/exiv2/0.27-maintenance/build $ exiv2 -pa ~/Downloads/SG204732.DNG  | grep -i lens
Exif.Pentax.LensType                         Byte        3  Tamron AF 17-50mm F2.8 XR Di-II LD (Model A16)
Exif.Pentax.LensInfo                         Undefined  45  12 32 0 61 0 3 0 0 0 1 32 0 0 0 0 62 0 1 0 0 0 4 28 28 0 0 0 63 0 1 0 0 0 3 7 230 0 0 0 71 0 6 0 0 0

The changes are modest:

diff --git a/src/makernote_int.cpp b/src/makernote_int.cpp
index 4db41f46..4d2cd543 100644
--- a/src/makernote_int.cpp
+++ b/src/makernote_int.cpp
@@ -618,7 +618,7 @@ namespace Exiv2 {
     } // PentaxDngMnHeader::write

     const byte PentaxMnHeader::signature_[] = {
-        'A', 'O', 'C', 0x00, 'M', 'M'
+        'S', 'A', 'M', 'S', 'U', 'N', 'G' , 0x00, 'M', 'M'
     };

     uint32_t PentaxMnHeader::sizeOfSignature()
@@ -990,7 +990,7 @@ namespace Exiv2 {
             if (size < PentaxDngMnHeader::sizeOfSignature() + 18)
                 return 0;
             return newPentaxDngMn2(tag, group, (tag == 0xc634 ? pentaxDngId:pentaxId));
-        } else if (size > 4 && std::string(reinterpret_cast<const char*>(pData), 4) == std::string("AOC\0", 4)) {
+        } else if (size > 8 && std::string(reinterpret_cast<const char*>(pData), 8) == std::string("SAMSUNG\0", 8)) {
             // Require at least the header and an IFD with 1 entry
             if (size < PentaxMnHeader::sizeOfSignature() + 18)
                 return 0;
@@ -1020,8 +1020,8 @@ namespace Exiv2 {
                                 uint32_t    size,
                                 ByteOrder   /*byteOrder*/)
     {
-        if (   size > 4
-            && std::string(reinterpret_cast<const char*>(pData), 4) == std::string("AOC\0", 4)) {
+        if (   size > 8
+            && std::string(reinterpret_cast<const char*>(pData), 8) == std::string("SAMSUNG\0", 8)) {
             // Samsung branded Pentax camera:
             // Require at least the header and an IFD with 1 entry
             if (size < PentaxMnHeader::sizeOfSignature() + 18) return 0;

What remains to be done:

  1. Changing this code causes the test harness to fail because we've "lost" support for the "AOC" Samsung/Pentax lenses.
  2. We must restore the existing code and add new classes as I explained earlier.
  3. Add new tests for your file.
  4. We could modify the "AOC" code to flexibly deal with either "AOC" or "SAMSUNG". This will be quite easy. However, it might modify the API and we never do that on the 0.27-maintenance branch. We can add new classes, however we shouldn't change any existing class that is visible in the API.
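The "either signature" idea in item 4 can be sketched as a standalone check. This is a minimal illustration, not Exiv2 code: the enum `PentaxStyleHeader` and the function `detectPentaxStyleHeader` are hypothetical names, and the `size > 4` / `size > 8` conditions simply mirror the comparisons in the diff above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical result type; Exiv2 has no such enum.
enum class PentaxStyleHeader { None, Aoc, Samsung };

// Report which of the two known prefixes the makernote data starts with.
PentaxStyleHeader detectPentaxStyleHeader(const unsigned char* data, std::size_t size) {
    static const unsigned char kAoc[] = {'A', 'O', 'C', 0};
    static const unsigned char kSamsung[] = {'S', 'A', 'M', 'S', 'U', 'N', 'G', 0};
    assert(data != nullptr || size == 0);

    // Same shape as the patched test: size > 8 and the first 8 bytes match.
    if (size > sizeof(kSamsung) && std::memcmp(data, kSamsung, sizeof(kSamsung)) == 0)
        return PentaxStyleHeader::Samsung;
    // Same shape as the original test: size > 4 and the first 4 bytes match.
    if (size > sizeof(kAoc) && std::memcmp(data, kAoc, sizeof(kAoc)) == 0)
        return PentaxStyleHeader::Aoc;
    return PentaxStyleHeader::None;
}
```

A caller could then pick the header class (the existing one for `Aoc`, a new one for `Samsung`) from the returned value, which keeps the "AOC" path untouched while adding the new prefix.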

We don't need to preserve the API for v1.00, although it is desirable to do so when possible.

I hope you, or a member of Team Exiv2 will work on this for v1.00.

jiemdev commented 3 years ago

It looks great, thanks. I have a question: why copy/change PentaxMnHeader (AOC\0) rather than PentaxDngMnHeader (PENTAX \0), given that the Samsung file is a DNG?

clanmills commented 3 years ago

I don't understand your question.

jiemdev commented 3 years ago

For Pentax, there is the class PentaxMnHeader and the class PentaxDngMnHeader. I guess it depends on the raw format of the file, because Pentax is able to output DNG or PEF? The Samsung DSLR counterparts, like the GX-20 (a Pentax K20D clone), only output DNG. Or maybe I am totally wrong.

jiemdev commented 3 years ago

Maybe it is a coincidence, but we can see that PentaxDngMnHeader::signature_ contains an additional space at the end: "PENTAX \0". Strange, isn't it? That way it is 8 chars long, which matches the length of the Samsung header "SAMSUNG\0".

clanmills commented 3 years ago

I can't speculate about how/why the camera manufacturer does anything. I am seldom impressed by firmware. I have to deal with the consequences of random and irrational behaviour. For example, Canon have three totally different RAW formats (.CRW, .CR2 and .CR3).

I will however offer my respect for the hardware of the cameras. They are usually very robust devices and at the low end, very good value.

nhelkenn commented 9 months ago

This is an old thread, but I have been advised to mention my issue here. I have RAW files from my Samsung camera that are not being properly processed. The error log for DigiKam is reporting issues with Exiv2. I am hoping there is more updated information available on the solution.

Thank you, Nadine H

nhelkenn commented 6 months ago

It has been over 2 months & I see no response on the issue. I am hoping that is because we were all super busy over the holidays of the season. Perhaps someone will be able to address a resolution to my issue? Thank you very much.