Exiv2 / exiv2

Image metadata library and tools
http://www.exiv2.org/

Unable to read Exif.Photo.ColorSpace from images #129

Closed henryborchers closed 7 years ago

henryborchers commented 7 years ago

I'm trying to build a Python binding for Exiv2 to help me validate a gigantic number of image files for a digital archive. So far, building a C++ to Python wrapper has been pretty easy because of the good documentation. Thanks for that!

However, I've not been able to get a value for ColorSpace to check whether an image is sRGB or Adobe RGB. I looked at http://www.exiv2.org/tags.html and checked the Exif.Photo.ColorSpace metadata field in many different files and file types, but got no results.

Because I need CMake support to build my Python extension for Windows and *nix systems, I've been compiling from the head. I'm wondering if I'm missing a compiler flag (I'm currently not building it with XMP).

Any idea why I'm unable to read the ColorSpace value?

clanmills commented 7 years ago

I can read the colourspace on my favourite test file:

$ exiv2 -pa --grep colorspace/i http://clanmills.com/Stonehenge.jpg
Exif.Nikon3.ColorSpace                       Short       1  sRGB
Exif.Photo.ColorSpace                        Short       1  sRGB

The most obvious reason that you cannot read the colourspace is that it's not stored! Can you attach a test file for me to examine?
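For anyone checking this tag programmatically, it may help to spell out what the stored value means. The sketch below is not Exiv2 code: `describeColorSpace` is a hypothetical helper, and passing a negative number for a missing tag is a convention assumed purely for illustration. The Exif specification defines only two values for ColorSpace (tag 0xA001), 1 for sRGB and 0xFFFF for Uncalibrated, so an Adobe RGB image is normally reported as Uncalibrated rather than by name.

```cpp
#include <string>

// Hypothetical helper: turn the raw Exif ColorSpace value (tag 0xA001)
// into a human-readable label.
// Assumption: the caller passes a negative value when the tag is absent.
std::string describeColorSpace(int value)
{
    if (value < 0) {
        return "not recorded (check the embedded ICC profile instead)";
    }
    switch (value) {
        case 1:
            return "sRGB";
        case 0xFFFF:
            return "Uncalibrated (often Adobe RGB; check the ICC profile)";
        default:
            // Anything else is outside the two values the Exif spec defines.
            return "non-standard value " + std::to_string(value);
    }
}
```

A validator would therefore treat "tag missing" and "Uncalibrated" as prompts to inspect the embedded ICC profile, not as errors in themselves.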

There is a Python wrapper for Exiv2, pyexiv2. It's no longer maintained because gexiv2 makes it obsolete. I've used pyexiv2 a lot in years gone by. I've never looked at gexiv2.

henryborchers commented 7 years ago

Thanks for getting back to me.
I tried your command as well and got the same result, so it does work. However, it doesn't seem to work with my files.

I can send you some sample images, but because we are an archive, the files are quite large: 134 MB for a tif. What's the best way to share them?

Btw, I looked into gexiv2 and pyexiv2 before I started rolling my own, but neither works on Windows. Unfortunately, almost all the workstations here are Windows. Only a handful of servers here are Linux. :(

edit: typos

henryborchers commented 7 years ago

oops didn't mean to close this

clanmills commented 7 years ago

Henry: Dropbox is a good way to share such a large file. It's also worth seeing if the file can be compressed to a smaller size. And there are "one off" file transfer web sites on which you can store the file and they'll send me an email to collect it: robin@clanmills.com

You can build pyexiv2 for Windows. In fact, that's how I started working on Exiv2 about 10 years ago. I haven't built pyexiv2 on Windows since 2011; however, I believe it will build OK with Exiv2 v0.26. http://clanmills.com/articles/gpsexiftags/windows.shtml

There are built dlls (for exiv2 v0.22) here: http://clanmills.com/articles/gpsexiftags/default-2011.shtml And while that's a rather elderly version of Exiv2, it may be sufficient for you to read metadata and to decide if you'd like to make the effort to build pyexiv2 with exiv2 0.26.

henryborchers commented 7 years ago

Oh, now I remember: pyexiv2 doesn't work with Python 3+ and py3exiv2 is Linux only.

Anyway, I sent you an email with a link to a couple of collections using our Box account. Thank you for taking a look.

clanmills commented 7 years ago

Henry: Thanks for the files. The files in 6895567/*.tif do not have the tag Exif.Photo.ColorSpace. However, they do contain an ICC profile in Exif.Image.InterColorProfile.

553 rmills@rmillsmbp:~/Downloads/test images/6895567 $ exiv2 -eC- 00000001.tif > /tmp/iccProfile ; iccDumpProfile /tmp/iccProfile
Profile:          '/tmp/iccProfile'
Profile ID:       Profile ID not calculated.
Size:             560(0x230) bytes

Header
------
Attributes:       Reflective | Glossy
Cmm:              Adobe
Creation Date:    8/11/2000  19:51:59
Creator:          'ADBE' = 41444245
Data Color Space: RgbData
Flags             EmbeddedProfileFalse | UseAnywhere
PCS Color Space:  XYZData
Platform:         Macintosh
Rendering Intent: Perceptual
Type:             DisplayClass
Version:          2.10
Illuminant:       X=0.9642, Y=1.0000, Z=0.8249

Profile Tags
------------
                      Tag    ID      Offset     Size
                     ----  ------    ------     ----
             copyrightTag  'cprt'       252       50
    profileDescriptionTag  'desc'       304      107
       mediaWhitePointTag  'wtpt'       412       20
       mediaBlackPointTag  'bkpt'       432       20
                redTRCTag  'rTRC'       452       14
              greenTRCTag  'gTRC'       468       14
               blueTRCTag  'bTRC'       484       14
           redColorantTag  'rXYZ'       500       20
         greenColorantTag  'gXYZ'       520       20
          blueColorantTag  'bXYZ'       540       20
554 rmills@rmillsmbp:~/Downloads/test images/6895567 $

The API Image::iccProfile() was added in Exiv2 v0.26. http://www.exiv2.org/doc/classExiv2_1_1Image.html
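Once the raw profile bytes are in hand, the header fields that iccDumpProfile prints above can be read directly, because the ICC header is a fixed 128-byte, big-endian structure. The following is a minimal sketch under that assumption; `IccHeaderSummary` and `summarizeIccHeader` are hypothetical names, not part of Exiv2 or of any ICC library.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical summary of a few fixed-offset ICC header fields.
struct IccHeaderSummary {
    bool          valid;           // true if the 'acsp' signature was found
    std::uint32_t size;            // bytes 0-3: profile size
    std::string   cmm;             // bytes 4-7: preferred CMM, e.g. "ADBE"
    std::string   deviceClass;     // bytes 12-15: e.g. "mntr" (display)
    std::string   dataColorSpace;  // bytes 16-19: e.g. "RGB "
    std::string   pcs;             // bytes 20-23: e.g. "XYZ "
};

// Read a 32-bit big-endian unsigned integer.
static std::uint32_t readU32BE(const unsigned char* p)
{
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8)  |
            static_cast<std::uint32_t>(p[3]);
}

// Read a 4-character signature.
static std::string readSig(const unsigned char* p)
{
    return std::string(reinterpret_cast<const char*>(p), 4);
}

IccHeaderSummary summarizeIccHeader(const std::vector<unsigned char>& profile)
{
    IccHeaderSummary s = {false, 0, "", "", "", ""};
    if (profile.size() < 128) return s;        // header is always 128 bytes
    const unsigned char* p = profile.data();
    if (readSig(p + 36) != "acsp") return s;   // mandatory file signature
    s.size           = readU32BE(p);
    s.cmm            = readSig(p + 4);
    s.deviceClass    = readSig(p + 12);
    s.dataColorSpace = readSig(p + 16);
    s.pcs            = readSig(p + 20);
    s.valid          = true;
    return s;
}
```

Note that the header only says the data is RGB; telling sRGB from Adobe RGB would still require reading the profile description ('desc') tag, which this sketch does not attempt.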

Your jp2 files in 2693684/*.jp2 have illegal XMP embedded. However, they do contain an ICC profile.

564 rmills@rmillsmbp:~/Downloads/test images/2693684 $ exiv2 -pR 00000056.jp2 
STRUCTURE OF JPEG2000 FILE: 00000056.jp2
 address |   length | box       | data
       0 |       12 | jP        | 
      12 |       20 | ftyp      | 
      32 |     7328 | jp2h      | 
      40 |       22 |  sub:ihdr | .............
      62 |     7272 |  sub:colr | ......]Lino....mntrRGB XYZ .. | pad: 2 0 0 | iccLength:7261
    7334 |       26 |  sub:res  | ....resd.(...(....
    7360 |    16970 | uuid      | XMP : <?xpacket begin="..." id="W5M0MpCehiHzre
   24330 |        0 | jp2c      | 
565 rmills@rmillsmbp:~/Downloads/test images/2693684 $ exiv2 -eCe 00000056.jp2 > /tmp/iccProfile ; iccDumpProfile /tmp/iccProfile 
Error: XMP Toolkit error 201: XML parsing failure
Warning: Failed to decode XMP metadata.
Unable to open '/tmp/iccProfile'
566 rmills@rmillsmbp:~/Downloads/test images/2693684 $

It's possible (very likely) that Exiv2 doesn't have support for ICC profiles in .jp2 files. I implemented Image::iccProfile() for JPG/PNG/TIFF. I don't remember doing JP2 (probably because I didn't have a sample file). If this is important to you, I'm confident about adding this to Exiv2 when I get home from vacation in early November. If this is urgent, another member of Team Exiv2 might accept the challenge to add this feature.

I think the issue of the XMP parsing error in the JP2 files is a bug in the jp2image.cpp. With the benefit of your test files, I'm confident that can be easily fixed.
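The `exiv2 -pR` listing above reflects the JP2 container layout: each box starts with a 4-byte big-endian length followed by a 4-byte type, a length of 0 means the box runs to the end of the file (as with jp2c in the listing), and a length of 1 means a 64-bit length follows the type. A minimal sketch of a top-level box walker over an in-memory buffer might look like this; `Jp2Box` and `listJp2Boxes` are hypothetical names and this is not the Exiv2 implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record for one top-level box.
struct Jp2Box {
    std::size_t   offset;  // where the box header starts
    std::uint64_t length;  // total box length, header included
    std::string   type;    // four-character type, e.g. "jp2h"
};

// Read a 32-bit big-endian unsigned integer.
static std::uint32_t readU32BE(const unsigned char* p)
{
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8)  |
            static_cast<std::uint32_t>(p[3]);
}

std::vector<Jp2Box> listJp2Boxes(const std::vector<unsigned char>& buf)
{
    std::vector<Jp2Box> boxes;
    std::size_t pos = 0;
    while (buf.size() - pos >= 8) {               // need a full 8-byte header
        std::uint64_t length = readU32BE(&buf[pos]);
        std::string type(reinterpret_cast<const char*>(&buf[pos + 4]), 4);
        std::size_t header = 8;
        if (length == 1) {                        // 64-bit extended length
            if (buf.size() - pos < 16) break;
            length = (static_cast<std::uint64_t>(readU32BE(&buf[pos + 8])) << 32) |
                     readU32BE(&buf[pos + 12]);
            header = 16;
        } else if (length == 0) {                 // box extends to end of data
            length = buf.size() - pos;
        }
        // Stop on malformed input instead of reading out of bounds.
        if (length < header || length > buf.size() - pos) break;
        Jp2Box box = {pos, length, type};
        boxes.push_back(box);
        pos += static_cast<std::size_t>(length);
    }
    return boxes;
}
```

Run over a file shaped like 00000056.jp2, this would report the same addresses as the listing (0, 12, 32, ...); sub-boxes inside jp2h would need a second pass over that box's payload.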

clanmills commented 7 years ago

Henry: Good News. The family in Houston have taken Alison to an Art Gallery for the afternoon and I've had a little time on my own to catch up on some Exiv2 correspondence.

I've discovered two interesting matters: 1) The XMP in your .jp2 files isn't valid XML. I think you'll have to discuss with the folks generating your images. Exiv2 does have code to replace XMP/XML - however we don't have code to fix broken XML. 2) I do have ICC code in jp2image.cpp, however there are a couple of bugs that I have corrected.

a) You can now extract the XMP/XML from the .jp2

571 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2 $ build/bin/exiv2 -pX ~/Downloads/Urbana/1.jp2 
<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="XMP Core 5.1.2">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:xmp="http://ns.adobe.com/xap/1.0/"
    xmlns:aux="http://ns.adobe.com/exif/1.0/aux/"
    xmlns:photoshop="http://ns.adobe.com/photoshop/1.0/"

   <photoshop:ColorMode>3</photoshop:ColorMode>
   <photoshop:ICCProfile>sRGB IEC61966-2.1</photoshop:ICCProfile>    xmlns:Iptc4xmpCore="http://iptc.org/std/Iptc4xmpCore/1.0/xmlns/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
...
...

b) You can extract the ICC profile:

572 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2 $ build/bin/exiv2 -pC ~/Downloads/Urbana/1.jp2 > 1.icc ; iccDumpProfile 1.icc 
Profile:          '1.icc'
Profile ID:       Profile ID not calculated.
Size:             7261(0x1c5d) bytes

Header
------
Attributes:       Reflective | Glossy
Cmm:              Unknown 'Lino' = 4C696E6F
Creation Date:    2/9/1998  06:49:00
Creator:          'HP  ' = 48502020
Data Color Space: RgbData
Flags             EmbeddedProfileFalse | UseAnywhere
...
...

henryborchers commented 7 years ago

Thank you!!! Sadly, got dragged into other work today. I'll check it first thing Monday.

clanmills commented 7 years ago

Henry

Don’t worry about having other stuff to do. I’m pleased that we’re making progress.

You may have to patch the code manually (revision below). Exiv2 moved to GitHub when we released v0.26 in April. I still don’t understand git and GitHub demanded a reviewed PR. I’ll ask one of the team to submit the change for me. I’m pleased to say that the other members of Team Exiv2 are all strong on git (and many many other matters).

So relax. Have a nice weekend. I expect we’ll talk next week when I’ll be in California to visit my old buddies in Silicon Valley. (I retired 3 years ago and now live in England).

Robin

Here’s the modified function that you need in src/jp2image.cpp

void Jp2Image::printStructure(std::ostream& out, PrintStructureOption option,int depth)
{
    if (io_->open() != 0) throw Error(9, io_->path(), strError());

    // Ensure that this is the correct image type
    if (!isJp2Type(*io_, false)) {
        if (io_->error() || io_->eof()) throw Error(14);
        throw Error(15);
    }

    bool bPrint     = option == kpsBasic || option==kpsRecursive;
    bool bRecursive = option == kpsRecursive;
    bool bICC       = option == kpsIccProfile;
    bool bXMP       = option == kpsXMP;
    bool bIPTCErase = option == kpsIptcErase;

    if ( bPrint ) {
        out << "STRUCTURE OF JPEG2000 FILE: " << io_->path() << std::endl;
        out << " address |   length | box       | data" << std::endl ;
    }

    if ( bPrint || bXMP || bICC || bIPTCErase ) {

        long              position  = 0;
        Jp2BoxHeader      box       = {1,1};
        Jp2BoxHeader      subBox    = {1,1};
        Jp2UuidBox        uuid      = {{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}};
        bool              bLF       = false;

        while (box.length && box.type != kJp2BoxTypeClose && io_->read((byte*)&box, sizeof(box)) == sizeof(box))
        {
            position   = io_->tell();
            box.length = getLong((byte*)&box.length, bigEndian);
            box.type   = getLong((byte*)&box.type, bigEndian);

            if ( bPrint ) {
                out << Internal::stringFormat("%8ld | %8ld | ",position-sizeof(box),box.length) << toAscii(box.type) << "      | " ;
                bLF = true ;
                if ( box.type == kJp2BoxTypeClose ) lf(out,bLF);
            }
            if ( box.type == kJp2BoxTypeClose ) break;

            switch(box.type)
            {
                case kJp2BoxTypeJp2Header:
                {
                    lf(out,bLF);

                    while (io_->read((byte*)&subBox, sizeof(subBox)) == sizeof(subBox)
                           && io_->tell() < position + (long) box.length) // don't read beyond the box!
                    {
                        long address = io_->tell() - static_cast<long>(sizeof(subBox)); // long, to match the %8ld format below
                        subBox.length = getLong((byte*)&subBox.length, bigEndian);
                        subBox.type   = getLong((byte*)&subBox.type, bigEndian);

                        DataBuf data(subBox.length-sizeof(box));
                        io_->read(data.pData_,data.size_);
                        if ( bPrint ) {
                            out << Internal::stringFormat("%8ld | %8ld |  sub:",address,subBox.length) << toAscii(subBox.type)
                                <<" | " << Internal::binaryToString(data,30,0);
                            bLF = true;
                        }

                        if(subBox.type == kJp2BoxTypeColorHeader)
                        {
                            long pad = 3 ; // don't know why there are 3 padding bytes
                            if ( bPrint ) {
                                out << " | pad:" ;
                                for ( int i = 0 ; i < 3 ; i++ ) out<< " " << (int) data.pData_[i];
                            }
                            long    iccLength = getULong(data.pData_+pad, bigEndian);
                            if ( bPrint ) {
                                out << " | iccLength:" << iccLength ;
                            }
                            if ( bICC ) out.write((const char*)data.pData_+pad,iccLength);
                        }
                        lf(out,bLF);
                    }
                } break;

                case kJp2BoxTypeUuid:
                {

                    if (io_->read((byte*)&uuid, sizeof(uuid)) == sizeof(uuid))
                    {
                        bool    bIsExif = memcmp(uuid.uuid, kJp2UuidExif, sizeof(uuid))==0;
                        bool    bIsIPTC = memcmp(uuid.uuid, kJp2UuidIptc, sizeof(uuid))==0;
                        bool    bIsXMP  = memcmp(uuid.uuid, kJp2UuidXmp , sizeof(uuid))==0;

                        bool    bUnknown= ! (bIsExif || bIsIPTC || bIsXMP);

                        if ( bPrint ) {
                            if ( bIsExif ) out << "Exif: " ;
                            if ( bIsIPTC ) out << "IPTC: " ;
                            if ( bIsXMP  ) out << "XMP : " ;
                            if ( bUnknown) out << "????: " ;
                        }

                        DataBuf rawData;
                        rawData.alloc(box.length-sizeof(uuid)-sizeof(box));
                        long    bufRead = io_->read(rawData.pData_, rawData.size_);
                        if (io_->error()) throw Error(14);
                        if (bufRead != rawData.size_) throw Error(20);

                        if ( bPrint ){
                            out << Internal::binaryToString(rawData,40,0);
                            out.flush();
                        }
                        lf(out,bLF);

                        if(bIsExif && bRecursive && rawData.size_ > 0)
                        {
                            if ( (rawData.pData_[0]      == rawData.pData_[1])
                                &&   (rawData.pData_[0]=='I' || rawData.pData_[0]=='M' )
                                ) {
                                BasicIo::AutoPtr p = BasicIo::AutoPtr(new MemIo(rawData.pData_,rawData.size_));
                                printTiffStructure(*p,out,option,depth);
                            }
                        }

                        if(bIsIPTC && bRecursive)
                        {
                            IptcData::printStructure(out,rawData.pData_,rawData.size_,depth);
                        }

                        if( bIsXMP && bXMP )
                        {
                            out.write((const char*)rawData.pData_,rawData.size_);
                        }
                    }
                } break;

                default: break;
            }

            // Move to the next box.
            io_->seek(static_cast<long>(position - sizeof(box) + box.length), BasicIo::beg);
            if (io_->error()) throw Error(14);
            if ( bPrint ) lf(out,bLF);
        }
    }
} // Jp2Image::printStructure
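On the "3 padding bytes" noted in the code above: as far as I can tell from the JPEG 2000 Part 1 file format, the colour specification (colr) box payload begins with three one-byte fields, METH, PREC and APPROX, which would explain the "pad: 2 0 0" seen earlier (METH = 2 meaning an ICC profile follows, METH = 1 meaning a 4-byte enumerated colourspace such as 16 for sRGB). The sketch below parses such a payload and, unlike a bare write of iccLength bytes, checks the declared profile length against the data actually available. `ColrInfo` and `parseColrPayload` are hypothetical names, not Exiv2 APIs.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result of parsing a colr box payload (box header excluded).
struct ColrInfo {
    bool                       ok;      // false if the payload is malformed
    int                        method;  // METH: 1 = enumerated, 2 = ICC profile
    std::uint32_t              enumCs;  // set when method == 1 (16 = sRGB)
    std::vector<unsigned char> icc;     // set when method == 2
};

// Read a 32-bit big-endian unsigned integer.
static std::uint32_t readU32BE(const unsigned char* p)
{
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8)  |
            static_cast<std::uint32_t>(p[3]);
}

ColrInfo parseColrPayload(const std::vector<unsigned char>& data)
{
    ColrInfo info = {false, 0, 0, std::vector<unsigned char>()};
    const std::size_t fixed = 3;                 // METH, PREC, APPROX
    if (data.size() < fixed + 4) return info;    // too short to hold anything
    info.method = data[0];
    if (info.method == 1) {
        info.enumCs = readU32BE(&data[fixed]);   // enumerated colourspace code
        info.ok = true;
    } else if (info.method == 2) {
        // An ICC profile stores its own total size in its first four bytes.
        std::uint32_t iccLength = readU32BE(&data[fixed]);
        if (iccLength >= 4 && iccLength <= data.size() - fixed) {
            info.icc.assign(data.begin() + fixed,
                            data.begin() + fixed + iccLength);
            info.ok = true;
        }
    }
    return info;
}
```

For the archive's original question, a payload with METH = 1 and an enumerated value of 16 would answer "is it sRGB?" directly, while METH = 2 means the embedded profile has to be inspected.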

clanmills commented 7 years ago

pyexiv2 was declared obsolete about the time that Python 3 was starting to emerge. py3exiv2 is Linux only. Yup, porting is work.

Now you’re getting the idea about the scale of supporting an open-source library such as Exiv2. We support 18 platforms, test harness, buildserver, 3 build environments, 120k lines of C++, 19 image formats, 15 types of IO (local files, MM, http etc), 11 camera manufacturers, 4 families of metadata (Exif, IPTC, XMP and ICC), 8 architectures {32|64} x {static|shared} x {release|debug}

Guess why we keep out of the wrapper business for scripting languages such as python (2 and 3), perl (5 and 6), COM, .Net, java, nodejs and so on?

D4N commented 7 years ago

@clanmills where do you want your patch merged? On the master branch or 0.26?

clanmills commented 7 years ago

@D4N Thanks. Can you submit the following code into jp2image.cpp please:

    void Jp2Image::printStructure(std::ostream& out, PrintStructureOption option,int depth)
    {
        if (io_->open() != 0) throw Error(9, io_->path(), strError());

        // Ensure that this is the correct image type
        if (!isJp2Type(*io_, false)) {
            if (io_->error() || io_->eof()) throw Error(14);
            throw Error(15);
        }

        bool bPrint     = option == kpsBasic || option==kpsRecursive;
        bool bRecursive = option == kpsRecursive;
        bool bICC       = option == kpsIccProfile;
        bool bXMP       = option == kpsXMP;
        bool bIPTCErase = option == kpsIptcErase;

        if ( bPrint ) {
            out << "STRUCTURE OF JPEG2000 FILE: " << io_->path() << std::endl;
            out << " address |   length | box       | data" << std::endl ;
        }

        if ( bPrint || bXMP || bICC || bIPTCErase ) {

            long              position  = 0;
            Jp2BoxHeader      box       = {1,1};
            Jp2BoxHeader      subBox    = {1,1};
            Jp2UuidBox        uuid      = {{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}};
            bool              bLF       = false;

            while (box.length && box.type != kJp2BoxTypeClose && io_->read((byte*)&box, sizeof(box)) == sizeof(box))
            {
                position   = io_->tell();
                box.length = getLong((byte*)&box.length, bigEndian);
                box.type   = getLong((byte*)&box.type, bigEndian);

                if ( bPrint ) {
                    out << Internal::stringFormat("%8ld | %8ld | ",position-sizeof(box),box.length) << toAscii(box.type) << "      | " ;
                    bLF = true ;
                    if ( box.type == kJp2BoxTypeClose ) lf(out,bLF);
                }
                if ( box.type == kJp2BoxTypeClose ) break;

                switch(box.type)
                {
                    case kJp2BoxTypeJp2Header:
                    {
                        lf(out,bLF);

                        while (io_->read((byte*)&subBox, sizeof(subBox)) == sizeof(subBox)
                               && io_->tell() < position + (long) box.length) // don't read beyond the box!
                        {
                            long address = io_->tell() - static_cast<long>(sizeof(subBox)); // long, to match the %8ld format below
                            subBox.length = getLong((byte*)&subBox.length, bigEndian);
                            subBox.type   = getLong((byte*)&subBox.type, bigEndian);

                            DataBuf data(subBox.length-sizeof(box));
                            io_->read(data.pData_,data.size_);
                            if ( bPrint ) {
                                out << Internal::stringFormat("%8ld | %8ld |  sub:",address,subBox.length) << toAscii(subBox.type)
                                <<" | " << Internal::binaryToString(data,30,0);
                                bLF = true;
                            }

                            if(subBox.type == kJp2BoxTypeColorHeader)
                            {
                                long pad = 3 ; // don't know why there are 3 padding bytes
                                if ( bPrint ) {
                                    out << " | pad:" ;
                                    for ( int i = 0 ; i < 3 ; i++ ) out<< " " << (int) data.pData_[i];
                                }
                                long    iccLength = getULong(data.pData_+pad, bigEndian);
                                if ( bPrint ) {
                                    out << " | iccLength:" << iccLength ;
                                }
                                if ( bICC ) out.write((const char*)data.pData_+pad,iccLength);
                            }
                            lf(out,bLF);
                        }
                    } break;

                    case kJp2BoxTypeUuid:
                    {

                        if (io_->read((byte*)&uuid, sizeof(uuid)) == sizeof(uuid))
                        {
                            bool    bIsExif = memcmp(uuid.uuid, kJp2UuidExif, sizeof(uuid))==0;
                            bool    bIsIPTC = memcmp(uuid.uuid, kJp2UuidIptc, sizeof(uuid))==0;
                            bool    bIsXMP  = memcmp(uuid.uuid, kJp2UuidXmp , sizeof(uuid))==0;

                            bool    bUnknown= ! (bIsExif || bIsIPTC || bIsXMP);

                            if ( bPrint ) {
                                if ( bIsExif ) out << "Exif: " ;
                                if ( bIsIPTC ) out << "IPTC: " ;
                                if ( bIsXMP  ) out << "XMP : " ;
                                if ( bUnknown) out << "????: " ;
                            }

                            DataBuf rawData;
                            rawData.alloc(box.length-sizeof(uuid)-sizeof(box));
                            long    bufRead = io_->read(rawData.pData_, rawData.size_);
                            if (io_->error()) throw Error(14);
                            if (bufRead != rawData.size_) throw Error(20);

                            if ( bPrint ){
                                out << Internal::binaryToString(rawData,40,0);
                                out.flush();
                            }
                            lf(out,bLF);

                            if(bIsExif && bRecursive && rawData.size_ > 0)
                            {
                                if ( (rawData.pData_[0]      == rawData.pData_[1])
                                    &&   (rawData.pData_[0]=='I' || rawData.pData_[0]=='M' )
                                    ) {
                                    BasicIo::AutoPtr p = BasicIo::AutoPtr(new MemIo(rawData.pData_,rawData.size_));
                                    printTiffStructure(*p,out,option,depth);
                                }
                            }

                            if(bIsIPTC && bRecursive)
                            {
                                IptcData::printStructure(out,rawData.pData_,rawData.size_,depth);
                            }

                            if( bIsXMP && bXMP )
                            {
                                out.write((const char*)rawData.pData_,rawData.size_);
                            }
                        }
                    } break;

                    default: break;
                }

                // Move to the next box.
                io_->seek(static_cast<long>(position - sizeof(box) + box.length), BasicIo::beg);
                if (io_->error()) throw Error(14);
                if ( bPrint ) lf(out,bLF);
            }
        }
    } // Jp2Image::printStructure

D4N commented 7 years ago

Could you commit the changes locally and then use git format-patch -1 $commit_hash (where you replace $commit_hash with the hash of the commit adding the requested changes) and email me the resulting patch file? I can then apply the commit and submit a PR. That will preserve the commit information (like author & time).

Also, should this be applied to master or 0.26?

Robin Mills notifications@github.com writes:

@D4 Thanks. Can you submit the following code into jp2image.cpp please:

    void Jp2Image::printStructure(std::ostream& out, PrintStructureOption option,int depth)
    {
        if (io_->open() != 0) throw Error(9, io_->path(), strError());

        // Ensure that this is the correct image type
        if (!isJp2Type(*io_, false)) {
            if (io_->error() || io_->eof()) throw Error(14);
            throw Error(15);
        }

        bool bPrint     = option == kpsBasic || option==kpsRecursive;
        bool bRecursive = option == kpsRecursive;
        bool bICC       = option == kpsIccProfile;
        bool bXMP       = option == kpsXMP;
        bool bIPTCErase = option == kpsIptcErase;

        if ( bPrint ) {
            out << "STRUCTURE OF JPEG2000 FILE: " << io_->path() << std::endl;
            out << " address |   length | box       | data" << std::endl ;
        }

        if ( bPrint || bXMP || bICC || bIPTCErase ) {

            long              position  = 0;
            Jp2BoxHeader      box       = {1,1};
            Jp2BoxHeader      subBox    = {1,1};
            Jp2UuidBox        uuid      = {{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}};
            bool              bLF       = false;

            while (box.length && box.type != kJp2BoxTypeClose && io_->read((byte*)&box, sizeof(box)) == sizeof(box))
            {
                position   = io_->tell();
                box.length = getLong((byte*)&box.length, bigEndian);
                box.type   = getLong((byte*)&box.type, bigEndian);

                if ( bPrint ) {
                    out << Internal::stringFormat("%8ld | %8ld | ",position-sizeof(box),box.length) << toAscii(box.type) << "      | " ;
                    bLF = true ;
                    if ( box.type == kJp2BoxTypeClose ) lf(out,bLF);
                }
                if ( box.type == kJp2BoxTypeClose ) break;

                switch(box.type)
                {
                    case kJp2BoxTypeJp2Header:
                    {
                        lf(out,bLF);

                        while (io_->read((byte*)&subBox, sizeof(subBox)) == sizeof(subBox)
                               && io_->tell() < position + (long) box.length) // don't read beyond the box!
                        {
                            int address = io_->tell() - sizeof(subBox);
                            subBox.length = getLong((byte*)&subBox.length, bigEndian);
                            subBox.type   = getLong((byte*)&subBox.type, bigEndian);

                            DataBuf data(subBox.length-sizeof(box));
                            io_->read(data.pData_,data.size_);
                            if ( bPrint ) {
                                out << Internal::stringFormat("%8ld | %8ld |  sub:",address,subBox.length) << toAscii(subBox.type)
                                <<" | " << Internal::binaryToString(data,30,0);
                                bLF = true;
                            }

                            if(subBox.type == kJp2BoxTypeColorHeader)
                            {
                                long pad = 3 ; // 3 one-byte fields precede the profile: METH, PREC, APPROX
                                if ( bPrint ) {
                                    out << " | pad:" ;
                                    for ( int i = 0 ; i < 3 ; i++ ) out<< " " << (int) data.pData_[i];
                                }
                                long    iccLength = getULong(data.pData_+pad, bigEndian);
                                if ( bPrint ) {
                                    out << " | iccLength:" << iccLength ;
                                }
                                if ( bICC ) out.write((const char*)data.pData_+pad,iccLength);
                            }
                            lf(out,bLF);
                        }
                    } break;

                    case kJp2BoxTypeUuid:
                    {

                        if (io_->read((byte*)&uuid, sizeof(uuid)) == sizeof(uuid))
                        {
                            bool    bIsExif = memcmp(uuid.uuid, kJp2UuidExif, sizeof(uuid))==0;
                            bool    bIsIPTC = memcmp(uuid.uuid, kJp2UuidIptc, sizeof(uuid))==0;
                            bool    bIsXMP  = memcmp(uuid.uuid, kJp2UuidXmp , sizeof(uuid))==0;

                            bool    bUnknown= ! (bIsExif || bIsIPTC || bIsXMP);

                            if ( bPrint ) {
                                if ( bIsExif ) out << "Exif: " ;
                                if ( bIsIPTC ) out << "IPTC: " ;
                                if ( bIsXMP  ) out << "XMP : " ;
                                if ( bUnknown) out << "????: " ;
                            }

                            DataBuf rawData;
                            rawData.alloc(box.length-sizeof(uuid)-sizeof(box));
                            long    bufRead = io_->read(rawData.pData_, rawData.size_);
                            if (io_->error()) throw Error(14);
                            if (bufRead != rawData.size_) throw Error(20);

                            if ( bPrint ){
                                out << Internal::binaryToString(rawData,40,0);
                                out.flush();
                            }
                            lf(out,bLF);

                            if(bIsExif && bRecursive && rawData.size_ > 0)
                            {
                                if ( (rawData.pData_[0]      == rawData.pData_[1])
                                    &&   (rawData.pData_[0]=='I' || rawData.pData_[0]=='M' )
                                    ) {
                                    BasicIo::AutoPtr p = BasicIo::AutoPtr(new MemIo(rawData.pData_,rawData.size_));
                                    printTiffStructure(*p,out,option,depth);
                                }
                            }

                            if(bIsIPTC && bRecursive)
                            {
                                IptcData::printStructure(out,rawData.pData_,rawData.size_,depth);
                            }

                            if( bIsXMP && bXMP )
                            {
                                out.write((const char*)rawData.pData_,rawData.size_);
                            }
                        }
                    } break;

                    default: break;
                }

                // Move to the next box.
                io_->seek(static_cast<long>(position - sizeof(box) + box.length), BasicIo::beg);
                if (io_->error()) throw Error(14);
                if ( bPrint ) lf(out,bLF);
            }
        }
    } // Jp2Image::printStructure
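
For readers following along from Python, the box walk in the routine above can be sketched in a few lines. This is only an illustration, not Exiv2 code: the function names `iter_boxes` and `jp2_colour_spec` are hypothetical, and the layout of the `colr` box (three one-byte fields METH, PREC, APPROX, followed by either a 4-byte enumerated colourspace or an ICC profile) is an assumption based on the JPEG 2000 Part 1 file format as I understand it.

```python
import struct

# Enumerated colourspace codes assumed from JPEG 2000 Part 1 (used when METH == 1).
ENUM_CS = {16: "sRGB", 17: "greyscale", 18: "sYCC"}


def iter_boxes(buf, start=0, end=None):
    """Yield (box_type, payload_start, payload_end) for each box in buf[start:end]."""
    end = len(buf) if end is None else end
    pos = start
    while pos + 8 <= end:
        # Every box starts with a big-endian 32-bit length and a 4-byte type.
        length, box_type = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if length == 1:
            # A length of 1 means a 64-bit extended length follows the type.
            if pos + 16 > end:
                break
            (length,) = struct.unpack_from(">Q", buf, pos + 8)
            header = 16
        elif length == 0:
            # A length of 0 means the box runs to the end of its container.
            length = end - pos
        if length < header or pos + length > end:
            break  # malformed box: stop instead of looping forever
        yield box_type, pos + header, pos + length
        pos += length


def jp2_colour_spec(buf):
    """Describe the first colr box inside the jp2h superbox, or return None."""
    for box_type, lo, hi in iter_boxes(buf):
        if box_type != b"jp2h":
            continue
        for sub_type, sub_lo, sub_hi in iter_boxes(buf, lo, hi):
            if sub_type != b"colr" or sub_hi - sub_lo < 3:
                continue
            meth, _prec, _approx = struct.unpack_from(">3B", buf, sub_lo)
            if meth == 1 and sub_hi - sub_lo >= 7:
                (code,) = struct.unpack_from(">I", buf, sub_lo + 3)
                return ("enumerated", ENUM_CS.get(code, "unknown (%d)" % code))
            if meth == 2:
                # The remainder of the box is the embedded ICC profile.
                return ("icc", bytes(buf[sub_lo + 3:sub_hi]))
            return ("unknown", meth)
    return None
```

Called as `jp2_colour_spec(open(path, "rb").read())`, it returns a tuple whose first element says whether the colourspace is enumerated or carried as an ICC profile. Reading a whole 134 MB file into memory is wasteful; a real tool would seek from box to box the way the C++ does.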


henryborchers commented 7 years ago

@clanmills, I completely understand why you keep your scope to the C++ domain. It's a lot of work to support an open source project like this, especially one of this quality.

However, I need to access embedded and technical metadata from Python, and nothing there does what I need. Exiv2 is the only thing I've found that can handle images produced by archives. That's why I decided to build a crude Python extension that exposes just the functionality I need from Exiv2.

henryborchers commented 7 years ago

@D4N, I didn't see you get an answer to your question. Did you go ahead and add it to any of the branches? Don't mean to be pushy, just excited to give this a whirl.

clanmills commented 7 years ago

Henry: There's a sample application "exiv2json" that reads the metadata in an image and writes a JSON packet. I believe you're only reading metadata, so you might find it very convenient to run it on an image and parse the output with your favourite Python JSON reader. I think you'll be able to use popen3 to read the metadata very quickly into Python without having to write a line of C++.
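
A minimal sketch of that approach, under several assumptions: it uses `subprocess.run` in place of popen3, it assumes the `exiv2json` sample is built and on the PATH, that it accepts a file path as its only argument, and that it prints nothing but JSON in which dotted keys such as Exif.Photo.ColorSpace appear as nested objects. The helper names `lookup` and `read_metadata` are hypothetical.

```python
import json
import subprocess


def lookup(metadata, key):
    """Follow a dotted key such as 'Exif.Photo.ColorSpace' through nested dicts."""
    node = metadata
    for part in key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None  # the tag is not stored in this file
        node = node[part]
    return node


def read_metadata(path, tool="exiv2json"):
    """Run the sample tool on one file and parse the JSON it prints (assumed layout)."""
    result = subprocess.run([tool, path], capture_output=True, text=True, check=True)
    return json.loads(result.stdout)
```

With those assumptions, `lookup(read_metadata("image.jpg"), "Exif.Photo.ColorSpace")` returns the stored value, or `None` when the tag is absent, which is the case described at the top of this thread.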

henryborchers commented 7 years ago

@clanmills, honestly, I really like the excuse to write C++. Not to mention, writing a cross-platform C++ binding for Python is stupid easy with Pybind11, and Exiv2's API documentation and examples make it easier still. With CMake and scikit-build, Python's setuptools can literally build it all itself.

Thanks for the suggestion about "exiv2json". If this were just for my own computer, I would do something like what you suggested. However, I need to deploy my packaged tools to our processing archivists' workstations. A big part of my job is building in-house tools to help make their workflows manageable. While I do a little of the grunt work myself, I need to make these tools accessible so that our library students can use them.

In the past, I bundled the precompiled Exiv2 binaries whenever I deployed our in-house Python scripts to the employee workstations. Due to the university's computer security restrictions, this gets super messy and hard to maintain. It's actually much easier for me to deploy a Python wheel (a Python package format that allows bundled binary data) with "Python style" shared library dependencies than to package an executable.

Also, I've already completed 90% of what I wanted from a binding for this project. I just need to be able to access the colorspace of the image.

Thanks for the suggestion. @clanmills, I've been telling everyone here how impressed I've been by how helpful you are.