adobe / XMP-Toolkit-SDK

The XMP Toolkit allows you to integrate XMP functionality into your product or solution
BSD 3-Clause "New" or "Revised" License

GetXMP() for eps file #64

Closed Bipyc closed 11 months ago

Bipyc commented 2 years ago

Expected Behaviour

GetXMP() returns the correct XMP packet for test.eps, i.e. the same XMP packet that is returned when packet scanning is used.

Actual Behaviour

The problem was introduced with an XMPLib update. In the previous version of XMPLib (5.1.2, from 2011), we called SXMPFiles::GetXMP() and it returned false (the file does not contain XMP metadata), so we reopened the same file with the kXMPFiles_OpenUsePacketScanning flag and called SXMPFiles::GetXMP() again. In the new version of XMPLib, the first call to SXMPFiles::GetXMP() returns true (the file contains XMP), so we never reopen the file with packet scanning. It looks like the new XMPLib started supporting *.eps files without packet scanning. If packet scanning is used with the new XMPLib, the result is the same as in the previous version and provides more information than the result without packet scanning: the extracted xmp file has 450 lines with packet scanning versus 16 lines without it.

Our test.eps file contains only one XMP packet, located in the middle of the file. XMPLib uses its id, but the rest of the information (for example, xmp:CreatorTool and xmp:CreateDate) is taken from the information embedded in the eps file itself, which sits at the beginning of the file. If you change the XMP packet, a new XMP packet with the same id is added in the middle of the file, before the old XMP packet. I also tested another eps file, mathematica.eps, whose XMP packet is likewise located in the middle of the file; GetXMP() returns the correct XMP packet for mathematica.eps. I also noticed that our test.eps file does not contain the %ADO_ContainsXMP: \

I attached the files test.eps, withoutPacketScanning.xmp, and withPacketScanning.xmp in testFiles.zip.

Platform and Version

Tested on Windows 11 with the latest XMPLib.

Sample Code that illustrates the problem

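    #include <string>
    #include <QString>

    #define TXMP_STRING_TYPE std::string
    #include "XMP.hpp"

    // Fallback path: reopen the file with packet scanning and return the raw packet.
    void
    extractXMPUsingScanner( const QString& sourceFileName, std::string& rawPacket )
    {
        try
        {
            SXMPFiles file;
            // OpenFile() takes a UTF-8 path, so the QString has to be converted first.
            if ( file.OpenFile( sourceFileName.toUtf8().constData(),
                                kXMP_UnknownFile,
                                kXMPFiles_OpenForRead | kXMPFiles_OpenUsePacketScanning ) )
            {
                file.GetXMP( nullptr, &rawPacket );
                file.CloseFile();
            }
        }
        catch ( const XMP_Error& exception )
        {
            // HandleXMPException( exception, sourceFileName );
        }
    }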

        // Caller: try the format-specific handler first, then fall back to packet scanning.
        std::string rawPacket;
        try
        {
            SXMPFiles file;

            XMP_OptionBits openFlags =
                kXMPFiles_OpenForRead | kXMPFiles_OpenStrictly ;

            // OpenFile() takes a UTF-8 path, so the QString has to be converted first.
            file.OpenFile( sourceFileName.toUtf8().constData(),
                           kXMP_UnknownFile,
                           openFlags );
            // With the new XMPLib this returns true for test.eps, so the fallback below never runs.
            bool theRawPacketExtractFailed = !file.GetXMP( nullptr, &rawPacket );

            file.CloseFile();

            if ( theRawPacketExtractFailed )
            {
                // log( "XMP SDK could not extract XMP packet from file '%1'; trying to search XMP using packet scanning" ),
                //         QFileInfo( sourceFileName ).fileName() );
                extractXMPUsingScanner( sourceFileName, rawPacket );
            }
        }
        catch ( const XMP_Error& exception )
        {
            // HandleXMPException( exception, sourceFileName );
        }
maupadhyay commented 11 months ago

The EPS file handler looks for the %ADO_ContainsXMP marker to check whether a main XMP packet is present at all, and how to look for the main XMP. Packet scanning uses a more generic approach: it parses the whole document looking for any raw XMP packet and returns it. Packet scanning is not reliable when a file contains multiple XMP packets, or object XMP without a main XMP. For more information on XMP behaviour for the EPS format, please refer to Section 1.6.2, "PS, EPS (PostScript® and Encapsulated PostScript)", in XMP Specification Part 3.
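To illustrate why multiple packets are a problem for scanning, here is a rough sketch of a generic scan over a file's bytes. It is not the scanner used inside XMPFiles: PacketSpan and findRawPackets are hypothetical names, and the sketch assumes the packet wrappers are stored in an 8-bit encoding such as UTF-8, whereas a real scanner also has to cope with UTF-16 and UTF-32 packets.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical type for illustration: byte offset and length of one raw packet.
struct PacketSpan
{
    std::size_t offset;
    std::size_t length;
};

// Hypothetical helper (not an XMP Toolkit API): returns every span that runs
// from a "<?xpacket begin=" header to the "?>" closing the next
// "<?xpacket end=" trailer. Assumption: 8-bit packet encoding (e.g. UTF-8).
std::vector<PacketSpan> findRawPackets( const std::string& data )
{
    static const std::string kBegin = "<?xpacket begin=";
    static const std::string kEnd = "<?xpacket end=";
    static const std::string kClose = "?>";

    std::vector<PacketSpan> packets;
    std::size_t pos = 0;
    for ( ;; )
    {
        const std::size_t begin = data.find( kBegin, pos );
        if ( begin == std::string::npos ) break;

        const std::size_t end = data.find( kEnd, begin + kBegin.size() );
        if ( end == std::string::npos ) break;  // unterminated packet, stop

        const std::size_t close = data.find( kClose, end + kEnd.size() );
        if ( close == std::string::npos ) break;

        const std::size_t stop = close + kClose.size();
        packets.push_back( PacketSpan{ begin, stop - begin } );
        pos = stop;  // continue scanning after this packet
    }
    return packets;
}
```

If the returned vector holds more than one entry, the scan by itself cannot tell which packet is the main XMP and which is object XMP; the format-specific handler relies on the marker to make that decision.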