Exiv2 / exiv2

Image metadata library and tools
http://www.exiv2.org/
Other
911 stars 280 forks source link

API (possibly) not supplying full preview data #16

Closed atwright147 closed 7 years ago

atwright147 commented 7 years ago

Hey,

@myfreeweb and I are currently trying to debug an issue in his fork of exiv2node.

I have been finding that previews extracted via his fork and the original package are producing preview which are missing a little bit of data from the end.

We are wondering do you do some post-processing when the CLI tool extracts the image that we need to replicate in the package or is there a bug in the API?

This is the issue where we have been discussing this: https://github.com/myfreeweb/exiv2node/issues/2

Here is a copy of the diff I generated:

screen shot 2017-07-03 at 18 26 00
clanmills commented 7 years ago

Gentlemen: Can you attach a test file to this issue, please? I'm very surprised by what's being discussed. I will investigate this today when you provide a suitable test image.

clanmills commented 7 years ago

I've found your test file source.dng

547 rmills@rmillsmbp:~/Downloads $ exiv2 -pp source.dng  
Error: Directory Canon with 25665 entries considered invalid; not read.
Preview 1: image/tiff, 256x171 pixels, 131328 bytes
Preview 2: image/tiff, 1024x683 pixels, 94841 bytes
548 rmills@rmillsmbp:~/Downloads $ exiv2 -ep1,2 --force --verbose source.dng  
File 1/1: source.dng
Error: Directory Canon with 25665 entries considered invalid; not read.
Writing preview 1 (image/tiff, 256x171 pixels, 131468 bytes) to file ./source-preview1.tif
Writing preview 2 (image/tiff, 1024x683 pixels, 95102 bytes) to file ./source-preview2.tif
549 rmills@rmillsmbp:~/Downloads $ 

The image is extracted (as a binary blob) here:

    PreviewImage PreviewManager::getPreviewImage(const PreviewProperties &properties) const
    {
        Loader::AutoPtr loader = Loader::create(properties.id_, image_);
        DataBuf buf;
        if (loader.get()) {
            buf = loader->getData();
        }

        return PreviewImage(properties, buf);
    }

And the preview file is written by this code which is effectively fopen(path,"wb"); fwrite ; fclose.

    long PreviewImage::writeFile(const std::string& path) const
    {
        std::string name = path + extension();
        // Todo: Creating a DataBuf here unnecessarily copies the memory
        DataBuf buf(pData_, size_);
        return Exiv2::writeFile(buf, name);
    }
atwright147 commented 7 years ago

Sorry, I should have added that. When I get home I will attach source.dng (at least for ease of use and for posterity).

atwright147 commented 7 years ago

Do you have any thoughts about what the issue is? This sort of thing is usually me being stupid :)

@myfreeweb is the expert on how the C Bindings work but what you say seems right to me.

clanmills commented 7 years ago

I haven't looked in your code. It appears to me that the library extracts the preview and writes it to file in a simple operation. Without reproducing your code, I have nothing to investigate. I have attached your file. source.zip

valpackett commented 7 years ago

https://github.com/atwright147/exiv2node-test includes the dng file

https://github.com/myfreeweb/exiv2node/blob/41a831f33a158b8f67e8ecc31d5b6f3e190c5045/exiv2node.cc#L342-L361 the C++ code

clanmills commented 7 years ago

Thanks for sharing your code. Here's the code in src/actions.cpp which is used by the exiv2(.exe) command-line program:

    int Extract::writePreviews() const
    {
        if (!Exiv2::fileExists(path_, true)) {
            std::cerr << path_
                      << ": " << _("Failed to open the file\n");
            return -1;
        }
        Exiv2::Image::AutoPtr image = Exiv2::ImageFactory::open(path_);
        assert(image.get() != 0);
        image->readMetadata();

        Exiv2::PreviewManager pvMgr(*image);
        Exiv2::PreviewPropertiesList pvList = pvMgr.getPreviewProperties();

        const Params::PreviewNumbers& numbers = Params::instance().previewNumbers_;
        for (Params::PreviewNumbers::const_iterator n = numbers.begin(); n != numbers.end(); ++n) {
            if (*n == 0) {
                // Write all previews
                for (int num = 0; num < static_cast<int>(pvList.size()); ++num) {
                    writePreviewFile(pvMgr.getPreviewImage(pvList[num]), num + 1);
                }
                break;
            }
            if (*n > static_cast<int>(pvList.size())) {
                std::cerr << path_ << ": " << _("Image does not have preview")
                          << " " << *n << "\n";
                continue;
            }
            writePreviewFile(pvMgr.getPreviewImage(pvList[*n - 1]), *n);
        }
        return 0;
    } // Extract::writePreviews

I'm wondering if you're corrupting buffers or storing exiv2 pointers in your vector. Be careful, these are smart pointers. Be sure to allocate your own memory to store the preview in your environment.

I have to shoot off at the moment. If you're confused, please shout and I'll go through your code later.

valpackett commented 7 years ago

(char*) image.pData() doesn't look smart to me… and it's memcpy'd in the Preview constructor https://github.com/myfreeweb/exiv2node/blob/41a831f33a158b8f67e8ecc31d5b6f3e190c5045/exiv2node.cc#L397-L398

clanmills commented 7 years ago

I looked quickly at your code and saw the new data[size] stuff. You do appear to be copying the image. Let's not get into a smart argument, let's focus on finding what's wrong. Is it possible to sprintf/log as this code executes. For example write the preview from the library immediately when you retrieve it (using a something like actions.cpp/writePreviewFile). Compare what the library delivered with is delivered later in the JavaScript/node environment.

Can you attach the image being delivered by JavaScript/node. You might (or you might not) find the command $ exiv2 -pR image useful for debugging image files. For sure, I'll have a "sniff" at your file with that tool.

valpackett commented 7 years ago

okay, found the issue. pos->size_ is smaller than image.size()

clanmills commented 7 years ago

Ah, ha. I'm so happy. Time for a beer in the garden (25C in Camberley).

valpackett commented 7 years ago

Why is that though? Why does the Exiv2::PreviewPropertiesList::iterator have incorrect data?

atwright147 commented 7 years ago

Wahoo! You guys are both pretty damn awesome.

@clanmills Sorry, I did a terrible job of giving you the resources you needed to help with this. Thank you so, so much for your help, I am really excited to make use of this. :)

clanmills commented 7 years ago

Thanks, Andy. I'm delighted that JavaScript/node can use Exiv2. Thank you for making this possible.

I'll reopen this issue to investigate the puzzle about PreviewPropertiesList::integrator::size_ and image::size(). Andreas wrote the preview code (and indeed, most of Exiv2). Although I have been contributing for 10 years, there are still many areas which I have never explored.

Thank You for using Exiv2.

clanmills commented 7 years ago

There is a bug in Exiv2 concerning PreviewPropertiesList::integrator::size_ and image::size(). I will submit a fix for it this evening as I don't have enough time this morning to totally understand the code and determine the best fix.

The size of the preview image is only correctly determined when the call is made to PreviewManager::getPreviewImage() and the size of the image is determined by image.size(). I think the value of size_ in the results of PreviewManager::getPreviewProperties() is being incorrectly set to the StripByteCounts of the preview subimage.

579 rmills@rmillsmbp:~/Downloads $ exiv2 --force --verbose -ep1,2 source.dng 2>/dev/null
File 1/1: source.dng
Writing preview 1 (image/tiff, 256x171 pixels, 131468 bytes) to file ./source-preview1.tif
Writing preview 2 (image/tiff, 1024x683 pixels, 95102 bytes) to file ./source-preview2.tif
580 rmills@rmillsmbp:~/Downloads $ exiv2 -pR ~/Downloads/source.dng 2>/dev/null  | grep StripByteCounts
     154 | 0x0117 StripByteCounts           |      LONG |        1 |    131328 | 131328
    199138 | 0x0117 StripByteCounts           |      LONG |        1 |     94841 | 94841
581 rmills@rmillsmbp:~/Downloads $ 

I need to approach the fix with caution as this may require a change in the Exiv2 API PreviewManager::getPreviewImage(const PreviewProperties &properties) to remove the const. Another approach is to modify getPreviewProperties() to call getPreviewImage() and set the size_ correctly. Another approach is to make size private:_ in the various preview classes and this would make cause the original code by @myfreeweb to not compile.

This is a subtle bug. In JPEG and PNG files, the preview is stored as a binary "blob" in the image. The size is the size of the blob. The Canon DNG files are effectively TIFF files and use a subfile to store the preview, and determining the image `sizeinvolves reading the subfile. I think the correct fix is to study howsize_is set ingetPreviewProperties()` and fix it there. Such a fix would avoid changes to the API and cause the original code by @myfreeweb to run correctly without modification.