davidbyttow / govips

A lightning fast image processing and resizing library for Go
MIT License
1.28k stars 199 forks source link

Improved exif extraction #325

Closed AttilaTheFun closed 1 year ago

AttilaTheFun commented 1 year ago

This PR adds several APIs that exist already at the libvips layer but were not exposed by govips which allow us to read and write EXIF fields. For images like JPEGs which support regular EXIF data, the exif-data image field can be read out as a blob or the individual fields can be parsed via their respective field names. For images like TIFFs without direct EXIF support, you can also extract the newer XMP metadata as a blob.

Also while I was testing with TIFF images I found a bug where TIFF images always defaulted to a resolution 1.0 Pixels-Per-Millimeters (about 25PPI). The resolution doesn't really affect the image itself -- just how large it should be printed. The canonical way to change this with libvips is to copy the image changing the xres / yres fields to the desired value. I removed the strange TIFF behavior so that the original resolution (different than pixel dimensions) will be preserved and it can be set to the desired value with the new API.

tonimelisma commented 1 year ago

Thanks so much @AttilaTheFun. There's also an existing bug with removing metadata, it's stopped working on macOS. Is there any way you could try and debug it?

AttilaTheFun commented 1 year ago

Hi @tonimelisma ! Could you provide me some steps to reproduce the bug? I'll see if it's fixable in this PR or if I could maybe fix it in another pr.

tonimelisma commented 1 year ago

There's a failing unit test (only on MACOS) and an issue I wrote

AttilaTheFun commented 1 year ago

Oh this issue @tonimelisma ? https://github.com/davidbyttow/govips/issues/310 Let me see if I can repro

AttilaTheFun commented 1 year ago

@tonimelisma I was able to repro and I found the issue.

The issue is that there is a minimal set of EXIF fields that libjpeg requires which are always added back by libvips even if you try to remove them. These are required by many image viewers to function -- this includes information like the orientation of the image file: https://github.com/imgproxy/imgproxy/issues/668#issuecomment-883368672

I was able to address this issue by verifying instead that removing the exif data reduced the number of exif fields without eliminating them completely:

// NOTE: The JPEG spec requires some minimal exif data including exif-ifd0-Orientation.
// libvips always adds these fields back but they should not be a privacy concern.
// HEIC images require the same fields and behave the same way in libvips.
func TestImage_RemoveMetadata_Removes_Exif_HEIC(t *testing.T) {
    var initialEXIFCount int
    goldenTest(t, resources+"heic-24bit-exif.heic",
        func(img *ImageRef) error {
            assert.True(t, img.HasExif())
            exifData := img.GetExif()
            initialEXIFCount = len(exifData)
            assert.Greater(t, initialEXIFCount, 0)
            return img.RemoveMetadata()
        },
        func(img *ImageRef) {
            assert.True(t, img.HasExif())
            exifData := img.GetExif()
            finalEXIFCount := len(exifData)
            assert.Less(t, finalEXIFCount, initialEXIFCount)
        }, nil)
}

The number of exif fields here is reduced from 54 to 11.

tonimelisma commented 1 year ago

Hey @AttilaTheFun I like it a lot, but the golden test for remove metadata on ubuntu stopped working now?

AttilaTheFun commented 1 year ago

Hmm @tonimelisma that's weird. Let me try to isolate the test change into its own branch to see if it's something else on my PR that's causing issues.

tonimelisma commented 1 year ago

You can see the C'i logs here: https://github.com/davidbyttow/govips/actions/runs/3401812287/jobs/5658053247

Search for FAIL:

AttilaTheFun commented 1 year ago

@tonimelisma The isolated test fix is up here: https://github.com/davidbyttow/govips/pull/330 Let's see if this still fails on Ubuntu.

AttilaTheFun commented 1 year ago

Hey @tonimelisma ! I figured out the Ubuntu issue and pushed a commit which should hopefully address the issue. More info in my comment here: https://github.com/davidbyttow/govips/pull/330#issuecomment-1305100769

AttilaTheFun commented 1 year ago

So it looks like the RemoveMetadata error was fixed but it's failing on a couple of the TIFF tests. I'll see if I can repro the TIFF issue locally. I think we might just need to accept the new golden images because this PR fixed the weird behavior of setting TIFFs to a resolution of 25ppmm so now the original resolution will be preserved (or the user can change it if they want).

AttilaTheFun commented 1 year ago

@tonimelisma the only failure now is this TIFF test and I'm pretty certain it's because the source image had a resolution of 72 or 300 PPI and the current golden image was converted to 25 PPI because of the strange behavior I fixed and now the new golden image has the same PPI as the source image:

image_golden_test.go:1073: 
[1969](https://github.com/davidbyttow/govips/actions/runs/3407856903/jobs/5667895271#step:8:1970)
            Error Trace:    image_golden_test.go:1073
[1970](https://github.com/davidbyttow/govips/actions/runs/3407856903/jobs/5667895271#step:8:1971)
                                        image_golden_test.go:968
[1971](https://github.com/davidbyttow/govips/actions/runs/3407856903/jobs/5667895271#step:8:1972)
                                        image_golden_test.go:783
[1972](https://github.com/davidbyttow/govips/actions/runs/3407856903/jobs/5667895271#step:8:1973)
            Error:          Should be true
[1973](https://github.com/davidbyttow/govips/actions/runs/3407856903/jobs/5667895271#step:8:1974)
            Test:           TestImage_Tiff
[1974](https://github.com/davidbyttow/govips/actions/runs/3407856903/jobs/5667895271#step:8:1975)
            Messages:       Actual image (size=3068589) didn't match expected golden file=../resources/tif.Tiff-linux-focal.golden.tiff (size=3068589)
[1975](https://github.com/davidbyttow/govips/actions/runs/3407856903/jobs/5667895271#step:8:1976)
--- FAIL: TestImage_Tiff (0.29s)

Is there a way we could download the image from the Ubuntu builder? Or could we just land this and accept the new golden image?

tonimelisma commented 1 year ago

We can accept the new golden image. Can you remove the ubuntu golden and add that as a commit to the PR? The test will pass without the comparison ubuntu image (leave the original reference as is)

AttilaTheFun commented 1 year ago

Oh sure I'll do that shortly!

coveralls commented 1 year ago

Coverage Status

Coverage decreased (-1.5%) to 75.386% when pulling 28b2424a9904adb468accac39c083acc77ac4968 on AttilaTheFun:lshire-improved-exif-extraction into e407ec7186b56d3f9f53457b14f2ad99253cda1d on davidbyttow:master.