evanoberholster / imagemeta

Image Metadata (Exif and XMP) extraction for JPEG, HEIC, AVIF, TIFF and Camera Raw in golang. Focus is on providing features and improved performance.
MIT License

PNG image inclusion #41

Open maggie-vegaa opened 2 years ago

maggie-vegaa commented 2 years ago

Hi! Thanks for making this package, it is very useful. I was wondering if there is a chance to add PNG support to the library; I have been looking for exactly that.

evanoberholster commented 2 years ago

What specifically are you needing in PNG files?

If you have a recommendation we can certainly consider it.

maggie-vegaa commented 2 years ago

I need the Exif and XMP metadata from the image.

geocine commented 1 year ago

You should be able to download this PNG and see the exif using exiftool.exe. I am mainly concerned about retrieving the Parameters.

[attached image: dinosaur]

ExifTool Version Number         : 12.49
File Name                       : 211167276-7a343a6a-ba01-4d1b-b094-82f6746b5335.png
Directory                       : .
File Size                       : 481 kB
Zone Identifier                 : Exists
File Modification Date/Time     : 2023:01:08 03:26:35+08:00
File Access Date/Time           : 2023:01:08 03:26:44+08:00
File Creation Date/Time         : 2023:01:08 03:26:34+08:00
File Permissions                : -rw-rw-rw-
File Type                       : PNG
File Type Extension             : png
MIME Type                       : image/png
Image Width                     : 512
Image Height                    : 512
Bit Depth                       : 8
Color Type                      : RGB
Compression                     : Deflate/Inflate
Filter                          : Adaptive
Interlace                       : Noninterlaced
Parameters                      : zwx artstyle, a (dinosaur), (stone carving), intricate background texture,  furnished, detailed stone carving, stone art, [[gold outlines]], colorful accents, global illumination, masterpiece, award winning, (square frame).Negative prompt: disfigured, bright, blur, haze, low quality, (((hollow))), shadows, void,  ((out of frame)) , plain, circle, circular.Steps: 100, Sampler: Euler a, CFG scale: 8, Seed: 2693542571, Size: 512x512, Model hash: b28fcc50
Image Size                      : 512x512
Megapixels                      : 0.262

abrander commented 1 year ago

You should be able to download this PNG and see the exif using exiftool.exe. I am mainly concerned about retrieving the Parameters

That PNG file does not contain EXIF data. The Parameters value you see in the exiftool output is stored in a PNG tEXt chunk; exiftool -v shows this.

I know this doesn't solve your problem, but I wanted to point out that the specific PNG sample doesn't contain EXIF data.

geocine commented 1 year ago

Hi @abrander, thanks for pointing that out. At least I now have a direction to look into. I had been trying different approaches and nothing worked because I didn't know the actual root cause. I will be looking into the PNG tEXt chunk.

I found a resource: http://www.libpng.org/pub/png/spec/1.2/PNG-Chunks.html#C.tEXt

abrander commented 1 year ago

@geocine,

I will be looking into the PNG tEXt chunk.

You can have a look at the PNG parser in #46. That parser can be modified to extract text chunks instead of exif chunks simply by replacing the case "exif": with a case "tEXt":, reading length bytes into a string, and splitting on the null separator. I'm on mobile now, but I'll be back at a real computer tomorrow, and I'll be happy to write an example for you if you would like.

geocine commented 1 year ago

@abrander Thanks, I somehow figured it out based on your comment. Is this correct?

package main

import (
    "bytes"
    "encoding/binary"
    "errors"
    "fmt"
    "io"
    "os"
)

func ScanPngHeader(r io.ReadSeeker) (result string, err error) {
    // 5.2 PNG signature
    const signature = "\x89PNG\r\n\x1a\n"

    // 5.3 Chunk layout
    const crcSize = 4

    // 8 is the size of both the signature and the chunk header:
    // chunk length (4 bytes) + chunk type (4 bytes).
    // This is just a coincidence.
    buf := make([]byte, 8)

    // io.ReadFull reports short reads as errors; a bare r.Read may
    // return fewer bytes than requested without one.
    if _, err = io.ReadFull(r, buf); err != nil {
        return "", err
    }

    if string(buf) != signature {
        return "", errors.New("invalid PNG signature")
    }

    for {
        if _, err = io.ReadFull(r, buf); err != nil {
            if err == io.EOF {
                // Clean end of file: no tEXt chunk was found.
                return "", nil
            }
            return "", err
        }

        length := binary.BigEndian.Uint32(buf[0:4])
        chunkType := string(buf[4:8])
        switch chunkType {
        case "tEXt":
            data := make([]byte, length)
            if _, err = io.ReadFull(r, data); err != nil {
                return "", err
            }

            // The chunk data is "keyword\x00text"; only the text is returned.
            separatorIndex := bytes.IndexByte(data, 0)
            if separatorIndex == -1 {
                return "", errors.New("invalid tEXt chunk")
            }
            return string(data[separatorIndex+1:]), nil

        default:
            // Skip the chunk data + CRC. Convert to int64 before adding
            // so that a large length cannot overflow uint32.
            if _, err = r.Seek(int64(length)+crcSize, io.SeekCurrent); err != nil {
                return "", err
            }
        }
    }
}

func main() {
    imgFile, err := os.Open("test.png")
    if err != nil {
        panic(err)
    }
    defer imgFile.Close()

    text, err := ScanPngHeader(imgFile)
    if err != nil {
        panic(err)
    }
    fmt.Println("text: ", text)
}

evanoberholster commented 1 year ago

@geocine and @maggie-vegaa, I am currently working on updates in the develop branch, and this has been included there. Does this solve your issue?

geocine commented 1 year ago

Yes, @evanoberholster. Not directly, but yes where EXIF is concerned.

abrander commented 1 year ago

@geocine,

Thanks, I somehow figured it out based on your comment. Is this correct?

This looks good. A few notes, though:

  1. A PNG file can contain multiple tEXt chunks. You may want to read them all (if you do, remember to skip the 4 CRC bytes that follow each chunk's data).
  2. The tEXt chunk consists of two parts, a keyword and the text, separated by a null byte. You should keep the first part (the keyword) too.

I hope this helps you solve your problem. Unless @evanoberholster wants to add tEXt-reading to this package, I think we should continue the conversation by email, if you have further comments.

geocine commented 1 year ago

Thanks @abrander for the help. I don't want to pollute this issue further.

evanoberholster commented 1 year ago

@abrander and @geocine

I think it would be useful to add PNG metadata parsing to this package. The goal of the package is to deal with as much image metadata as possible.

Feel free to continue the discussion on this issue, as it can be helpful to others seeking a similar solution.