lovell / sharp

High performance Node.js image processing, the fastest module to resize JPEG, PNG, WebP, AVIF and TIFF images. Uses the libvips library.
https://sharp.pixelplumbing.com
Apache License 2.0
28.91k stars 1.29k forks

Enhancement: write PNG tEXt chunks #3549

Open vqt123 opened 1 year ago

vqt123 commented 1 year ago

With the MASSIVE generative AI movement, automatic1111 has become a staple tool for image generation. This tool creates images that store a lot of data about each image's configuration in the 'tEXt' chunk of the image's metadata. It does not appear that sharp has a way to read or write this section of the image's data.

vqt123 commented 1 year ago

Here is an image that includes the 'tEXt' data section:

https://imagecache.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/60d49930-c321-4012-e4d9-aee47bc49c00/width=1280

The decoded chunk is as follows:

keyword: parameters
text: highly detailed robot head, clockwork, transparent, filled with colorful liquid, expressive, abstract, trending on artstation, by ryohei hase Negative prompt: (bad-image-v2:0.8) Steps: 20, Sampler: DPM++ 2M, CFG scale: 7, Seed: 756648331, Size: 512x512, Model hash: c35782bad8, Model: general_realisticVisionV13, Denoising strength: 0.4, ENSD: 31337, Hires upscale: 2.5, Hires steps: 8, Hires upscaler: 4x-UltraSharp, Discard penultimate sigma: True

lovell commented 1 year ago

"Artificial Intelligence" hyperbole aside, there are existing metadata standards such as EXIF that might be more suitable for such image generators to use, however tEXt chunk handling is a useful feature for PNG images in its own right.

The tEXt chunks are currently parsed by libspng/libvips into metadata properties with names of the form png-comment-{index}-{keyword}, for example png-comment-0-parameters. The index exists only within libvips to create unique names, as the PNG spec allows multiple tEXt chunks with the same keyword.

I'd suggest something like the following data structure to expose this, both when reading via metadata() and when writing via a new property of png().

// PROPOSED API, NOT YET AVAILABLE
comments: [
  { keyword: 'parameters', text: 'highly detailed...' },
  { keyword: 'Copyright', text: 'legal minefield' },
  ...
]
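
As a rough illustration of how the png-comment-{index}-{keyword} names described above could map onto this structure, here is a minimal TypeScript sketch. The toComments helper and the PngComment type are hypothetical and are not part of sharp or libvips; the input is assumed to be a plain object mapping libvips-style field names to string values.

```typescript
// Hypothetical helper for illustration only; not part of sharp or libvips.
interface PngComment {
  keyword: string;
  text: string;
}

// Matches names such as "png-comment-0-parameters".
const PNG_COMMENT_NAME = /^png-comment-(\d+)-(.+)$/;

function toComments(fields: Record<string, string>): PngComment[] {
  const indexed: Array<{ index: number; comment: PngComment }> = [];
  for (const name of Object.keys(fields)) {
    const match = PNG_COMMENT_NAME.exec(name);
    if (match === null) continue; // not a PNG comment field
    indexed.push({
      index: Number(match[1]),
      comment: { keyword: match[2], text: fields[name] },
    });
  }
  // Restore the original chunk order using the index that libvips assigned.
  indexed.sort((a, b) => a.index - b.index);
  return indexed.map((entry) => entry.comment);
}

const comments = toComments({
  'png-comment-1-Copyright': 'legal minefield',
  'png-comment-0-parameters': 'highly detailed...',
  'some-other-field': 'ignored',
});
console.log(comments);
```

Returning an array rather than an object keyed by keyword preserves duplicate keywords, which the PNG spec allows.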

A PR to expose this in sharp would be welcome, if you're able.

mactive commented 1 year ago

I'm waiting for this too.

rvion commented 5 months ago

We should also add .keepTextChunk() (similar to .keepExif() and .keepMetadata()), so that conversion to a compatible format preserves those chunks, possibly mapping them to similarly named EXIF metadata if the target format has no tEXt chunks.

This would be very useful when converting from PNG to WebP, for example.

Here are some snippets to extract those chunks, in case they spark some motivation for someone to implement this:

// Maybe, ExifData and parseExifData are not shown here; presumably they come from the commenter's own codebase.
const getWebpMetadataFromFile = (webp: Uint8Array): Maybe<ExifData> => {
    // Pass byteOffset/byteLength in case `webp` is a view into a larger ArrayBuffer
    const dataView = new DataView(webp.buffer, webp.byteOffset, webp.byteLength)

    // Check that the RIFF and WEBP signatures are present
    if (webp.length < 12 || dataView.getUint32(0) !== 0x52494646 || dataView.getUint32(8) !== 0x57454250) {
        console.error('Not a valid WEBP file')
        return null
    }

    // Start searching for chunks after the 12-byte RIFF/WEBP header
    let offset = 12
    const txt_chunks: ExifData = {}
    // Loop through the chunks in the WEBP file
    while (offset + 8 <= webp.length) {
        const chunk_type = String.fromCharCode(...webp.slice(offset, offset + 4))
        // RIFF chunk sizes are little-endian
        const chunk_length = dataView.getUint32(offset + 4, true)
        if (chunk_type === 'EXIF') {
            let start = offset + 8
            const end = start + chunk_length
            // Skip the optional 'Exif\0\0' prefix without disturbing the chunk offset
            if (String.fromCharCode(...webp.slice(start, start + 6)) === 'Exif\0\0') {
                start += 6
            }
            const data = parseExifData(webp.slice(start, end))
            for (const key in data) {
                const value = data[key]
                const index = value.indexOf(':')
                txt_chunks[value.slice(0, index)] = value.slice(index + 1)
            }
        }

        // Chunk payloads are padded to an even number of bytes
        offset += 8 + chunk_length + (chunk_length % 2)
    }

    return txt_chunks
}
// Either, TextChunks, resultFailure and resultSuccess are not shown here; presumably they come from the commenter's own codebase.
export const getPngMetadataFromUint8Array = (pngData: Uint8Array): Either<string, TextChunks> => {
    // Pass byteOffset and byteLength: pngData may be a view into a larger
    // ArrayBuffer (e.g. a pooled Node.js Buffer), in which case offset 0 of
    // pngData.buffer is not the start of the PNG
    const dataView = new DataView(pngData.buffer, pngData.byteOffset, pngData.byteLength)

    // Check that the PNG signature is present
    if (pngData.length < 8 || dataView.getUint32(0) !== 0x89504e47) {
        return resultFailure('Not a valid PNG file')
    }

    // Start searching for chunks after the 8-byte PNG signature
    let offset = 8

    const txt_chunks: TextChunks = {}

    // Loop through the chunks in the PNG file
    while (offset + 8 <= pngData.length) {
        // Get the length of the chunk data (big-endian)
        const length = dataView.getUint32(offset)
        // Get the chunk type
        const type = String.fromCharCode(...pngData.slice(offset + 4, offset + 8))
        if (type === 'tEXt') {
            const data_end = Math.min(offset + 8 + length, pngData.length)
            // The keyword runs up to the first NUL byte
            let keyword_end = offset + 8
            while (keyword_end < data_end && pngData[keyword_end] !== 0) {
                keyword_end++
            }
            const keyword = String.fromCharCode(...pngData.slice(offset + 8, keyword_end))
            // The text is the rest of the chunk data, encoded as Latin-1
            const text = Array.from(pngData.slice(keyword_end + 1, data_end))
                .map((s) => String.fromCharCode(s))
                .join('')
            txt_chunks[keyword] = text
        }

        // Skip length (4 bytes), type (4 bytes), data and CRC (4 bytes)
        offset += 12 + length
    }

    return resultSuccess(txt_chunks)
}
lovell commented 4 days ago

v0.33.5 added support for reading PNG tEXt chunks from an input via #4157. This issue remains open to track a possible future enhancement to control setting/updating PNG tEXt chunks in the output.
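
For readers who need to write a tEXt chunk in the meantime, the following is a minimal sketch of doing so by hand on an already-encoded PNG. It does not use any sharp API: crc32, latin1, buildTextChunk and addTextChunk are hypothetical names introduced for illustration, and the sketch assumes the input ends with a standard 12-byte IEND chunk with no trailing data.

```typescript
// Hypothetical helpers for illustration only; none of these names exist in sharp.

// CRC-32 as used by PNG (reflected polynomial 0xEDB88320).
function crc32(bytes: Uint8Array): number {
  let crc = 0xffffffff;
  for (let i = 0; i < bytes.length; i++) {
    crc ^= bytes[i];
    for (let bit = 0; bit < 8; bit++) {
      crc = crc & 1 ? (crc >>> 1) ^ 0xedb88320 : crc >>> 1;
    }
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// Encode a string as Latin-1, the character set used by tEXt chunks.
function latin1(value: string): Uint8Array {
  const out = new Uint8Array(value.length);
  for (let i = 0; i < value.length; i++) {
    const code = value.charCodeAt(i);
    if (code > 0xff) throw new RangeError('tEXt chunks only support Latin-1 text');
    out[i] = code;
  }
  return out;
}

// Build a complete tEXt chunk: length, type, keyword, NUL separator, text, CRC.
function buildTextChunk(keyword: string, text: string): Uint8Array {
  const key = latin1(keyword);
  const body = latin1(text);
  if (key.length < 1 || key.length > 79 || keyword.indexOf('\0') !== -1) {
    throw new RangeError('keyword must be 1-79 bytes and contain no NUL');
  }
  const dataLength = key.length + 1 + body.length;
  const chunk = new Uint8Array(12 + dataLength);
  const view = new DataView(chunk.buffer);
  view.setUint32(0, dataLength); // big-endian length of the data field
  chunk.set(latin1('tEXt'), 4); // chunk type
  chunk.set(key, 8); // keyword; the byte after it stays 0 as the separator
  chunk.set(body, 8 + key.length + 1); // text
  // The CRC covers the type and data fields, but not the length field.
  view.setUint32(8 + dataLength, crc32(chunk.subarray(4, 8 + dataLength)));
  return chunk;
}

// Insert a tEXt chunk immediately before the trailing IEND chunk.
function addTextChunk(png: Uint8Array, keyword: string, text: string): Uint8Array {
  const iendStart = png.length - 12; // IEND: 4-byte length, 4-byte type, 4-byte CRC
  const type = String.fromCharCode(
    png[iendStart + 4], png[iendStart + 5], png[iendStart + 6], png[iendStart + 7],
  );
  if (type !== 'IEND') throw new Error('PNG does not end with an IEND chunk');
  const chunk = buildTextChunk(keyword, text);
  const out = new Uint8Array(png.length + chunk.length);
  out.set(png.subarray(0, iendStart), 0);
  out.set(chunk, iendStart);
  out.set(png.subarray(iendStart), iendStart + chunk.length);
  return out;
}
```

Under these assumptions, addTextChunk could be applied to the buffer returned by sharp's toBuffer() after encoding to PNG, for example addTextChunk(pngBuffer, 'parameters', '...'). Text outside Latin-1 would need an iTXt chunk instead, which this sketch does not cover.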