Open vqt123 opened 1 year ago
Here is an image that includes the 'tEXt' data section.
The decoded chunk is as follows:
keyword: parameters text: highly detailed robot head, clockwork, transparent, filled with colorful liquid, expressive, abstract, trending on artstation, by ryohei hase Negative prompt: (bad-image-v2:0.8) Steps: 20, Sampler: DPM++ 2M, CFG scale: 7, Seed: 756648331, Size: 512x512, Model hash: c35782bad8, Model: general_realisticVisionV13, Denoising strength: 0.4, ENSD: 31337, Hires upscale: 2.5, Hires steps: 8, Hires upscaler: 4x-UltraSharp, Discard penultimate sigma: True
"Artificial Intelligence" hyperbole aside, there are existing metadata standards such as EXIF that might be more suitable for such image generators to use; however, tEXt chunk handling is a useful feature for PNG images in its own right.
The tEXt chunks are currently parsed by libspng/libvips into metadata properties with names of the form png-comment-{index}-{keyword}, for example png-comment-0-parameters. The index exists only within libvips to create unique names, as the PNG spec allows multiple tEXt chunks with the same keyword.
I'd suggest something like the following data structure to expose this, both when reading via metadata() or writing via a new property of png().
// PROPOSED API, NOT YET AVAILABLE
comments: [
  { keyword: 'parameters', text: 'highly detailed...' },
  { keyword: 'Copyright', text: 'legal minefield' },
  ...
]
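To make the relationship between the libvips property names and the proposed structure concrete, here is a minimal sketch. It assumes the properties are available as a plain string-to-string record; `PngComment` and `toComments` are hypothetical names used only for illustration and are not part of sharp or libvips.

```typescript
// Shape of one entry in the proposed (not yet available) `comments` array.
interface PngComment {
  keyword: string;
  text: string;
}

// Hypothetical helper: turns libvips-style `png-comment-{index}-{keyword}`
// properties into the proposed array, ordered by index.
function toComments(properties: Record<string, string>): PngComment[] {
  const pattern = /^png-comment-(\d+)-(.+)$/;
  const indexed: Array<{ index: number; comment: PngComment }> = [];
  for (const name of Object.keys(properties)) {
    const match = pattern.exec(name);
    if (match === null) continue; // not a text chunk property
    indexed.push({
      index: Number(match[1]),
      comment: { keyword: match[2], text: properties[name] },
    });
  }
  // The index only exists to keep names unique, so it is dropped after sorting.
  return indexed
    .sort((a, b) => a.index - b.index)
    .map((entry) => entry.comment);
}

const comments = toComments({
  'png-comment-1-Copyright': 'legal minefield',
  'png-comment-0-parameters': 'highly detailed...',
  'some-other-property': 'ignored',
});
console.log(comments);
```

Using an array rather than an object keyed by keyword keeps duplicate keywords intact, which matters because the PNG spec allows them.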
A PR to expose this in sharp would be welcome, if you're able.
I'm looking forward to this as well.
We should also add .keepTextChunk() (similar to .keepExif() and .keepMetadata()), so that conversion to a compatible format preserves those chunks, possibly mapping them to similarly named EXIF metadata if the target format has no tEXt chunks. This would be very useful when converting from png to webp, for example.
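A rough sketch of what such a mapping might look like, as a plain function with no sharp dependency. The keyword-to-tag table is an assumption made up for illustration (the left-hand names are keywords predefined by the PNG spec, the right-hand names are common EXIF tags), and `commentsToExif` is a hypothetical helper. The `{ IFD0: { ... } }` result shape is intended to resemble the object sharp's withExif() accepts, but treat that as an assumption too.

```typescript
interface PngComment {
  keyword: string;
  text: string;
}

// ASSUMPTION: an illustrative mapping only, not something sharp or libvips defines.
const KEYWORD_TO_EXIF_TAG = new Map<string, string>([
  ['Author', 'Artist'],
  ['Copyright', 'Copyright'],
  ['Description', 'ImageDescription'],
  ['Software', 'Software'],
]);

// Hypothetical helper: splits text chunks into those with an EXIF equivalent
// and those that would need some other home in the target format.
function commentsToExif(comments: PngComment[]): {
  exif: { IFD0: Record<string, string> };
  unmapped: PngComment[];
} {
  const ifd0: Record<string, string> = {};
  const unmapped: PngComment[] = [];
  for (const comment of comments) {
    const tag = KEYWORD_TO_EXIF_TAG.get(comment.keyword);
    if (tag === undefined) {
      unmapped.push(comment);
    } else {
      ifd0[tag] = comment.text;
    }
  }
  return { exif: { IFD0: ifd0 }, unmapped };
}

const result = commentsToExif([
  { keyword: 'Copyright', text: 'legal minefield' },
  { keyword: 'parameters', text: 'highly detailed...' },
]);
console.log(result.exif, result.unmapped);
```

Generator-specific keywords such as parameters have no obvious EXIF counterpart, so the sketch returns them separately instead of guessing a tag.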
Here are some snippets to extract those, in case they spark some motivation for someone to implement this:
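// Maybe, ExifData and parseExifData are not defined in this snippet
const getWebpMetadataFromFile = (webp: Uint8Array): Maybe<ExifData> => {
    // Pass byteOffset/byteLength so a view into a larger ArrayBuffer is read correctly
    const dataView = new DataView(webp.buffer, webp.byteOffset, webp.byteLength)
    // Check that the RIFF and WEBP signatures are present
    if (webp.length < 12 || dataView.getUint32(0) !== 0x52494646 || dataView.getUint32(8) !== 0x57454250) {
        console.error('Not a valid WEBP file')
        return null
    }
    // Start searching for chunks after the 12-byte RIFF/WEBP header
    let offset = 12
    const txt_chunks: ExifData = {}
    // Loop through the chunks in the WEBP file
    while (offset + 8 <= webp.length) {
        const chunk_type = String.fromCharCode(...webp.slice(offset, offset + 4))
        // Chunk sizes are little-endian
        const chunk_length = dataView.getUint32(offset + 4, true)
        if (chunk_type === 'EXIF') {
            let data_start = offset + 8
            const data_end = data_start + chunk_length
            // Skip the optional 'Exif\0\0' prefix without moving the chunk offset
            if (String.fromCharCode(...webp.slice(data_start, data_start + 6)) === 'Exif\0\0') {
                data_start += 6
            }
            const data = parseExifData(webp.slice(data_start, data_end))
            for (const key in data) {
                // Values are expected in the form 'keyword:text'
                const value = data[key]
                const index = value.indexOf(':')
                txt_chunks[value.slice(0, index)] = value.slice(index + 1)
            }
        }
        // RIFF chunk payloads are padded to an even number of bytes
        offset += 8 + chunk_length + (chunk_length % 2)
    }
    return txt_chunks
}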
// Either, TextChunks, resultFailure and resultSuccess are not defined in this snippet
export const getPngMetadataFromUint8Array = (pngData: Uint8Array): Either<string, TextChunks> => {
    const dataView = new DataView(
        pngData.buffer,
        pngData.byteOffset, // needed when pngData is a view into a larger ArrayBuffer,
        pngData.byteLength, // otherwise the DataView starts at the wrong byte
    )
    // Check the first four bytes of the 8-byte PNG signature
    if (pngData.length < 8 || dataView.getUint32(0) !== 0x89504e47) {
        return resultFailure('Not a valid PNG file')
    }
    // Start searching for chunks after the PNG signature
    let offset = 8
    const txt_chunks: TextChunks = {}
    // Each chunk is a 4-byte length, a 4-byte type, the data and a 4-byte CRC
    while (offset + 8 <= pngData.length) {
        // Get the length of the chunk data (big-endian)
        const length = dataView.getUint32(offset)
        // Get the chunk type
        const type = String.fromCharCode(...pngData.slice(offset + 4, offset + 8))
        if (type === 'tEXt') {
            const data_end = Math.min(offset + 8 + length, pngData.length)
            // The keyword runs up to the null separator; stop at the chunk end if it is missing
            let keyword_end = offset + 8
            while (keyword_end < data_end && pngData[keyword_end] !== 0) {
                keyword_end++
            }
            const keyword = String.fromCharCode(...pngData.slice(offset + 8, keyword_end))
            // The text is Latin-1, so each byte maps directly to one character
            const text = Array.from(pngData.slice(keyword_end + 1, data_end))
                .map((s) => String.fromCharCode(s))
                .join('')
            // Note: a later chunk with the same keyword overwrites an earlier one
            txt_chunks[keyword] = text
        }
        offset += 12 + length
    }
    return resultSuccess(txt_chunks)
}
v0.33.5 added support for reading PNG tEXt chunks from an input via #4157. This issue remains open to track the future possible enhancement to control setting/updating PNG tEXt chunks in the output.
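For anyone curious what the writing side involves at the byte level, here is a minimal sketch of serialising a single tEXt chunk by hand. This is not how sharp or libvips does (or would do) it; `crc32`, `latin1Bytes` and `buildTextChunk` are hypothetical helpers, and the layout follows the PNG chunk format: 4-byte big-endian data length, 4-byte type, keyword, a null separator, the text, then a CRC-32 over type and data.

```typescript
// CRC-32 as used by PNG (reflected polynomial 0xEDB88320), bit by bit for brevity.
function crc32(bytes: Uint8Array): number {
  let crc = 0xffffffff;
  for (let i = 0; i < bytes.length; i++) {
    crc ^= bytes[i];
    for (let bit = 0; bit < 8; bit++) {
      crc = (crc & 1) !== 0 ? (crc >>> 1) ^ 0xedb88320 : crc >>> 1;
    }
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// tEXt chunks hold Latin-1 only, so each character must fit in one byte.
function latin1Bytes(value: string): Uint8Array {
  const out = new Uint8Array(value.length);
  for (let i = 0; i < value.length; i++) {
    const code = value.charCodeAt(i);
    if (code > 0xff) throw new RangeError('tEXt chunks only hold Latin-1 characters');
    out[i] = code;
  }
  return out;
}

// Hypothetical helper: builds one complete tEXt chunk (length, type, data, CRC).
function buildTextChunk(keyword: string, text: string): Uint8Array {
  if (keyword.length < 1 || keyword.length > 79) {
    throw new RangeError('keyword must be 1 to 79 characters');
  }
  if (keyword.indexOf('\0') !== -1 || text.indexOf('\0') !== -1) {
    throw new RangeError('keyword and text must not contain null characters');
  }
  const keywordBytes = latin1Bytes(keyword);
  const textBytes = latin1Bytes(text);
  const dataLength = keywordBytes.length + 1 + textBytes.length;

  // The CRC covers the chunk type and the chunk data, but not the length field.
  const typeAndData = new Uint8Array(4 + dataLength);
  typeAndData.set(latin1Bytes('tEXt'), 0);
  typeAndData.set(keywordBytes, 4);
  // The byte after the keyword stays 0 and acts as the null separator.
  typeAndData.set(textBytes, 4 + keywordBytes.length + 1);

  const chunk = new Uint8Array(4 + typeAndData.length + 4);
  const view = new DataView(chunk.buffer);
  view.setUint32(0, dataLength); // big-endian by default
  chunk.set(typeAndData, 4);
  view.setUint32(4 + typeAndData.length, crc32(typeAndData));
  return chunk;
}

const chunk = buildTextChunk('Copyright', 'legal minefield');
console.log(chunk.length);
```

To end up in a valid file, the resulting bytes would have to be spliced in somewhere after the IHDR chunk and before IEND, which is the part an encoder-level option would take care of.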
With the MASSIVE generative AI movement, automatic1111 has become a staple tool for image generation. This tool creates images that include lots of data regarding the configuration of any image in the 'tEXt' chunk of an image's metadata. It does not appear that sharp has a way to read or write to this section of the image's data.