drewnoakes / metadata-extractor-dotnet

Extracts Exif, IPTC, XMP, ICC and other metadata from image, video and audio files

Reading UTF-8 from iTXt PNGChunk #328

Closed RupertAvery closed 1 year ago

RupertAvery commented 1 year ago

A comment in PNGMetadataReader.cs notes:

    /// Note that "iTXt" chunks use UTF-8 encoding (https://www.w3.org/TR/PNG/#11iTXt).

And the document at that URL states:

The translated keyword and text both use the UTF-8 encoding

However, at line 249 the keyword is read using _latin1Encoding, and the text is read in ReadTextDirectory, also using _latin1Encoding.

As a result, multi-byte UTF-8 characters such as emoji are decoded incorrectly.
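
To illustrate the symptom, here is a minimal standalone sketch. It is not the library's code: the `Decode` helper and the sample string are made up for this example, and it only assumes that the iTXt text bytes are valid UTF-8, as the PNG specification says. Decoding those bytes as Latin-1 turns every byte into its own character, so multi-byte sequences come out as mojibake, while decoding them as UTF-8 recovers the original text.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of MetadataExtractor): decode raw chunk bytes with a given encoding.
static string Decode(byte[] bytes, Encoding encoding) => encoding.GetString(bytes);

// Sample text containing a two-byte character (é) and a four-byte character (an emoji).
string original = "caf\u00e9 \U0001F600";

// The bytes as they would be stored in an iTXt text field, assuming UTF-8.
byte[] raw = Encoding.UTF8.GetBytes(original);

// Latin-1 maps each byte to exactly one character, so multi-byte sequences are split apart.
string asLatin1 = Decode(raw, Encoding.GetEncoding("ISO-8859-1"));

// UTF-8 decoding reassembles the multi-byte sequences.
string asUtf8 = Decode(raw, Encoding.UTF8);

Console.WriteLine($"Latin-1 matches original: {asLatin1 == original}");
Console.WriteLine($"UTF-8 matches original:   {asUtf8 == original}");
```

Plain ASCII text decodes identically under both encodings, which may be why the mismatch only shows up with characters outside that range.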

RupertAvery commented 1 year ago

Thanks!