drewnoakes / xmp-core-dotnet

.NET library for working with the Extensible Metadata Platform (XMP)

Add UTF-32 support (or improve error messaging?) #51

Closed madelson closed 1 year ago

madelson commented 1 year ago

Hello,

I'm trying to use the library to extract metadata from an mp4 file taken with an Android phone. I'm running:

using var stream = File.OpenRead(@"C:\...\movie.mp4");
var xmp = XmpMetaFactory.Parse(stream);

This fails with:

System.NotSupportedException: UTF-32 is not a supported encoding.
   at XmpCore.Impl.ByteBuffer.GetEncoding()
   at XmpCore.Impl.Latin1Converter.Convert(ByteBuffer buffer)
   at XmpCore.Impl.XmpMetaParser.ParseXmlFromByteBuffer(ByteBuffer buffer, ParseOptions options)
   at XmpCore.Impl.XmpMetaParser.ParseXmlFromInputStream(Stream stream, ParseOptions options)
   at XmpCore.Impl.XmpMetaParser.Parse(Stream stream, ParseOptions options)
   at XmpCore.XmpMetaFactory.Parse(Stream stream, ParseOptions options)

Looking at the code, I saw that this was following the AcceptLatin1 path so I also tried:

var xmp = XmpMetaFactory.Parse(stream, new() { AcceptLatin1 = false });

This fails with:

XmpCore.XmpException: Unsupported Encoding
 ---> XmpCore.XmpException: XML parsing failure
 ---> System.Xml.XmlException: '.', hexadecimal value 0x00, is an invalid character. Line 1, position 1.
   at System.Xml.XmlTextReaderImpl.Throw(Exception e)
   at System.Xml.XmlTextReaderImpl.Throw(String res, String[] args)
   at System.Xml.XmlTextReaderImpl.ParseRootLevelWhitespace()
   at System.Xml.XmlTextReaderImpl.ParseDocumentContent()
   at System.Xml.XmlTextReaderImpl.Read()
   at System.Xml.Linq.XDocument.Load(XmlReader reader, LoadOptions options)
   at System.Xml.Linq.XDocument.Load(XmlReader reader)
   at XmpCore.Impl.XmpMetaParser.ParseStream(Stream stream, ParseOptions options)
   --- End of inner exception stack trace ---
   at XmpCore.Impl.XmpMetaParser.ParseStream(Stream stream, ParseOptions options)
   at XmpCore.Impl.XmpMetaParser.ParseXmlFromByteBuffer(ByteBuffer buffer, ParseOptions options)
   --- End of inner exception stack trace ---
   at XmpCore.Impl.XmpMetaParser.ParseXmlFromByteBuffer(ByteBuffer buffer, ParseOptions options)
   at XmpCore.Impl.XmpMetaParser.ParseXmlFromInputStream(Stream stream, ParseOptions options)
   at XmpCore.Impl.XmpMetaParser.Parse(Stream stream, ParseOptions options)
   at XmpCore.XmpMetaFactory.Parse(Stream stream, ParseOptions options)

Is there a reason why UTF-32 is not supported? Or is the real issue that the file lacks XMP metadata and this is just a misleading error message?

drewnoakes commented 1 year ago

This library cannot read an MP4 file directly. The API you're showing expects to receive only the XMP portion of the data, if it exists. Something else needs to locate and extract that portion first.

The MetadataExtractor library is designed to do just that, and wraps XmpCore.
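
For anyone landing here with the same problem, a minimal sketch of that approach might look like the following. It assumes the MetadataExtractor NuGet package is referenced and that the file path is passed as the first command-line argument (the `Program` class and argument handling are illustrative, not part of either library). Whether an `XmpDirectory` appears at all depends on whether the file actually contains XMP.

```csharp
using System;
using System.Linq;
using MetadataExtractor;
using MetadataExtractor.Formats.Xmp;

class Program
{
    static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.Error.WriteLine("Usage: pass the path to a media file");
            return;
        }

        // MetadataExtractor detects the container format (MP4, JPEG, ...)
        // and returns one directory per block of metadata it finds.
        var directories = ImageMetadataReader.ReadMetadata(args[0]);

        // Print every tag from every directory.
        foreach (var directory in directories)
            foreach (var tag in directory.Tags)
                Console.WriteLine($"[{directory.Name}] {tag.Name} = {tag.Description}");

        // If the file contains XMP, it is exposed through XmpDirectory,
        // whose XmpMeta property is the XmpCore object model.
        foreach (var xmpDirectory in directories.OfType<XmpDirectory>())
        {
            if (xmpDirectory.XmpMeta is null)
                continue;

            foreach (var property in xmpDirectory.XmpMeta.Properties)
                Console.WriteLine($"{property.Path} = {property.Value}");
        }
    }
}
```

If no XMP is present, the second loop simply prints nothing, while the first loop still lists whatever other metadata the container holds.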

madelson commented 1 year ago

@drewnoakes MetadataExtractor works like a charm. Thanks!