drewnoakes / metadata-extractor

Extracts Exif, IPTC, XMP, ICC and other metadata from image, video and audio files
Apache License 2.0

Support Adobe InDesign #98

Open fsforna opened 9 years ago

fsforna commented 9 years ago

Dear Drewnoakes, could you tell me whether the library can handle the application/x-adobe-indesign file format? Is it possible to extract metadata from application/x-adobe-indesign files?

Regards, Francesco.

drewnoakes commented 9 years ago

I'm not familiar with the format. Do you know anything about its encoding? What metadata would you expect?

fsforna commented 9 years ago

Hi Drew, thank you for your quick reply. I don't know the file encoding. It should be handled by the XMPFiles native library from the Adobe XMP Toolkit SDK, but there is no Java library that wraps the XMPFiles functions.

https://forums.adobe.com/thread/939540?start=0&tstart=0

drewnoakes commented 9 years ago

I'm not sure when I'd have time to look at this. It'd be great if you could put a pull request together.

drewnoakes commented 9 years ago

The specification for IDML files is here:

https://www.adobe.com/content/dam/Adobe/en/devnet/indesign/cs55-docs/IDML/idml-specification.pdf

drewnoakes commented 7 years ago

InDesign IDML files appear to be ZIP files. The current approach to detecting file types involves looking at the first (magic) bytes of the file, which would identify an IDML file as a plain ZIP. Telling it apart from other ZIP files would require inspecting the MIME type file within the ZIP.
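A minimal sketch of that two-step check, not code from the library: first test the ZIP magic bytes, then read the MIME type entry. The class name `IdmlSniffer`, the entry name `mimetype`, and the expected value `application/vnd.adobe.indesign-idml-package` are assumptions here and should be checked against the IDML specification.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of metadata-extractor.
public final class IdmlSniffer {
    // Assumed content of the "mimetype" entry in an IDML package.
    public static final String IDML_MIME = "application/vnd.adobe.indesign-idml-package";

    // Step 1: the magic-byte check. "PK\3\4" is the ZIP local file header signature.
    public static boolean hasZipMagic(byte[] data) {
        return data.length >= 4
                && data[0] == 'P' && data[1] == 'K' && data[2] == 3 && data[3] == 4;
    }

    // Step 2: look inside the container for an entry named "mimetype".
    // Returns its trimmed content, or null if no such entry exists.
    public static String readMimetypeEntry(byte[] data) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(data))) {
            for (ZipEntry e; (e = zip.getNextEntry()) != null; ) {
                if (!e.getName().equals("mimetype")) {
                    continue;
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[256];
                for (int n; (n = zip.read(buffer)) != -1; ) {
                    out.write(buffer, 0, n);
                }
                return new String(out.toByteArray(), StandardCharsets.US_ASCII).trim();
            }
        }
        return null;
    }

    // A file counts as IDML only if both steps agree.
    public static boolean isIdml(byte[] data) throws IOException {
        return hasZipMagic(data) && IDML_MIME.equals(readMimetypeEntry(data));
    }
}
```

Reading the whole file into a byte array keeps the sketch short; a real detector would more likely work on a stream and stop after the first few entries.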

What might be necessary is to differentiate between container formats (e.g. TIFF, ZIP, JPEG, RIFF, ...) and more specific usages of those containers (e.g. CRW, PNG, MP4, IDML, ...). See also #217.
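One way to picture that split, purely as an illustration: model containers and specific types separately, with each specific type pointing at its container. All names below (`FileTypes`, `Container`, `Specific`, `candidatesFor`) are hypothetical, and the handful of types shown were picked as examples only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not the library's actual type system.
public final class FileTypes {
    // What the magic bytes alone can tell us.
    public enum Container { ZIP, RIFF, TIFF }

    // More specific usages, each tied to the container it is built on.
    public enum Specific {
        IDML(Container.ZIP),
        WAV(Container.RIFF),
        AVI(Container.RIFF),
        DNG(Container.TIFF);

        public final Container container;

        Specific(Container container) {
            this.container = container;
        }
    }

    // After the container is known, only these specific types need a closer look.
    public static List<Specific> candidatesFor(Container container) {
        List<Specific> result = new ArrayList<>();
        for (Specific s : Specific.values()) {
            if (s.container == container) {
                result.add(s);
            }
        }
        return result;
    }
}
```

Detection would then run in two stages: magic bytes select the container, and a container-specific check narrows the candidates down to one type or falls back to the container itself.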

Within the ZIP file it seems there is an XML file that contains XMP. There may be other kinds of metadata too, but the XMP would be a good start.
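As a rough sketch of how the XMP could be located, assuming nothing about the entry's name: scan the XML entries in the ZIP and return the first one that contains an `<x:xmpmeta` element. The class `ZipXmpFinder` is hypothetical, and matching on the `x:` prefix is a simplification, since a packet could in principle use a different namespace prefix.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of metadata-extractor.
public final class ZipXmpFinder {
    // Reads the remaining bytes of the entry the stream is positioned on.
    private static byte[] readCurrentEntry(ZipInputStream zip) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        for (int n; (n = zip.read(buffer)) != -1; ) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    // Returns the text of the first .xml entry that contains an XMP root element,
    // or null if none is found. Assumes the entry is UTF-8 encoded.
    public static String findXmp(byte[] zipBytes) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            for (ZipEntry e; (e = zip.getNextEntry()) != null; ) {
                String name = e.getName().toLowerCase(Locale.ROOT);
                if (e.isDirectory() || !name.endsWith(".xml")) {
                    continue;
                }
                String text = new String(readCurrentEntry(zip), StandardCharsets.UTF_8);
                if (text.contains("<x:xmpmeta")) {
                    return text;
                }
            }
        }
        return null;
    }
}
```

The returned text could then be passed to an XMP parser. If the specification fixes the entry's path, looking it up by name would be simpler and cheaper than scanning.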