drewnoakes / metadata-extractor

Extracts Exif, IPTC, XMP, ICC and other metadata from image, video and audio files
Apache License 2.0

Some issues when get metadata from heic picture #362

Closed lzzcg closed 5 years ago

lzzcg commented 6 years ago

Hi everyone, I'm testing metadata extraction from HEIC pictures and have found the issues below. I may be misunderstanding how the feature is meant to work. Thank you in advance :)

  1. In class ItemInfoBox.ItemInfoEntry, the code cannot parse the subsequent bytes after executing the line itemName = reader.getString(4);. After I delete this line, the program works.

  2. In class HandlerBox, I get an error when executing the line name = reader.getNullTerminatedString((int)box.size - 32, Charset.defaultCharset());. After I replace it with name = reader.getString((int)box.size - 32);, it works fine.

  3. In class ItemLocationBox, I think the structure should look like this:

public class ItemLocationBox extends FullBox
{
    int indexSize;

    int offsetSize;

    int lengthSize;

    int baseOffsetSize;

    long itemCount;

    ItemLocationInfo[] itemLocationInfos;
}

public class ItemLocationInfo
{
    long itemID;

    int constructionMethod;

    int dataReferenceIndex;

    byte[] baseOffset;

    int extentCount;

    Extent[] extents;
}

  4. I cannot get the Exif info from the mdat box; I think this feature has not been implemented yet. :) (moved to #371)

payton commented 6 years ago

Thanks for the feedback! I developed the limited HEIC support given the most recent ISO documentation at the time. When testing, I had a fairly limited selection of photos to use.

Could you provide us with some images that are causing the errors you referenced? Even better if you could add them to the image database https://github.com/drewnoakes/metadata-extractor-images

As far as your 4th point goes, yes it hasn't yet been implemented - What we have so far is very basic support. Always open to PRs if you're interested in adding something new :)

lzzcg commented 5 years ago

Hi, please refer to https://github.com/marx-yu/heif-reader, which uses the "mp4parser" library from https://github.com/sannies/mp4parser. With it I can locate the Exif bytes by analyzing the iloc info, and then extract all of the Exif data with com.drew.metadata.exif.ExifReader.extract(RandomAccessReader, Metadata).
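
For illustration only, the iloc-based step described above might look roughly like the sketch below. It uses only the JDK; the ItemExtents and Extent types are hypothetical names, not classes from metadata-extractor or mp4parser. It assumes the item's extents are plain file offsets (added to a base offset) and that the whole file is already in memory.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch: these types are illustrative, not metadata-extractor classes.
final class ItemExtents {
    /** One (offset, length) pair as listed for an item in the iloc box. */
    static final class Extent {
        final long offset;
        final long length;

        Extent(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * Concatenates the bytes of every extent of a single item.
     * Assumption: offsets are relative to baseOffset within the file bytes.
     */
    static byte[] readItem(byte[] file, long baseOffset, Extent[] extents) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Extent e : extents) {
            long start = baseOffset + e.offset;
            long end = start + e.length;
            // Reject extents that point outside the file instead of reading garbage.
            if (start < 0 || e.length < 0 || end > file.length) {
                throw new IllegalArgumentException("Extent outside file: " + start + ".." + end);
            }
            out.write(file, (int) start, (int) e.length);
        }
        return out.toByteArray();
    }
}
```

The resulting byte array could then be wrapped in a RandomAccessReader and passed to ExifReader.extract as mentioned above; any header bytes that precede the TIFF data inside the item would still need to be skipped first.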

payton commented 5 years ago

@lzzcg Thank you for the references. I can take a look at how mp4parser is approaching this, but we are avoiding importing any external libraries.

It is very difficult for me to identify these issues without an image that reproduces them. Do you have permission to release the image you used when you got these errors? I have not been able to reproduce them with my own images.

payton commented 5 years ago

To address some of your points, @lzzcg:

  1. Looking at the ISO/IEC 14496-12:2015 documentation, the name should be NULL terminated, but it currently just grabs 4 bytes of data. Good catch on this. There are a couple of other fields (optional) that should be NULL terminated, too. Again, I won't be able to implement/test the solution without an image that reproduces the issue.

  2. ISO/IEC 14496-12:2015 specifies that the charset should be UTF-8, so that could very well be the solution.

  3. I believe all is correct with the current implementation. This structure is specified in ISO/IEC 14496-12:2015. The current implementation has all of the fields you refer to, but there is a bit more complexity in versioning, which may produce a couple different fields.
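
As a rough illustration of points 1 and 2, reading a NULL-terminated UTF-8 string from a box payload could be sketched as below. This is a standalone JDK-only example; NullTerminatedUtf8 is a hypothetical helper and not the library's actual reader code.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of metadata-extractor.
final class NullTerminatedUtf8 {
    /**
     * Decodes bytes starting at offset as UTF-8, stopping at the first 0x00 byte,
     * after maxLength bytes, or at the end of the array, whichever comes first.
     */
    static String read(byte[] data, int offset, int maxLength) {
        if (offset < 0 || offset > data.length) {
            throw new IllegalArgumentException("offset out of range: " + offset);
        }
        // Use long arithmetic so a very large maxLength cannot overflow.
        int end = (int) Math.min((long) data.length, (long) offset + Math.max(0, maxLength));
        int i = offset;
        while (i < end && data[i] != 0) {
            i++;
        }
        // The terminator itself is not included in the result.
        return new String(data, offset, i - offset, StandardCharsets.UTF_8);
    }
}
```

Unlike a fixed 4-byte read, this consumes a variable number of bytes, so a caller would advance its position by the decoded byte count plus one for the terminator when one was found.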

endy1106 commented 5 years ago

https://github.com/endy1106/public/blob/master/IMG_1034.HEIC

lzzcg commented 5 years ago

@payton please refer to the pic in https://github.com/endy1106/public/blob/master/IMG_1034.HEIC

payton commented 5 years ago

I am going to move the fourth item to its own issue for organizational purposes (it is more of a feature request).

drewnoakes commented 5 years ago

@endy1106 is the linked image (IMG_1034.HEIC) yours, and if so do you grant permission for this project to use it in the sample image repository? We don't currently have any HEIC images for testing.

drewnoakes commented 5 years ago

Pinging @endy1106 about the image once more. If you can provide that or another HEIC image it'd be very helpful.