drewnoakes / metadata-extractor-dotnet

Extracts Exif, IPTC, XMP, ICC and other metadata from image, video and audio files

Inaccurate Meta Data for m4a media format #337

Open xianglongcheng opened 1 year ago

xianglongcheng commented 1 year ago

It seems the library parses the metadata of m4a (an audio format) as the mp4 video format. I'm using the code below to work around this, but it would be perfect if this could be fixed. Thanks very much!

            // Fragment from a method that returns the extension; ext and suc are declared elsewhere.
            // OfType requires "using System.Linq;".
            IEnumerable<MetadataExtractor.Directory> directories = ImageMetadataReader.ReadMetadata(filefullname);

            foreach (var ft in directories.OfType<FileTypeDirectory>())
            {
                foreach (var tag in ft.Tags)
                {
                    if (tag.Name == "Expected File Name Extension")
                    {
                        if (tag.Description == "mp4")
                            ext = "m4a"; // there are only audio files, so use m4a instead of mp4
                        else if (tag.Description == "mp3")
                            ext = "mp3";
                        suc = true;
                        return ext;
                    }
                }
            }

drewnoakes commented 1 year ago

Thanks! Can you provide a small sample file that reproduces the problem please?

xianglongcheng commented 1 year ago

Please check the attached file. It shows the information below with my code. Thanks!

At 2023-08-03 06:56:01, "Drew Noakes" @.***> wrote:

Thanks! Can you provide a small sample file that reproduces the problem please?


drewnoakes commented 1 year ago

Thanks but unfortunately it seems files attached to emails are not added to the issue. You'll need to attach it via the web UI.

xianglongcheng commented 1 year ago

ring.m4a.pdf

Remove the .pdf extension after downloading, then please try it.

drewnoakes commented 1 year ago

@xianglongcheng here's the output I get for your file:

FILE: Issue 337 (dotnet).m4a
TYPE: MP4

[QuickTime File Type - 0x0001] Major Brand = M4A 
[QuickTime File Type - 0x0002] Minor Version = 512
[QuickTime File Type - 0x0003] Compatible Brands = M4A, isom, iso2

[QuickTime Movie Header - 0x0001] Version = 0
[QuickTime Movie Header - 0x0002] Flags = 0 0 0
[QuickTime Movie Header - 0x0003] Created = Fri Jan 01 00:00:00 1904
[QuickTime Movie Header - 0x0004] Modified = Fri Jan 01 00:00:00 1904
[QuickTime Movie Header - 0x0005] TrackId = 1000
[QuickTime Movie Header - 0x0006] Duration = 00:00:11.2880000
[QuickTime Movie Header - 0x0007] Preferred Rate = 1
[QuickTime Movie Header - 0x0008] Preferred Volume = 1
[QuickTime Movie Header - 0x0009] Matrix = [36 values]
[QuickTime Movie Header - 0x000a] Preview Time = 0
[QuickTime Movie Header - 0x000b] Preview Duration = 0
[QuickTime Movie Header - 0x000c] Poster Time = 0
[QuickTime Movie Header - 0x000d] Selection Time = 0
[QuickTime Movie Header - 0x000e] Selection Duration = 0
[QuickTime Movie Header - 0x000f] Current Time = 0
[QuickTime Movie Header - 0x0010] Next Track Id = 2

[QuickTime Track Header - 0x0001] Version = 0
[QuickTime Track Header - 0x0002] Flags = 0 0 3
[QuickTime Track Header - 0x0003] Created = Fri Jan 01 00:00:00 1904
[QuickTime Track Header - 0x0004] Modified = Fri Jan 01 00:00:00 1904
[QuickTime Track Header - 0x0005] TrackId = 1
[QuickTime Track Header - 0x0006] Duration = 11288
[QuickTime Track Header - 0x0007] Layer = 0
[QuickTime Track Header - 0x0008] Alternate Group = 1
[QuickTime Track Header - 0x0009] Volume = 1
[QuickTime Track Header - 0x000c] Matrix = 1 0 0 0 1 0 0 0 1
[QuickTime Track Header - 0x000a] Width = 0
[QuickTime Track Header - 0x000b] Height = 0

[File Type - 0x0001] Detected File Type Name = MP4
[File Type - 0x0002] Detected File Type Long Name = MPEG-4 Part 14
[File Type - 0x0003] Detected MIME Type = video/mp4
[File Type - 0x0004] Expected File Name Extension = mp4

[File - 0x0001] File Name = Issue 337 (dotnet).m4a
[File - 0x0002] File Size = 496318 bytes
[File - 0x0003] File Modified Date = <omitted for regression testing as checkout dependent>

- QuickTime File Type
- QuickTime Movie Header
- QuickTime Track Header
- File Type
- File

Is the issue you're describing with this field?

[File Type - 0x0004] Expected File Name Extension = mp4

The file here is actually an MP4 file. The M4A extension exists to help identify audio-only MP4 files.

How would you suggest we identify the expected file type extension in this case? That directory is currently only populated using information from the header of the file (magic bytes at the start).
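For illustration, the header of an ISO base media file already carries the major brand: the leading `ftyp` box holds a 4-byte size, the 4-byte type `ftyp`, and then the 4-byte major brand. The sketch below reads that brand straight from the first 12 bytes. It is not how the library works internally; `FtypSniffer` and `ReadMajorBrand` are hypothetical names used only here.

```csharp
using System;
using System.IO;
using System.Text;

public static class FtypSniffer
{
    // Returns the four-character major brand (for example "M4A " or "isom"),
    // or null if the stream does not start with an 'ftyp' box.
    public static string ReadMajorBrand(Stream stream)
    {
        var header = new byte[12];
        int read = 0;
        while (read < header.Length)
        {
            int n = stream.Read(header, read, header.Length - read);
            if (n == 0) return null; // fewer than 12 bytes available
            read += n;
        }

        // Bytes 0-3: box size (big-endian), 4-7: box type, 8-11: major brand.
        if (Encoding.ASCII.GetString(header, 4, 4) != "ftyp") return null;
        return Encoding.ASCII.GetString(header, 8, 4);
    }
}
```

A caller could open the file with `File.OpenRead(path)` and compare the trimmed result with `"M4A"`. Note that the brand is padded to four characters, so the raw value for this file would be `"M4A "` with a trailing space.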

xianglongcheng commented 1 year ago

From the output, it already shows "[QuickTime File Type - 0x0001] Major Brand = M4A", so I think it would be more accurate to show the expected extension as "m4a".

As I'm not familiar with the mp4 file format, this request might require too much coding or even introduce new issues. If that's the case, we can just keep the current output, which is already great and easy to read.

Thanks!

drewnoakes commented 1 year ago

The library populates the file type directory based only on the header. Later, the file is identified as M4A once the actual MP4 data is parsed, but we don't go back and modify the other directory.

Today, you could use the tag you identified to understand when you have an M4A and update accordingly.
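A minimal sketch of that workaround follows. To stay independent of specific library class names, it works on plain (directory name, tag name, description) triples, using the names exactly as they appear in the output above. `M4aWorkaround` and `PickExtension` are hypothetical names, and treating a trimmed major brand of `M4A` as the only audio marker is an assumption.

```csharp
using System;
using System.Collections.Generic;

public static class M4aWorkaround
{
    // Prefers "m4a" when the QuickTime major brand is M4A; otherwise falls back
    // to whatever the File Type directory reported (null if it reported nothing).
    public static string PickExtension(
        IEnumerable<(string Directory, string Tag, string Description)> tags)
    {
        string expected = null;
        bool isM4a = false;

        foreach (var (directory, tag, description) in tags)
        {
            if (directory == "QuickTime File Type" && tag == "Major Brand")
                isM4a = description != null && description.Trim() == "M4A";
            else if (directory == "File Type" && tag == "Expected File Name Extension")
                expected = description;
        }

        return isM4a ? "m4a" : expected;
    }
}
```

With the library, the triples could be produced from the result of `ImageMetadataReader.ReadMetadata` with something like `directories.SelectMany(d => d.Tags.Select(t => (d.Name, t.Name, t.Description)))`.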

Longer term, this would be a candidate for another issue we have that tracks consolidating data across multiple tags (which I can't easily find right now on my phone).

xianglongcheng commented 12 months ago

Sure. Thanks!

rjgotten commented 9 months ago

FWIW, the MIME content type here is wrong either way. It lists video/mp4, but audio-only MP4 files should specifically have audio/mp4 as their content type.

https://www.iana.org/assignments/media-types/media-types.xhtml#audio
https://www.iana.org/assignments/media-types/audio/mp4
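Until the library reports this itself, a caller could adjust the detected MIME type in the same way as the extension. The sketch below is only an illustration: `Mp4MimeType.Correct` is a hypothetical helper, and it assumes that a major brand of `M4A` means the file is audio-only. A more robust check would inspect the tracks themselves.

```csharp
using System;

public static class Mp4MimeType
{
    // Swaps "video/mp4" for "audio/mp4" when the major brand is M4A;
    // any other detected type is returned unchanged.
    public static string Correct(string detectedMimeType, string majorBrand)
    {
        bool audioOnlyBrand = majorBrand != null && majorBrand.Trim() == "M4A";
        return detectedMimeType == "video/mp4" && audioOnlyBrand
            ? "audio/mp4"
            : detectedMimeType;
    }
}
```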

jruggiero1955 commented 9 months ago

I believe that the [QuickTime Movie Header - 0x0005] TrackId is mislabeled. According to ExifTool, that value corresponds to the Media Time Scale.

The [QuickTime Track Header - 0x0005] TrackId value is correct.
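As a rough cross-check, the payload of a version-0 `mvhd` box is laid out as version (1 byte), flags (3 bytes), creation time, modification time, time scale and duration (4 bytes each, big-endian), so the fifth field is a time scale, not a track id. The sketch below parses those fields and converts the duration to a `TimeSpan`. `MovieHeader` and its members are hypothetical names for this illustration, and only version 0 is handled.

```csharp
using System;
using System.Buffers.Binary;

public static class MovieHeader
{
    // Parses the payload of a version-0 'mvhd' box (the bytes after the 8-byte
    // box header) and returns its time scale and duration fields.
    public static (uint TimeScale, uint Duration) ReadTiming(ReadOnlySpan<byte> payload)
    {
        if (payload.Length < 20)
            throw new ArgumentException("mvhd payload too short");
        if (payload[0] != 0)
            throw new NotSupportedException("only version 0 is handled in this sketch");

        // 0: version, 1-3: flags, 4-7: creation time, 8-11: modification time.
        uint timeScale = BinaryPrimitives.ReadUInt32BigEndian(payload.Slice(12, 4));
        uint duration = BinaryPrimitives.ReadUInt32BigEndian(payload.Slice(16, 4));
        return (timeScale, duration);
    }

    // Converts a duration expressed in time-scale units per second to a TimeSpan,
    // using integer tick arithmetic to avoid floating-point rounding.
    public static TimeSpan ToTimeSpan(uint timeScale, uint duration)
    {
        if (timeScale == 0)
            throw new ArgumentException("time scale must be non-zero");
        return TimeSpan.FromTicks((long)duration * TimeSpan.TicksPerSecond / timeScale);
    }
}
```

Assuming the raw movie duration is 11288 units (the value the track header reports), a time scale of 1000 gives `00:00:11.2880000`, which matches the Duration shown in the output above and is consistent with the value 1000 being a time scale.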