MediaArea / MediaInfoLib

Convenient unified display of the most relevant technical and tag data for video and audio files.
https://mediaarea.net/MediaInfo
BSD 2-Clause "Simplified" License

declare parser attribute in mt when parsing ebml/mkv #172

Closed by dericed 8 years ago

dericed commented 8 years ago

When the FFV1 parser is used we have:

<block offset="720" name="Data" parser="FFV1" size="4530">
              <data offset="720" name="keyframe">Yes</data>
              <data offset="720" name="version">0</data>
              <data offset="720" name="coder_type">0</data>

Can we get a parser="EBML" or parser="Matroska" attribute when the EBML/Matroska parser is used?

JeromeMartinez commented 8 years ago

The main issue is that we don't have a top-level element for the full file: we immediately have block name="EBML", then block name="Segment"...

Is it OK to put parser="Matroska" at the "media" element level? I guess the XSD needs to be updated.

dericed commented 8 years ago

Could we add one conceptual block above EBML and Segment?

Such as:

<block offset="0" name="EBML Document" parser="Matroska" size="15257">
    <block offset="0" name="EBML" size="47">...</block>
    <block offset="47" name="Segment" size="15210">...</block>
</block>

JeromeMartinez commented 8 years ago

It is not coherent, because the parser parses the whole set of top-level elements, not just the first one:

<media ref="x">
    <block offset="0" name="EBML" parser="Matroska" size="47"></block>
    <block offset="47" name="Segment" size="126120"></block>
</media>

Putting it at the first element would mean that the parser parses the EBML element only.

Additionally, for other formats, e.g. those made of raw frames, it would be even weirder:

<media ref="x">
    <block offset="0" name="Frame" parser="MPEG Audio" size="128"></block>
    <block offset="128" name="Frame" size="128"></block>
    <block offset="256" name="Frame" size="128"></block>
    <block offset="384" name="Frame" size="128"></block>
    <block offset="512" name="Frame" size="128"></block>
</media>

That does not make sense.

<media ref="x" parser="Matroska">
    <block offset="0" name="EBML" size="47"></block>
    <block offset="47" name="Segment" size="126120"></block>
</media>

makes more sense IMHO.
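
For anyone consuming the trace, a minimal sketch of how the media-level attribute could be read, assuming the fragment above is parsed on its own with Python's standard `xml.etree.ElementTree`. The function name `top_level_summary` is made up for illustration, and a real trace file may add a root element and an XML namespace that this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the comment above. Assumption: no enclosing root
# element and no XML namespace, which a real trace file may have.
TRACE = """
<media ref="x" parser="Matroska">
    <block offset="0" name="EBML" size="47"></block>
    <block offset="47" name="Segment" size="126120"></block>
</media>
"""

def top_level_summary(xml_text):
    """Return (parser, [(name, offset, size), ...]) for one media element.

    Hypothetical helper, not part of MediaInfoLib.
    """
    media = ET.fromstring(xml_text)
    blocks = [
        (block.get("name"), int(block.get("offset")), int(block.get("size")))
        for block in media.findall("block")  # direct children only
    ]
    return media.get("parser"), blocks

parser, blocks = top_level_summary(TRACE)
print(parser)
for name, offset, size in blocks:
    print(name, offset, size)
```

With the attribute on "media", one lookup covers every top-level block, so the EBML and Segment blocks need no attribute of their own.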

dericed commented 8 years ago

What about if mkv was within a larger file? MKV in MKV? Would the parser attribute go to the block parent of that?

JeromeMartinez commented 8 years ago

Note that this should also stay coherent when another parser handles a block that comes before the first block with main content.

Example: I should be able to do something similar to:

<media ref="x" parser="MPEG Audio">
    <block offset="0" name="ID3v2 frame" parser="ID3v2" size="128"></block>
    <block offset="128" name="Frame" size="128"></block>
    <block offset="256" name="Frame" size="128"></block>
    <block offset="384" name="Frame" size="128"></block>
    <block offset="512" name="Frame" size="128"></block>
</media>

> What about if mkv was within a larger file? MKV in MKV? Would the parser attribute go to the block parent of that?

This is already handled (similar to FFV1: the FFV1 parser info is on an element above the actual FFV1 content).

dericed commented 8 years ago

I think that works. So the parser attribute means the same thing on media and on block: it defines the parser of the content of that media or block.
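
One way to read that rule from the consumer side is sketched below: an element without its own parser attribute takes the value of its nearest ancestor that has one. This inheritance is my interpretation of the rule, not something the thread states, and `effective_parsers` is a hypothetical helper. The input is a shortened copy of the MPEG Audio example above, again without root element or namespace.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the MPEG Audio example from this thread.
TRACE = """
<media ref="x" parser="MPEG Audio">
    <block offset="0" name="ID3v2 frame" parser="ID3v2" size="128"></block>
    <block offset="128" name="Frame" size="128"></block>
    <block offset="256" name="Frame" size="128"></block>
</media>
"""

def effective_parsers(element, inherited=None):
    """Yield (block name, parser) pairs in document order.

    Assumption: a block without a parser attribute inherits the value of
    the closest ancestor that declares one.
    """
    current = element.get("parser", inherited)
    if element.tag == "block":
        yield element.get("name"), current
    for child in element:
        yield from effective_parsers(child, current)

resolved = list(effective_parsers(ET.fromstring(TRACE)))
for name, parser in resolved:
    print(name, parser)
```

Here the ID3v2 block reports its own parser, while the plain frames fall back to the value declared on "media".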

JeromeMartinez commented 8 years ago

Implemented for most formats (e.g. AVI, MPEG-4/QuickTime, Matroska...), since the implementation relies on another feature. Note that the value is the MediaInfo internal name of the parser, so it is "Matroska" for both Matroska and WebM, "MPEG-4" for both MPEG-4 and QuickTime (with or without ftyp)...

https://github.com/MediaArea/MediaInfoLib/pull/173/commits/4a575fe40415ab425ed76b8391c681a69dbd8949
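
A practical consequence of the internal naming is that the parser value alone cannot tell WebM from Matroska, or QuickTime from MPEG-4. The sketch below records only the pairs named in the comment above; the dictionary keys are informal format labels chosen for illustration, not values taken from MediaInfo output, and `formats_for_parser` is a hypothetical helper.

```python
# Internal parser names as stated in the comment above. Other formats are
# left out on purpose because the thread does not list them.
INTERNAL_PARSER_NAME = {
    "Matroska": "Matroska",
    "WebM": "Matroska",
    "MPEG-4": "MPEG-4",
    "QuickTime": "MPEG-4",
}

def formats_for_parser(parser_name):
    """Return the sorted format labels that share the given parser value."""
    return sorted(
        label
        for label, parser in INTERNAL_PARSER_NAME.items()
        if parser == parser_name
    )

matroska_formats = formats_for_parser("Matroska")
mpeg4_formats = formats_for_parser("MPEG-4")
print(matroska_formats)
print(mpeg4_formats)
```

A tool that needs the finer distinction would have to look at other fields of the trace or report rather than at the parser attribute.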