samvera / hydra-works

A ruby gem implementation of the PCDM Works domain model based on the Samvera software stack

No Metadata for Video (FITS MediaInfo tool issue) #331

conorom closed this issue 6 years ago

conorom commented 7 years ago

Out of the box, FITS now uses MediaInfo to produce the metadata for video, which splits the metadata into separate audio and video "track" nodes in the XML. The code here expects the flat format produced by ExifTool, so the results are not parsed correctly. I think this started in FITS 0.8.10.

I guess either the parsing should be changed to handle MediaInfo's track-based output, or else steps should be documented on how to change fits.xml to get FITS to work with Samvera components. Perhaps by linking to this issue from Hyrax and hydra-derivatives READMEs, which (to my knowledge) are the two main places where recommendations on a FITS version are made.

Note: this most likely affects all the formats listed for MediaInfo processing in fits.xml, i.e. include-exts="avi,mov,mpg,mpeg,mkv,mp4,mxf,ogv,mj2,divx,dv,m4v,m2v,ismv,m2ts,mpeg4". However, all my experience is with mp4, so the workaround described here focuses on mp4.

The easy workaround is to exclude mp4 (and/or whatever other formats) from processing by MediaInfo, thus reverting to ExifTool, which produces the output expected by hydra-works.

FITS 1.0.5 is listed as "known to be good" in the Hyrax and hydra-derivatives documentation right now, so the steps below use FITS 1.0.5, but similar changes need to be made to get any version of FITS above 0.8.10 to work correctly with mp4s in the Samvera stack.

First, a sample of MediaInfo metadata output for an mp4 file, i.e. FITS 1.0.5 out of the box:

<metadata>
  <video>
    <location toolname="MediaInfo" toolversion="0.7.75" status="SINGLE_RESULT">/path/to/blah.mp4</location>
    <mimeType toolname="MediaInfo" toolversion="0.7.75" status="SINGLE_RESULT">video/quicktime</mimeType>
    <format toolname="MediaInfo" toolversion="0.7.75" status="SINGLE_RESULT">Quicktime</format>
    <formatProfile toolname="MediaInfo" toolversion="0.7.75" status="SINGLE_RESULT">Base Media</formatProfile>
    <duration toolname="MediaInfo" toolversion="0.7.75" status="SINGLE_RESULT">107230</duration>
    <bitRate toolname="MediaInfo" toolversion="0.7.75" status="SINGLE_RESULT">655784</bitRate>
    <dateCreated toolname="MediaInfo" toolversion="0.7.75" status="SINGLE_RESULT">UTC 1904-01-01 00:00:00</dateCreated>
    <dateModified toolname="MediaInfo" toolversion="0.7.75" status="SINGLE_RESULT">UTC 2017-05-01 12:07:45</dateModified>
    <track type="video" id="1" toolname="MediaInfo" toolversion="0.7.75" status="SINGLE_RESULT">
      <videoDataEncoding>avc1</videoDataEncoding>
      <codecId>avc1</codecId>
      <codecCC>avc1</codecCC>
      <codecVersion>Main@L3</codecVersion>
      <codecName>AVC</codecName>
      <codecFamily>H.264</codecFamily>
      <codecInfo>Advanced Video Codec</codecInfo>
      <compression>Unknown</compression>
      <byteOrder>Unknown</byteOrder>
      <bitDepth>8 bits</bitDepth>
      <bitRate>523396</bitRate>
      <duration>107100</duration>
      <trackSize>7006967</trackSize>
      <width>640 pixels</width>
      <height>480 pixels</height>
      <frameRate>30.000</frameRate>
      <frameRateMode>Constant</frameRateMode>
      <frameCount>3213</frameCount>
      <aspectRatio>4:3</aspectRatio>
      <scanningFormat>Progressive</scanningFormat>
      <chromaSubsampling>4:2:0</chromaSubsampling>
      <colorspace>YUV</colorspace>
    </track>
    <track type="audio" id="2" toolname="MediaInfo" toolversion="0.7.75" status="SINGLE_RESULT">
      <audioDataEncoding>AAC</audioDataEncoding>
      <codecId>40</codecId>
      <codecFamily>AAC</codecFamily>
      <compression>Lossy</compression>
      <bitRate>125597</bitRate>
      <bitRateMode>Constant</bitRateMode>
      <duration>107230</duration>
      <trackSize>1683467</trackSize>
      <soundField>Front: L R</soundField>
      <samplingRate>44100</samplingRate>
      <numSamples>4728843</numSamples>
      <channels>2</channels>
    </track>
  </video>
</metadata>
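
To make the track-based layout concrete, here is a minimal Ruby sketch of reading values from the per-track nodes. It is not the hydra-works implementation: the helper names (`first_text`, `mediainfo_video_terms`) and the result keys are hypothetical, it uses REXML only to stay self-contained, and it assumes the un-namespaced structure shown in the sample above (real FITS output may declare an XML namespace, which this sketch does not handle).

```ruby
require "rexml/document"

# Trimmed copy of the MediaInfo-style sample above.
MEDIAINFO_SAMPLE = <<~XML
  <metadata>
    <video>
      <duration toolname="MediaInfo" toolversion="0.7.75" status="SINGLE_RESULT">107230</duration>
      <track type="video" id="1" toolname="MediaInfo" toolversion="0.7.75" status="SINGLE_RESULT">
        <width>640 pixels</width>
        <height>480 pixels</height>
        <frameRate>30.000</frameRate>
      </track>
      <track type="audio" id="2" toolname="MediaInfo" toolversion="0.7.75" status="SINGLE_RESULT">
        <samplingRate>44100</samplingRate>
        <channels>2</channels>
      </track>
    </video>
  </metadata>
XML

# Hypothetical helper: text of the first node matching the XPath, or nil.
def first_text(doc, xpath)
  node = REXML::XPath.first(doc, xpath)
  node && node.text
end

# Hypothetical helper: collects a few values from the first video and audio tracks.
# String#to_i reads the leading digits, so "640 pixels" becomes 640.
def mediainfo_video_terms(xml)
  doc = REXML::Document.new(xml)
  video = "/metadata/video/track[@type='video']"
  audio = "/metadata/video/track[@type='audio']"
  {
    duration_ms: first_text(doc, "/metadata/video/duration")&.to_i,
    width:       first_text(doc, "#{video}/width")&.to_i,
    height:      first_text(doc, "#{video}/height")&.to_i,
    frame_rate:  first_text(doc, "#{video}/frameRate")&.to_f,
    sample_rate: first_text(doc, "#{audio}/samplingRate")&.to_i,
    channels:    first_text(doc, "#{audio}/channels")&.to_i
  }
end

p mediainfo_video_terms(MEDIAINFO_SAMPLE)
```

Note that the units differ from the ExifTool format: the sample suggests duration is a millisecond count rather than an h:mm:ss string, and width and height carry a "pixels" suffix.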

Now, steps to get the expected ExifTool output back. Edit xml/fits.xml:

  1. remove mp4 from include-exts for MediaInfo
  2. remove mp4 from exclude-exts for ExifTool
  3. add mp4 to exclude-exts for FileUtility

Note: without step 3, all recent FITS versions fall back to FileUtility, which results in empty metadata.
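
The three edits above can also be scripted. The following Ruby sketch applies them to a simplified stand-in for the tools section of xml/fits.xml. The element layout, the tool class names and the extension lists in `FITS_CONFIG` are assumptions for illustration, and `edit_exts` and `route_to_exiftool` are hypothetical helpers, so compare against your own fits.xml before relying on it.

```ruby
require "rexml/document"

# Simplified stand-in for xml/fits.xml. Class names and extension lists are
# assumptions; the real file has more tools and longer lists.
FITS_CONFIG = <<~XML
  <fits_configuration>
    <tools>
      <tool class="edu.harvard.hul.ois.fits.tools.mediainfo.MediaInfo" include-exts="avi,mov,mp4,mpeg4"/>
      <tool class="edu.harvard.hul.ois.fits.tools.exiftool.Exiftool" exclude-exts="txt,mp4,mov"/>
      <tool class="edu.harvard.hul.ois.fits.tools.fileutility.FileUtility" exclude-exts="dng,wps"/>
    </tools>
  </fits_configuration>
XML

# Rewrites a comma-separated extension attribute using the given block.
# Leaves the element alone if the attribute is absent and nothing is added.
def edit_exts(tool, attr)
  current = tool.attributes[attr]
  updated = yield((current || "").split(",").map(&:strip))
  return if current.nil? && updated.empty?
  tool.attributes[attr] = updated.join(",")
end

# Applies steps 1-3 for one extension and returns the edited document.
def route_to_exiftool(config_xml, ext)
  doc = REXML::Document.new(config_xml)
  doc.elements.each("//tool") do |tool|
    case tool.attributes["class"].to_s
    when /MediaInfo\z/   then edit_exts(tool, "include-exts") { |e| e - [ext] } # step 1
    when /Exiftool\z/    then edit_exts(tool, "exclude-exts") { |e| e - [ext] } # step 2
    when /FileUtility\z/ then edit_exts(tool, "exclude-exts") { |e| e | [ext] } # step 3
    end
  end
  doc
end

puts route_to_exiftool(FITS_CONFIG, "mp4")
```

Matching on the end of the class name keeps the sketch independent of the exact Java package path, which may differ between FITS versions.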

After these changes, the expected output is back:

<metadata>
  <video>
    <duration toolname="Exiftool" toolversion="10.00" status="SINGLE_RESULT">0:06:20</duration>
    <frameRate toolname="Exiftool" toolversion="10.00" status="SINGLE_RESULT">24</frameRate>
    <bitDepth toolname="Exiftool" toolversion="10.00" status="SINGLE_RESULT">24</bitDepth>
    <audioSampleRate toolname="Exiftool" toolversion="10.00" status="SINGLE_RESULT">44100</audioSampleRate>
    <channels toolname="Exiftool" toolversion="10.00" status="SINGLE_RESULT">2</channels>
    <imageWidth toolname="Exiftool" toolversion="10.00" status="SINGLE_RESULT">1920</imageWidth>
    <imageHeight toolname="Exiftool" toolversion="10.00" status="SINGLE_RESULT">1080</imageHeight>
    <rotation toolname="Exiftool" toolversion="10.00" status="SINGLE_RESULT">0</rotation>
    <xSamplingFrequency toolname="Exiftool" toolversion="10.00" status="SINGLE_RESULT">72</xSamplingFrequency>
    <ySamplingFrequency toolname="Exiftool" toolversion="10.00" status="SINGLE_RESULT">72</ySamplingFrequency>
    <creatingApplicationName toolname="Exiftool" toolversion="10.00" status="SINGLE_RESULT">avc1</creatingApplicationName>
  </video>
</metadata>
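
A quick way to confirm which tool produced a given result is to look at the toolname attributes and check for track nodes. This Ruby sketch assumes the un-namespaced structure shown in the samples above; `video_toolnames` and `track_based?` are hypothetical helper names.

```ruby
require "rexml/document"

# Two elements taken from the ExifTool-style sample above.
EXIFTOOL_SAMPLE = <<~XML
  <metadata>
    <video>
      <duration toolname="Exiftool" toolversion="10.00" status="SINGLE_RESULT">0:06:20</duration>
      <imageWidth toolname="Exiftool" toolversion="10.00" status="SINGLE_RESULT">1920</imageWidth>
    </video>
  </metadata>
XML

# Distinct toolname values found on elements under <video>.
def video_toolnames(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "/metadata/video//*")
              .map { |el| el.attributes["toolname"] }
              .compact
              .uniq
end

# True when the result uses the per-track layout.
def track_based?(xml)
  doc = REXML::Document.new(xml)
  !REXML::XPath.first(doc, "/metadata/video/track").nil?
end

p video_toolnames(EXIFTOOL_SAMPLE)
p track_based?(EXIFTOOL_SAMPLE)
```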

conorom commented 7 years ago

I did a quick test on a mov video, and one additional step, excluding the format from processing by Droid, was required to get ExifTool output back. Fortunately I'm only interested in mp4 videos right now. In the long run, changing the characterization parsing to look for MediaInfo values per track definitely seems better than playing whack-a-mole with FITS configuration.