KBNLresearch / omSipCreator

Create ingest-ready SIPs from batches of optical media images
Apache License 2.0

Add info from Isobuster / dBpoweramp log files as event metadata #27

Open bitsgalore opened 7 years ago

bitsgalore commented 7 years ago

From the PREMIS Data Dictionary:

http://www.loc.gov/standards/premis/v3/premis-3-0-final.pdf

Page 16:

All events have outcomes (success, failure, etc.). Some events also have outputs; for example, the execution of a program creates a new file object. The semantic units eventOutcome and eventOutcomeDetail are intended for documenting qualitative outcomes. For example, if the event is an act of format validation, the value of eventOutcome might be a code indicating the object is fully valid. Alternatively, it might be a code indicating the object is not fully valid, and eventOutcomeDetail could be used to describe all anomalies found. If the program performing the validation writes a log of warnings and error messages, a second instance of eventOutcomeDetail could be used to store or point to that log. If an event creates objects that are stored in the repository, those objects should be described as entities with a complete set of applicable metadata and associated with the event by links. Some additional aspects of an event other than its outcomes or outputs might be recorded, such as the specific parameters used during a migration event, the nature of the operation (automated, manual or semi-automated) and so on. Such information can be recorded in eventDetail.

So detailed log contents should probably go to eventOutcomeDetail and/or eventDetail.

The IsoBuster log contains only a status code. The dBpoweramp extraction log, by contrast, is very elaborate. Example:

dBpoweramp Release 16.1 Digital Audio Extraction Log from 29 March 2017 14:26

Drive & Settings
----------------

Ripping with drive 'I:   [TEAC     - DV-W5600S       ]',  Drive offset: 48,  Overread Lead-in/out: No
AccurateRip: Active,  Using C2: No,  Cache: 1024 KB,  FUA Cache Invalidate: No
Pass 1 Drive Speed: Max,  Pass 2 Drive Speed: Max
Ultra::  Vary Drive Speed: No,  Min Passes: 2,  Max Passes: 4,  Finish After Clean Passes: 2
Bad Sector Re-rip::  Drive Speed: Max,  Maximum Re-reads: 34

Encoder: FLAC -compression-level-5
DSP Effects / Actions: -dspeffect1="ReplayGain= -r128lufs={qt}-18{qt}"

Extraction Log
--------------

Track 1:  Ripped LBA 0 to 873 (0:11) in 0:06. Filename: E:\testiromlab\kb-c91bbfda-147a-11e7-93d2-00237d497a29\d03ae636-147a-11e7-a687-00237d497a29\01._
  Secure  [Pass 1 & 2, Ultra 1 to 2]
  CRC32: 94B94128     AccurateRip CRC: 205C83DB (CRCv2)     [DiscID: 007-00008d92-00036343-33008007-1]

Track 2:  Ripped LBA 873 to 2220 (0:17) in 0:09. Filename: E:\testiromlab\kb-c91bbfda-147a-11e7-93d2-00237d497a29\d03ae636-147a-11e7-a687-00237d497a29\02._
  Secure  [Pass 1 & 2, Ultra 1 to 2]
  CRC32: 8D063503     AccurateRip CRC: 6111D125 (CRCv2)     [DiscID: 007-00008d92-00036343-33008007-2]

Track 3:  Ripped LBA 2220 to 3560 (0:17) in 0:09. Filename: E:\testiromlab\kb-c91bbfda-147a-11e7-93d2-00237d497a29\d03ae636-147a-11e7-a687-00237d497a29\03._
  Secure  [Pass 1 & 2, Ultra 1 to 2]
  CRC32: 5A8063B5     AccurateRip CRC: AB5245D6 (CRCv2)     [DiscID: 007-00008d92-00036343-33008007-3]

Track 4:  Ripped LBA 3560 to 5378 (0:24) in 0:12. Filename: E:\testiromlab\kb-c91bbfda-147a-11e7-93d2-00237d497a29\d03ae636-147a-11e7-a687-00237d497a29\04._
  Secure  [Pass 1 & 2, Ultra 1 to 2]
  CRC32: FA04EE3B     AccurateRip CRC: 091E59E9 (CRCv2)     [DiscID: 007-00008d92-00036343-33008007-4]

Track 5:  Ripped LBA 5378 to 6602 (0:16) in 0:08. Filename: E:\testiromlab\kb-c91bbfda-147a-11e7-93d2-00237d497a29\d03ae636-147a-11e7-a687-00237d497a29\05._
  Secure  [Pass 1 & 2, Ultra 1 to 2]
  CRC32: DB6AC0CF     AccurateRip CRC: B53D97B9 (CRCv2)     [DiscID: 007-00008d92-00036343-33008007-5]

Track 6:  Ripped LBA 6602 to 8002 (0:18) in 0:09. Filename: E:\testiromlab\kb-c91bbfda-147a-11e7-93d2-00237d497a29\d03ae636-147a-11e7-a687-00237d497a29\06._
  Secure  [Pass 1 & 2, Ultra 1 to 2]
  CRC32: 9C9988B7     AccurateRip CRC: 30EC37D6 (CRCv2)     [DiscID: 007-00008d92-00036343-33008007-6]

Track 7:  Ripped LBA 8002 to 9607 (0:21) in 0:10. Filename: E:\testiromlab\kb-c91bbfda-147a-11e7-93d2-00237d497a29\d03ae636-147a-11e7-a687-00237d497a29\07._
  Secure  [Pass 1 & 2, Ultra 1 to 2]
  CRC32: A4EBAB27     AccurateRip CRC: BA810440 (CRCv2)     [DiscID: 007-00008d92-00036343-33008007-7]

--------------

7 Tracks Ripped Securely

Perhaps split the 'Drive & Settings' and 'Extraction Log' sections into 2 separate elements.
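A minimal sketch of how such a split could work, assuming the log always uses the two section headings shown in the example above, each followed by a line of dashes. The function name `split_dbpoweramp_log` is hypothetical and not part of omSipCreator:

```python
import re


def split_dbpoweramp_log(log_text):
    """Split a dBpoweramp log into its 'Drive & Settings' and
    'Extraction Log' sections (hypothetical helper)."""
    # A section heading is a known title line followed by a line of dashes
    heading = re.compile(
        r"^(Drive & Settings|Extraction Log)[ \t]*\n-+[ \t]*$",
        re.MULTILINE)
    matches = list(heading.finditer(log_text))
    sections = {}
    for i, match in enumerate(matches):
        # A section runs from the end of its heading to the next heading,
        # or to the end of the log for the last section
        start = match.end()
        end = matches[i + 1].start() if i + 1 < len(matches) else len(log_text)
        sections[match.group(1)] = log_text[start:end].strip()
    return sections
```

Each of the two resulting strings could then be written to its own element. Note that the closing summary line ("7 Tracks Ripped Securely") ends up in the 'Extraction Log' section with this approach.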

PREMIS in METS examples:

https://www.loc.gov/standards/premis/examples.html

LoC Guidelines for PREMIS in METS:

https://www.loc.gov/standards/premis/guidelines2017-premismets.pdf

From BnF audio example:

<digiprovMD ID="AMD.2">
    <mdWrap MIMETYPE="text/xml" MDTYPE="PREMIS:EVENT">
        <xmlData>
            <premis:event>
                <premis:eventIdentifier>
                    <premis:eventIdentifierType>UUID</premis:eventIdentifierType>
                    <premis:eventIdentifierValue>2bb30520-bd2f-11e0-96c1-00144f68e7e0</premis:eventIdentifierValue>
                </premis:eventIdentifier>
                <premis:eventType>packageCreation</premis:eventType>
                <premis:eventDateTime>2011-08-02T19:45:11.542+02:00</premis:eventDateTime>
                <premis:eventDetail>Création d'un paquet compatible avec SPAR</premis:eventDetail>
                <premis:linkingAgentIdentifier>
                    <premis:linkingAgentIdentifierType>BnFApplication</premis:linkingAgentIdentifierType>
                    <premis:linkingAgentIdentifierValue>info:bnf/spar/agent/preingest_fil_aud_b_act_218</premis:linkingAgentIdentifierValue>
                    <premis:linkingAgentRole>performer</premis:linkingAgentRole>
                </premis:linkingAgentIdentifier>
                <premis:linkingAgentIdentifier>
                    <premis:linkingAgentIdentifierType>producerIdentifier</premis:linkingAgentIdentifierType>
                    <premis:linkingAgentIdentifierValue>AUD</premis:linkingAgentIdentifierValue>
                    <premis:linkingAgentRole>issuer</premis:linkingAgentRole>
                </premis:linkingAgentIdentifier>
                <premis:linkingAgentIdentifier>
                    <premis:linkingAgentIdentifierType>channelIdentifier</premis:linkingAgentIdentifierType>
                    <premis:linkingAgentIdentifierValue>info:bnf/spar/context/fil_aud_b</premis:linkingAgentIdentifierValue>
                    <premis:linkingAgentRole>authorizer</premis:linkingAgentRole>
                </premis:linkingAgentIdentifier>
                <premis:linkingObjectIdentifier>
                    <premis:linkingObjectIdentifierType>productionIdentifier</premis:linkingObjectIdentifierType>
                    <premis:linkingObjectIdentifierValue>157941</premis:linkingObjectIdentifierValue>
                </premis:linkingObjectIdentifier>
            </premis:event>
        </xmlData>
    </mdWrap>
</digiprovMD>

kieranjol commented 7 years ago

Interested in the outcome of this. I've mostly been putting custom strings into eventDetail, but have been interested in seeing logs added. I guess eventDetailExtension is more for machine-readable XML logs? Will have to check.

bitsgalore commented 7 years ago

The contents of the logs are now written to the eventOutcomeDetailNote field, as of https://github.com/KBNLresearch/omSipCreator/commit/7ab36f88b28715b7a9814efce82a7c1ae65d46e2

TODO: this looks OK for the (very detailed) dBpoweramp logs, but less so for the IsoBuster logs. These contain only an error code, which is something one would normally write to eventOutcome.
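A rough sketch of what that distinction could look like, assuming PREMIS 3 (namespace `http://www.loc.gov/premis/v3`) and Python's standard `xml.etree.ElementTree`. The function name `outcome_information` is hypothetical and this is not the code from the commit above. The status code goes to eventOutcome, and a detailed log (if there is one) goes to eventOutcomeDetailNote:

```python
import xml.etree.ElementTree as ET

# Assumption: PREMIS version 3 namespace
PREMIS_NS = "http://www.loc.gov/premis/v3"
ET.register_namespace("premis", PREMIS_NS)


def outcome_information(status_code, log_text=None):
    """Build a premis:eventOutcomeInformation element (hypothetical helper)."""
    def tag(name):
        return "{%s}%s" % (PREMIS_NS, name)

    info = ET.Element(tag("eventOutcomeInformation"))
    # Status / error code (e.g. from the IsoBuster log) goes to eventOutcome
    ET.SubElement(info, tag("eventOutcome")).text = str(status_code)
    if log_text:
        # Detailed log text (e.g. from dBpoweramp) goes to eventOutcomeDetailNote
        detail = ET.SubElement(info, tag("eventOutcomeDetail"))
        ET.SubElement(detail, tag("eventOutcomeDetailNote")).text = log_text
    return info
```

For an IsoBuster event one would call this with only the status code; for dBpoweramp with the log text as well. How the status code values should be interpreted is left open here.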

bitsgalore commented 6 years ago

UPDATE: the IsoBuster DFXML report (now stored as a file-level techMD section) also contains some fields that are really event metadata. E.g.:

 <dfxml:creator>
 <dfxml:program>IsoBuster</dfxml:program>
 <dfxml:version>4.1.0.14</dfxml:version>
 <dfxml:execution_environment>
 <dfxml:start_time>2018-02-06T11:12:15</dfxml:start_time><!--GMT-->
 <dfxml:os_version>Windows 7 (2.6.1.7601)</dfxml:os_version>
 <dfxml:username>jkn010</dfxml:username>
 </dfxml:execution_environment>
 </dfxml:creator>

 <dfxml:source>
 <dfxml:device_model>TEAC  DV-W5600S</dfxml:device_model>
 <dfxml:image_filename/>
 <dfxml:image_size/>
 <dfxml:sectorsize>2048</dfxml:sectorsize>
 <dfxml:devicesectors coding="base10">110225</dfxml:devicesectors>
 </dfxml:source>

So maybe extract/copy these fields over to the PREMIS event in digiprovMD?
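A minimal sketch of the extraction step, assuming the report has been parsed with `xml.etree.ElementTree`. Elements are matched on their local names, so the sketch does not depend on the exact DFXML namespace URI that IsoBuster declares. The function names and the selection of fields are assumptions for illustration only:

```python
import xml.etree.ElementTree as ET

# Fields from the creator and source sections that describe the imaging event
EVENT_FIELDS = {"program", "version", "start_time", "os_version",
                "username", "device_model"}


def local_name(tag):
    """Strip the '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def dfxml_event_fields(dfxml_root):
    """Collect event-related values from the creator and source
    sections of a DFXML report (hypothetical helper)."""
    fields = {}
    for section in dfxml_root.iter():
        if not isinstance(section.tag, str):
            continue  # skip comments / processing instructions
        if local_name(section.tag) not in ("creator", "source"):
            continue
        for elem in section.iter():
            if not isinstance(elem.tag, str):
                continue
            name = local_name(elem.tag)
            text = (elem.text or "").strip()
            if name in EVENT_FIELDS and text:
                # Keep the first occurrence of each field
                fields.setdefault(name, text)
    return fields
```

The resulting dictionary could then be used to fill in the PREMIS event. Which PREMIS semantic unit each field should map to is still an open question.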