ffdev-info / signature-development-utility

PRONOM, DROID Signature Development Utility source code.
http://ffdev.info
Apache License 2.0

Sequences not working with DROID #3

Open ross-spencer opened 4 years ago

ross-spencer commented 4 years ago

It looks like there is still a mismatch between the sequence syntax that can be interpreted from a DROID signature file and from a DROID container signature file.

There is a related issue here: https://github.com/digital-preservation/droid/issues/237

This XML should work, but it doesn't, and the syntax it relies on is certainly an unpublished specification from TNA.

<?xml version="1.0" encoding="UTF-8"?>
<FFSignatureFile xmlns="http://www.nationalarchives.gov.uk/pronom/SignatureFile" Version="1" DateCreated="2020-09-19T21:29:09">
 <InternalSignatureCollection>
  <InternalSignature ID="2" Specificity="Specific">
   <ByteSequence Reference="BOFoffset">
    <SubSequence Position="1" MinFragLength="0" SubSeqMinOffset="0" SubSeqMaxOffset="4">
     <Sequence>504B0304</Sequence>
    </SubSequence>
   </ByteSequence>
   <ByteSequence Reference="EOFoffset">
    <SubSequence Position="1" MinFragLength="0" SubSeqMinOffset="0" SubSeqMaxOffset="4">
     <Sequence>504B01{43-65531}504B0506{18-65531}</Sequence>
    </SubSequence>
   </ByteSequence>
  </InternalSignature>
  <InternalSignature ID="3" Specificity="Specific">
   <ByteSequence Reference="BOFoffset">
    <SubSequence Position="1" MinFragLength="0" SubSeqMinOffset="0" SubSeqMaxOffset="0">
     <Sequence>504B0304{26}5B436F6E74656E745F54797065735D2E786D6C20A2*504B0102*504B0506</Sequence>
    </SubSequence>
   </ByteSequence>
  </InternalSignature>
  <InternalSignature ID="4" Specificity="Specific">
   <ByteSequence Reference="BOFoffset">
    <SubSequence Position="1" MinFragLength="0" SubSeqMinOffset="0" SubSeqMaxOffset="0">
     <Sequence>D0CF11E0A1B11AE1{20}FEFF</Sequence>
    </SubSequence>
   </ByteSequence>
  </InternalSignature>
 </InternalSignatureCollection>
 <FileFormatCollection>
  <FileFormat ID="1" Name="Development Signature" PUID="dev/1" Version="1.0" MIMEType="application/octet-stream">
   <Extension>ext</Extension>
  </FileFormat>
  <FileFormat ID="2" Name="ZIP Format" PUID="x-fmt/263" Version="" MIMEType="application/zip">
   <InternalSignatureID>2</InternalSignatureID>
   <Extension>zip</Extension>
  </FileFormat>
  <FileFormat ID="3" Name="Microsoft Office Open XML" PUID="fmt/189" Version="" MIMEType="application/octet-stream">
   <InternalSignatureID>3</InternalSignatureID>
  </FileFormat>
  <FileFormat ID="4" Name="OLE2 Compound Document Format" PUID="fmt/111" Version="" MIMEType="application/octet-stream">
   <InternalSignatureID>4</InternalSignatureID>
  </FileFormat>
 </FileFormatCollection>
</FFSignatureFile>
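
To check what a file like the one above actually declares before handing it to DROID or Siegfried, something along these lines might help. It is only a sketch using Python's standard `xml.etree.ElementTree`; the function name `sequences_by_puid` and the cut-down `SAMPLE` document are my own illustrations, not part of the utility.

```python
import xml.etree.ElementTree as ET

# Namespace used by the signature file; the "sf" prefix is an arbitrary local alias.
NS = {"sf": "http://www.nationalarchives.gov.uk/pronom/SignatureFile"}

# Cut-down sample holding a single signature, for illustration only.
SAMPLE = """<FFSignatureFile xmlns="http://www.nationalarchives.gov.uk/pronom/SignatureFile" Version="1">
 <InternalSignatureCollection>
  <InternalSignature ID="4" Specificity="Specific">
   <ByteSequence Reference="BOFoffset">
    <SubSequence Position="1" MinFragLength="0" SubSeqMinOffset="0" SubSeqMaxOffset="0">
     <Sequence>D0CF11E0A1B11AE1{20}FEFF</Sequence>
    </SubSequence>
   </ByteSequence>
  </InternalSignature>
 </InternalSignatureCollection>
 <FileFormatCollection>
  <FileFormat ID="4" Name="OLE2 Compound Document Format" PUID="fmt/111" Version="" MIMEType="application/octet-stream">
   <InternalSignatureID>4</InternalSignatureID>
  </FileFormat>
 </FileFormatCollection>
</FFSignatureFile>"""


def sequences_by_puid(xml_text):
    """Map each PUID to the raw <Sequence> strings of its internal signatures."""
    root = ET.fromstring(xml_text)

    # Index every internal signature ID to the sequences it contains.
    by_id = {}
    for sig in root.iterfind(".//sf:InternalSignature", NS):
        by_id[sig.get("ID")] = [s.text for s in sig.iterfind(".//sf:Sequence", NS)]

    # Resolve each format's InternalSignatureID references against that index.
    result = {}
    for fmt in root.iterfind(".//sf:FileFormat", NS):
        puid = fmt.get("PUID", "").strip()  # tolerate stray whitespace
        seqs = []
        for ref in fmt.iterfind("sf:InternalSignatureID", NS):
            seqs.extend(by_id.get(ref.text, []))
        result[puid] = seqs
    return result


print(sequences_by_puid(SAMPLE))
```

Formats with no `InternalSignatureID` (such as the `dev/1` entry) come back with an empty list, which makes it easy to spot entries that will only ever match by extension.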

To build the current development signatures in Siegfried we can do the following:

./roy build -droid development-signature-dev-1.xml -noreports -container container-signature-20200918.xml

The signatures will work there.

Notes on patterns

ross-spencer commented 4 years ago

This is fixed for now by wiring in the PHP version of the code here, which is pretty graceful but adds complexity we don't need. But it works.

ross-spencer commented 3 years ago

I'll probably need to update this blog.

ross-spencer commented 2 years ago

NB. To be clear, this was always a mistaken view on my part. It may however be good if the signature file syntax accepted by DROID was simplified.

ross-spencer commented 2 years ago

NB. It looks like my syntax may be wrong so the following needs to be tried and tested:

  <InternalSignature ID="3" Specificity="Specific">
   <ByteSequence Reference="BOFoffset" Sequence="04??[01:0C][01:1F]{28}([41:5A]|[61:7A]){10}(43|44|46|4C|4E)"/>
  </InternalSignature>
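
As a quick sanity check on a sequence before trying it in DROID, a rough tokenizer can at least flag characters that don't fit the pattern syntax. This is a sketch only: it covers just the constructs that appear in this thread (hex bytes, `??`, `*`, `{n}` and `{n-m}`, `[a:b]` ranges and `(a|b)` alternatives), the token names are my own labels, and it is not DROID's parser, so passing here does not mean DROID will accept the sequence.

```python
import re

# One named alternative per construct seen in this thread.
TOKEN = re.compile(r"""
    (?P<gap>\{\d+(?:-\d+)?\})                      # {26} or {43-65531}
  | (?P<any_gap>\*)                                # *
  | (?P<wildcard>\?\?)                             # ??
  | (?P<range>\[[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}\])   # [01:0C]
  | (?P<alternatives>\([^()]*\))                   # (43|44) or ([41:5A]|[61:7A])
  | (?P<byte>[0-9A-Fa-f]{2})                       # 50
""", re.VERBOSE)


def tokenize(sequence):
    """Split a sequence string into (kind, text) pairs, left to right."""
    tokens = []
    pos = 0
    while pos < len(sequence):
        match = TOKEN.match(sequence, pos)
        if match is None:
            raise ValueError(
                f"unrecognised syntax at offset {pos}: {sequence[pos:pos + 10]!r}"
            )
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for kind, text in tokenize("04??[01:0C][01:1F]{28}([41:5A]|[61:7A]){10}(43|44|46|4C|4E)"):
    print(f"{kind:13} {text}")
```

The contents of an alternatives group are kept as one opaque token rather than parsed further, which keeps the sketch short but means a malformed option inside the parentheses will not be caught.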