Closed gleporeNARA closed 1 year ago
Hi Greg, I suspect roy is right to reject this signature. I checked the DROID signature file and it contains no signatures with empty sequence elements. Did you try loading this extension file in DROID, and does it work there?
The good news is that there are other ways to achieve the outcome you want. I think the most idiomatic is simply to define two "Internal Signatures" for the format. There are many examples of this in PRONOM, e.g. fmt/1458.
I'm not sure if Ross's tool allows you to make a signature like this, but hand-crafted it would look like:
Hey! There's a corresponding issue for Ross' tool (https://github.com/exponential-decay/signature-development-utility/issues/26) and I agree the manual method works. I don't quite understand the error, however. Can you explain the "empty sequence elements" error? It seems like a simple either/or construction to me.
For my part, Greg, the translation from an input signature to an output signature is a pretty naive translation to code of the algorithm PRONOM defined back in the day. We don't do any validation etc. Here, it's just creating the sequences for an option sub-sequence and outputting them. In reality, though, you need an anchor, shown below as the placeholder ANCHORBYTES.
<ByteSequence Reference="BOFoffset">
  <SubSequence MinFragLength="0" Position="1" SubSeqMaxOffset="0" SubSeqMinOffset="0">
    <Sequence>ANCHORBYTES</Sequence>
    <DefaultShift>2</DefaultShift>
    <Shift Byte="00">1</Shift>
    <RightFragment MaxOffset="0" MinOffset="0" Position="1">58464952</RightFragment>
    <RightFragment MaxOffset="0" MinOffset="0" Position="1">52494658</RightFragment>
  </SubSequence>
</ByteSequence>
vs.
<ByteSequence Reference="BOFoffset">
  <SubSequence MinFragLength="0" Position="1" SubSeqMaxOffset="0" SubSeqMinOffset="0">
    <Sequence/> <!-- this is your empty sequence, I believe -->
    <DefaultShift>1</DefaultShift>
    <RightFragment MaxOffset="0" MinOffset="0" Position="1">58464952</RightFragment>
    <RightFragment MaxOffset="0" MinOffset="0" Position="1">52494658</RightFragment>
  </SubSequence>
</ByteSequence>
So we end up with <Sequence/> in the faulty one vs. <Sequence>ANCHORBYTES</Sequence>. So yeah, without something at the true beginning of the file, it starts to look like two signatures.
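Since the translation step does no validation, a small pre-flight check can catch the empty `<Sequence/>` before the file reaches roy. The following is a minimal sketch in Python, not part of roy or the signature development utility: `local_name` and `find_empty_sequences` are hypothetical helper names, and the hex value `41424344` is an arbitrary stand-in for the ANCHORBYTES placeholder. It assumes only that the file uses the `SubSequence`/`Sequence` element names shown above, with or without an XML namespace.

```python
import xml.etree.ElementTree as ET
from typing import List


def local_name(tag: str) -> str:
    """Drop any '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def find_empty_sequences(xml_text: str) -> List[int]:
    """Return 1-based positions (in document order) of SubSequence elements
    whose <Sequence> child is missing or contains no bytes."""
    root = ET.fromstring(xml_text)
    subsequences = [el for el in root.iter() if local_name(el.tag) == "SubSequence"]
    empty = []
    for index, sub in enumerate(subsequences, start=1):
        sequences = [child for child in sub if local_name(child.tag) == "Sequence"]
        if not sequences or not (sequences[0].text or "").strip():
            empty.append(index)
    return empty


# The faulty byte sequence from the thread: fragments only, no anchor.
FAULTY = """
<ByteSequence Reference="BOFoffset">
  <SubSequence MinFragLength="0" Position="1" SubSeqMaxOffset="0" SubSeqMinOffset="0">
    <Sequence/>
    <DefaultShift>1</DefaultShift>
    <RightFragment MaxOffset="0" MinOffset="0" Position="1">58464952</RightFragment>
    <RightFragment MaxOffset="0" MinOffset="0" Position="1">52494658</RightFragment>
  </SubSequence>
</ByteSequence>
"""

# Same structure with an anchor; 41424344 is an illustrative stand-in only.
ANCHORED = FAULTY.replace("<Sequence/>", "<Sequence>41424344</Sequence>")

if __name__ == "__main__":
    print("faulty:", find_empty_sequences(FAULTY))
    print("anchored:", find_empty_sequences(ANCHORED))
```

A non-empty result means at least one sub-sequence has nothing to anchor on, which is the condition behind the "empty sequence" parse error discussed here.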
I hope I can make that easier for you to define soon. I mentioned chatting to David, and I'm also thinking about workarounds.
I think it's really cool you're thinking of signatures like this btw. It's an intuitive idea to me.
Closing this one for now as it appears to be a syntax/PRONOM issue.
When attempting to use a PRONOM signature file that contains (58464952|52494658) as the signature (created with Ross' tool) I get a roy crash:
sudo roy build -extend Generic-RIFX-Container-1.0-signature-file.xml rifxgen.sig
2022/05/10 12:00:15 parse error dev/1: empty sequence
The above signature would match either RIFX or XFIR at the beginning of a file.
Not sure if this is an issue with roy or with ffdev.info, but having the ability to match multiple start sequences would be useful, especially for formats with both big and little endianness.
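To make the intended behaviour concrete, here is a minimal Python sketch of what the `(58464952|52494658)` pattern is meant to express: a file matches if it starts with either of two fixed four-byte strings, which is the same as testing two separate beginning-of-file signatures. The names `SIGNATURES` and `matches_bof` are hypothetical and only illustrate the logic; this is not how roy or DROID implement matching.

```python
# The two alternatives from the signature, decoded from hex.
RIFX = bytes.fromhex("52494658")  # ASCII "RIFX"
XFIR = bytes.fromhex("58464952")  # ASCII "XFIR"

# Two independent beginning-of-file signatures for the one format.
SIGNATURES = (RIFX, XFIR)


def matches_bof(data: bytes, signatures=SIGNATURES) -> bool:
    """Return True if the data starts with any one of the given signatures."""
    return any(data.startswith(sig) for sig in signatures)


if __name__ == "__main__":
    for sample in (b"RIFX\x00\x00\x00\x10", b"XFIR\x10\x00\x00\x00", b"RIFF\x10\x00\x00\x00"):
        print(sample[:4], matches_bof(sample))
```

Each alternative is a complete, self-anchored sequence here, so neither check depends on an empty leading sequence.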
Signature attached.
RIFX-big-and-little-1.0-signature-file.zip