exponential-decay / skeleton-test-suite-generator

DROID Skeleton Test Suite Generator (skeleton-test-suite-generator): Tool for the automated generation of digital objects based on the digital signatures documented in the PRONOM database maintained by The National Archives, UK. The skeleton-test-suite-generator fills a gap: the community requires a corpus of digital objects for the validation and evaluation of format identification tools and techniques. The tool should be used to complement a methodology whereby skeleton files are also generated manually by signature developers. It takes a signature specified for a digital object in PRONOM and constructs a digital object that will match its footprint. For more information, see the README.md associated with the project...
zlib License

Implement Bitmask [&xx] [!&xx] syntax post DROID 5.x #9

Open ross-spencer opened 6 years ago

ross-spencer commented 6 years ago

See: https://github.com/openpreserve/fido/issues/117

This will work for identifying HFS if we want to try it out:

 BOF Offset 1024: 4244{12}0003{6}[!&01]00
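
A minimal sketch of how a skeleton file for the signature above might be built, assuming `[&xx]` matches a byte `b` when `b & xx == xx` and `[!&xx]` negates that test (the linked discussions are the place to confirm this reading). The names `byte_for_bitmask` and `build_hfs_skeleton` are hypothetical and are not part of the skeleton-test-suite-generator.

```python
import re

# Assumed token shape: [&xx] or [!&xx], where xx is two hex digits.
BITMASK_TOKEN = re.compile(r"\[(!?)&([0-9A-Fa-f]{2})\]")


def byte_for_bitmask(token: str) -> int:
    """Return the lowest byte value that satisfies a bitmask token.

    Assumption: [&xx] matches when (byte & xx) == xx; [!&xx] matches
    when that test fails.
    """
    match = BITMASK_TOKEN.fullmatch(token)
    if match is None:
        raise ValueError(f"not a bitmask token: {token!r}")
    negated = match.group(1) == "!"
    mask = int(match.group(2), 16)
    for candidate in range(256):
        all_bits_set = (candidate & mask) == mask
        if all_bits_set != negated:
            return candidate
    # Only reachable for [!&00], which no byte can satisfy.
    raise ValueError(f"no byte satisfies {token!r}")


def build_hfs_skeleton(fill: int = 0x00) -> bytes:
    """Build bytes matching BOF Offset 1024: 4244{12}0003{6}[!&01]00."""
    parts = [
        bytes([fill]) * 1024,                 # padding up to offset 1024
        bytes.fromhex("4244"),                # fixed sequence
        bytes([fill]) * 12,                   # {12} wildcard gap
        bytes.fromhex("0003"),                # fixed sequence
        bytes([fill]) * 6,                    # {6} wildcard gap
        bytes([byte_for_bitmask("[!&01]")]),  # bitmasked byte
        bytes.fromhex("00"),                  # fixed sequence
    ]
    return b"".join(parts)
```

Calling `build_hfs_skeleton()` returns 1048 bytes, and writing them to a file should give a candidate skeleton to run through DROID.
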
ross-spencer commented 6 years ago

Notes:

Documented a little bit here: https://groups.google.com/forum/#!msg/droid-list/v4CHVddELaM/IhmBcN0Vk_oJ

Not implemented in the standard skeleton suite. This is problematic for testing: we need to mock a PRONOM XML file to implement and test it (HFS is a good candidate): https://github.com/exponential-decay/skeleton-test-suite-generator

Implemented in container skeleton suite (should be some copy and paste code here we can use): exponential-decay/skeleton-container-test-suite-generator#1

ross-spencer commented 6 years ago

@Dclipsham how does the PRONOM stored procedure deal with the bitmask signatures - have we used them outside of the container signature file yet?

Dclipsham commented 6 years ago

Hi Ross, to my knowledge we haven't used them previously, but I've tried it in DevTest and it handled it happily. Output from sig file is as follows:

`4244 3 2 1 0003 [!&01]00
    </InternalSignature>`

`1509
        <Extension>img</Extension>
    </FileFormat>`

This identifies the HFS files available on the OPF Corpus. Is this enough information or do you need something further from the routine itself?

David
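
For checking sample files by hand, a rough sketch of the same match in Python, assuming the fragments sit at the offsets implied by `BOF Offset 1024: 4244{12}0003{6}[!&01]00` and that `[!&01]` means the byte's lowest bit is not set. `matches_all_bitmask` and `looks_like_hfs` are hypothetical helper names; this is not how DROID or PRONOM evaluate the signature internally.

```python
def matches_all_bitmask(value: int, mask: int, negated: bool = False) -> bool:
    """Assumed semantics: [&xx] is (value & xx) == xx; [!&xx] inverts it."""
    result = (value & mask) == mask
    return not result if negated else result


def looks_like_hfs(data: bytes) -> bool:
    """Check data against 4244{12}0003{6}[!&01]00 anchored at BOF offset 1024."""
    if len(data) < 1048:
        return False
    return (
        data[1024:1026] == bytes.fromhex("4244")      # fixed sequence at 1024
        and data[1038:1040] == bytes.fromhex("0003")  # after the {12} gap
        and matches_all_bitmask(data[1046], 0x01, negated=True)  # after {6}
        and data[1047] == 0x00                        # trailing fixed byte
    )
```

Only the first 1048 bytes of a file are needed, so reading that much from each sample is enough for the check.
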

ross-spencer commented 6 years ago

Hi @Dclipsham

That’s perfect. Shouldn’t need anything else, it was more a ‘before I implement this I should see if PRONOM is going to be able to do it’ thing. Otherwise the effort spent might have been wasted.

I should be able to test this with the sig-dev utility and the sample files 👍

Thanks!


richardlehane commented 6 years ago

@Dclipsham if bitmasked signatures enter the PRONOM canon, it would be great if at some point Technical Paper 1 (https://www.nationalarchives.gov.uk/aboutapps/fileformat/pdf/automatic_format_identification.pdf) could be updated to reflect that. That document was a great resource for me & I'm sure for others. It is really nice to have an exhaustive reference doc rather than rely on reverse engineering/trial and error (though there is always going to be a bit of that too!)

ross-spencer commented 6 years ago

+1 Richard.

The technical paper is what I used to create the sigdev utility along with the patterns seen in PRONOM/DROID files. All the syntax documentation on the repository for that is directly from there.

Maybe the DROID wiki could be opened up to something like this on GitHub? I would be happy to contribute.
