exponential-decay / skeleton-test-suite-generator

DROID Skeleton Test Suite Generator (skeleton-test-suite-generator): A tool for the automated generation of digital objects based on the digital signatures documented in the PRONOM database maintained by The National Archives, UK. The skeleton-test-suite-generator fills a gap: the community needs a corpus of digital objects for validating and evaluating format identification tools and techniques. The tool should complement a methodology in which signature developers also generate skeleton files manually. It takes a signature specified for a digital object in PRONOM and constructs a digital object that will match its footprint. For more information, see the README.md associated with the project...
zlib License

Variable sequence output does not honour PRONOM documented offsets, e.g. fmt/161 (SIARD) #2

Closed by richardlehane 10 years ago

richardlehane commented 10 years ago

Hi Ross,

Thanks for looking into fmt/189 for me and following up with TNA. Hope you don't mind, but I've got another curly one...

The SIARD signature (http://www.nationalarchives.gov.uk/PRONOM/Format/proFormatSearch.aspx?status=detailReport&id=876&strPageToDisplay=signatures) contains a variable sequence beginning "786d". Your skeleton file has that sequence, but at offset 14. Unusually for a variable sequence (and perhaps illegally, depending on how you read "Technical Paper 1"), the PRONOM sequence has been given an offset of 1024. I'd interpret this to mean a minimum offset of 1024 from the BOF, making your skeleton file invalid.
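To make the interpretation above concrete, here is a rough Python sketch of the check I have in mind. The function names are hypothetical, it only looks at the first two bytes ("786d") rather than the full PRONOM sequence, and it assumes the offset is a minimum measured from the BOF, which is my reading rather than anything confirmed.

```python
def first_offset(data: bytes, sequence_hex: str) -> int:
    """Return the offset of the first occurrence of the sequence, or -1."""
    return data.find(bytes.fromhex(sequence_hex))


def honours_min_offset(data: bytes, sequence_hex: str, min_offset: int) -> bool:
    """True only if the sequence occurs at or after min_offset from the BOF.

    Assumption: the PRONOM offset is a minimum distance from the BOF.
    """
    # bytes.find accepts a start index, so the search begins at min_offset.
    return data.find(bytes.fromhex(sequence_hex), min_offset) != -1


# A stand-in for the current skeleton file: sequence at offset 14.
skeleton = b"\x00" * 14 + bytes.fromhex("786d") + b"\x00" * 2000
print(first_offset(skeleton, "786d"))
print(honours_min_offset(skeleton, "786d", 1024))
```

Under that reading, a file whose only occurrence of the sequence sits at offset 14 fails the check.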

Thanks in advance! Richard

ross-spencer commented 10 years ago

Hi Richard,

I think that's fixed now. It looks like the remnants of testing: I had left hard-coded offsets in the code for writing variable sequences. I now pass the min and max offset variables, as was originally envisioned, and I made some conservative changes earlier on in the code to make that more robust too. Looking at the code after a year, I think I need to go back and make a few more bits more robust, and I need some unit tests to test the output of the tool, especially for a signature like this one.
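As an illustration only (this is not the tool's actual code), the idea of passing the offsets through rather than hard-coding them could be sketched like this. The function name, the zero fill byte, and the treatment of both offsets as distances from the BOF are all assumptions made for the example.

```python
from typing import Optional


def write_variable_sequence(
    buffer: bytearray,
    sequence_hex: str,
    min_offset: int = 0,
    max_offset: Optional[int] = None,
    fill: int = 0x00,
) -> int:
    """Append the sequence so that it starts no earlier than min_offset.

    Hypothetical helper: offsets are assumed to be measured from the BOF.
    Returns the offset at which the sequence was written.
    """
    sequence = bytes.fromhex(sequence_hex)
    # Pad with filler bytes until the minimum offset is reached.
    if len(buffer) < min_offset:
        buffer.extend(bytes([fill]) * (min_offset - len(buffer)))
    start = len(buffer)
    # Refuse to write past the maximum offset, if one was supplied.
    if max_offset is not None and start > max_offset:
        raise ValueError(f"offset {start} exceeds max_offset {max_offset}")
    buffer.extend(sequence)
    return start


# Placeholder leading bytes, then the start of the variable sequence.
skeleton = bytearray(b"\x01\x02\x03\x04")
print(write_variable_sequence(skeleton, "786d", min_offset=1024))
```

A unit test for a signature like this one could then assert that the returned offset is at least the documented minimum.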

Let me know if you need me to output a new test suite and upload it somewhere, or if you're OK as is.

I'll submit a bug report to the DROID team about it ignoring the offset for variably positioned sequences, which seems to have been uncovered by this issue.

Many thanks,

Ross

PS. I'd be interested to hear more about how you're using the tool/suite if you have the time. Let me know if you need my work email address.