exponential-decay / skeleton-test-suite-generator

DROID Skeleton Test Suite Generator (skeleton-test-suite-generator): Tool for the automated generation of digital objects based on the digital signatures documented in the PRONOM database maintained by The National Archives, UK. The skeleton-test-suite-generator serves to fill the gap that exists whereby the community requires a corpus of digital objects for the validation and evaluation of format identification tools and techniques. The tool should be used to complement a methodology whereby skeleton files are also generated manually by signature developers. The tool takes a signature specified for a digital object in PRONOM and constructs a digital object that will match its footprint. For more information, see the README.md associated with the project...
zlib License

fmt/1157 #11

Open richardlehane opened 5 years ago

richardlehane commented 5 years ago

The new Folio Infobase File has overlapping beginning-of-file sequences:

image

Only the "Folio" bit is being generated in the test signature.

ross-spencer commented 5 years ago

Is this an error with PRONOM do you think @richardlehane?

richardlehane commented 5 years ago

It is a bit of an odd one, but I think it is intended: i.e. the signature looks for two sequences, "Folio" and "v200", which both need to be near the start of the file but can be in any order.

jcharlet commented 4 years ago

Confirmed with @Dclipsham. DROID testing failed on this file:

No results found for file: fmt-1157-signature-id-1539.nfo. Expected: fmt/1157

The skeleton suite hasn't created a conformant file for fmt/1157: https://www.nationalarchives.gov.uk/PRONOM/Format/proFormatSearch.aspx?status=detailReport&id=1967&strPageToDisplay=signatures

The file should contain the string 'v200', which equates to 0x76323030, but this is missing from the file available via Zenodo for v95: https://zenodo.org/record/3269467#.Xc1KGDP7SUk
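For anyone wanting to reproduce the check, here is a minimal sketch of testing a skeleton file for both sequences. The function name `has_both_bof_sequences` and the 64-byte `window` are assumptions for illustration only; the real offset range should be taken from the PRONOM signature, and this is not how DROID itself performs matching.

```python
# Sketch only: the window size is an assumed value, not the PRONOM offset range.
FOLIO = b"Folio"
V200 = bytes.fromhex("76323030")  # the bytes for the ASCII string 'v200'


def has_both_bof_sequences(data: bytes, window: int = 64) -> bool:
    """Return True if both sequences occur within the first `window` bytes.

    The order of the two sequences is deliberately not checked, since the
    signature allows either to come first.
    """
    head = data[:window]
    return FOLIO in head and V200 in head
```

To check a file on disk, something like `has_both_bof_sequences(open("fmt-1157-signature-id-1539.nfo", "rb").read())` should work.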

ross-spencer commented 4 years ago

Thanks Jeremie, et al. I should have done something with this but forgot about it... Not sure I'll have a fix for a few weeks, but let me know if it does help you to have it sooner or if you can live without for this release.

Dclipsham commented 4 years ago

Thanks Ross, it just came up as part of DROID testing. I've created a pair of files that contain the missing 'v200', so a fix certainly isn't urgent from our perspective. I'm happy to share these, though GitHub Issues doesn't like the file type for upload; let me know if you want them. As Richard points out, we can't be certain of the order in which 'Folio' and 'v200' will appear, so I expressed them both as distinct BOFs with an offset range set.
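As a rough illustration of what a pair of variant files could look like, the sketch below builds one byte string per ordering of the two sequences. The helper `build_variants`, the 4-byte gap, and the zero-byte filler are all hypothetical choices; they are not taken from the attached files or from the generator's code.

```python
# Sketch only: gap size and filler byte are assumed, not read from PRONOM.
FOLIO = b"Folio"
V200 = b"v200"


def build_variants(gap: int = 4, fill: bytes = b"\x00") -> dict:
    """Build one skeleton per ordering of the two BOF sequences."""
    padding = fill * gap
    return {
        "folio-first": FOLIO + padding + V200,
        "v200-first": V200 + padding + FOLIO,
    }


variants = build_variants()
```

Each value in `variants` could then be written to its own file and run through DROID to confirm that both orderings identify as fmt/1157.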

ross-spencer commented 4 years ago

Sounds good. If you zip them, they can be added here, and I can use them for reference in the fix!

I have of course been excitedly watching references to the suite in recent testing. Big changes. They look to be shaping up well.

Dclipsham commented 4 years ago

Great - learning something new :) thank you! fmt-1157-variants.zip

richardlehane commented 1 year ago

The latest release (v108) has a couple more signatures impacted by this: fmt/1739 and fmt/1757.

ross-spencer commented 1 year ago

Wondering if I fixed this for container signatures and whether this will work here: https://github.com/exponential-decay/skeleton-container-test-suite-generator/pull/14

richardlehane commented 1 year ago

fmt/1062 (https://www.nationalarchives.gov.uk/PRONOM/Format/proFormatSearch.aspx?status=detailReport&id=1868&strPageToDisplay=signatures) also seems to be impacted by this issue