richardlehane / siegfried

signature-based file format identification
http://www.itforarchivists.com/siegfried
Apache License 2.0

Identification of Broadcast Wave #140

Closed asciim0 closed 4 years ago

asciim0 commented 4 years ago

Hi,

I have Broadcast Wave files for which DROID returns, as expected, multiple fmt hits (fmt/142 and fmt/704), whereas Siegfried only returns fmt/142 (see output pasted below). It seems that DROID picks up on the bext chunk, whereas Siegfried doesn't.

The original file is a bit large to upload here, but I'm happy to provide it through other means if needed. I'm not sure yet whether this is a general problem; I'll add to the issue if I come across more.

Thanks, Micky


siegfried   : 1.8.0
scandate    : 2020-04-20T08:38:46+02:00
signature   : default.sig
created     : 2020-01-21T23:30:42+01:00
identifiers :

richardlehane commented 4 years ago

Hi Micky

Lovely to hear from you! I think I know what's going on here: because PRONOM doesn't define a priority relationship between fmt/142 and fmt/704, as soon as sf gets the fmt/142 result it quits, because there isn't anything defined as being stronger that it should wait for. I.e. sf applies the priority rules as it scans and uses them to quit early if it can. DROID, on the other hand, keeps scanning and applies the priority rules at the end.

You can change this behaviour by customising your signature file with roy. The command for this is: roy build -multi comprehensive (see here for docs). This will likely return more results than DROID does, as it will return all positive matches (e.g. generic wav too).

I don't yet have a flag to copy DROID's behaviour of doing a complete scan and then applying format priorities at the end, but I think it should be doable to add if it's a feature you'd like (in which case, leave the ticket open as a feature request).

It may be worth flagging this with the PRONOM team to see if they really intend to return multiple results here, or whether this is just a case of a priority relationship having been overlooked.

cheers Richard
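The three behaviours described above (quitting early, scanning fully and then applying priorities, and returning every positive match) can be sketched roughly as follows. This is only an illustrative model, not siegfried's or DROID's actual code: the priority table, the `generic-wav` placeholder ID, the order in which hits arrive, and all function names are assumptions made for the example.

```python
# Hypothetical priority table: each key is assumed to have priority over
# the formats in its value set. "generic-wav" is a made-up placeholder,
# not a real PRONOM identifier. Note that no relationship is defined
# between fmt/142 and fmt/704, mirroring the situation in this thread.
PRIORITIES = {
    "fmt/142": {"generic-wav"},
    "fmt/704": {"generic-wav"},
}

# Assumed order in which signatures match while scanning the file.
HITS = ["generic-wav", "fmt/142", "fmt/704"]


def stronger_than(fmt, priorities):
    """Return the set of formats defined as having priority over fmt."""
    return {winner for winner, losers in priorities.items() if fmt in losers}


def apply_priorities(hits, priorities):
    """Drop every hit that is superseded by another hit in the same list."""
    present = set(hits)
    return [h for h in hits if not (stronger_than(h, priorities) & present)]


def scan_early_exit(hits, priorities):
    """Stop at the first hit that nothing is defined as stronger than."""
    seen = []
    for fmt in hits:
        seen.append(fmt)
        if not stronger_than(fmt, priorities):
            break  # nothing stronger to wait for, so quit scanning
    return apply_priorities(seen, priorities)


def scan_then_prioritise(hits, priorities):
    """Collect every hit first, then apply the priority rules at the end."""
    return apply_priorities(list(hits), priorities)


def scan_comprehensive(hits):
    """Return all positive matches without applying any priorities."""
    return list(hits)


early = scan_early_exit(HITS, PRIORITIES)
full = scan_then_prioritise(HITS, PRIORITIES)
everything = scan_comprehensive(HITS)
print(early, full, everything)
```

Under these assumptions the early-exit scan stops at fmt/142, the scan-then-prioritise variant keeps both fmt/142 and fmt/704, and the comprehensive variant also keeps the generic match.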

asciim0 commented 4 years ago

Hey Richard, thanks for the quick reply! I think you are right about something being amiss with the PRONOM priority relationship. I assume the best thing to do is to close the issue here then? I'm not sure what effect your PRONOM tag has? Cheers, Micky

richardlehane commented 4 years ago

The PRONOM tag has no effect (other than categorisation for me!). Will close this, thanks Micky