richardlehane / siegfried

signature-based file format identification
http://www.itforarchivists.com/siegfried
Apache License 2.0

sf panics scanning fmt/1172 file of skeleton suite with freedesktop.org identifier #125

Closed richardlehane closed 5 years ago

richardlehane commented 5 years ago

Discovered by @ross-spencer. To reproduce: 1) use deluxe.sig or freedesktop.sig; 2) scan the fmt-1172 file within the skeleton suite. sf panics at line 456 of mimeinfo/identifier.go, in the applyscore method.

richardlehane commented 5 years ago

This may be because freedesktop.org has a duplicate entry for font/woff.
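
For illustration, here is a minimal Go sketch of how a repeated ID such as font/woff could be spotted in a list of MIME-type IDs. It is not siegfried's code: the function name `findDuplicateIDs` and the sample IDs are hypothetical, and it assumes the IDs have already been extracted from the mimeinfo XML.

```go
package main

import (
	"fmt"
	"sort"
)

// findDuplicateIDs is a hypothetical helper (not part of siegfried).
// It returns, in sorted order, every ID that occurs more than once in ids.
func findDuplicateIDs(ids []string) []string {
	counts := make(map[string]int, len(ids))
	for _, id := range ids {
		counts[id]++
	}
	var dups []string
	for id, n := range counts {
		if n > 1 {
			dups = append(dups, id)
		}
	}
	// Map iteration order is random, so sort for a stable result.
	sort.Strings(dups)
	return dups
}

func main() {
	// Sample IDs made up for this sketch; font/woff appears twice.
	ids := []string{"font/woff", "image/png", "font/woff", "text/plain"}
	if dups := findDuplicateIDs(ids); len(dups) > 0 {
		fmt.Printf("duplicate IDs: %v\n", dups)
	}
}
```

A check like this could run when a signature file is built, so that a duplicate is reported as an error up front instead of surfacing later as a panic during a scan.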

richardlehane commented 5 years ago

I've tested a fix on the develop branch that reports an error when a mimeinfo signature file contains duplicate IDs. In doing so, I discovered that both the freedesktop.org.xml and tika-mimetypes.xml files have duplicate IDs. In both cases, these appear to be errors in those files: