richardlehane / siegfried

signature-based file format identification
http://www.itforarchivists.com/siegfried
Apache License 2.0

Panic: runtime error on .doc file #127

Closed VAIThomas closed 5 years ago

VAIThomas commented 5 years ago

Hi Richard,

A bit of a preface: I am not fully versed in digital archiving so my main experience with siegfried has been through the wiki and through trial and error. We are grabbing MD5 checksums from a digital archive to compare them with the original disk image, to check whether the archive has had any data loss. We use siegfried to get the file extensions and MD5 checksums.

I always include "-coe" to prevent siegfried from stopping when it encounters an error; however, with this file the runtime panic stops siegfried anyway.

PS S:\CK_20190128\CK_LESOPDRACHTEN\2009-2010\ZER\Bijlagen Onderwerp 2> sf ".\Bijlage 2.26 - Checklist Eerste College.doc"
panic: runtime error: index out of range

goroutine 5 [running]:
github.com/richardlehane/siegfried/internal/containermatcher.(*ContainerMatcher).processHits(0x118bf6c0, 0x118efea0, 0x1, 0x1, 0x11909560, 0x118788d0, 0x11960720, 0x13, 0x11847300, 0x1)
        c:/gopath/src/github.com/richardlehane/siegfried/internal/containermatcher/identify.go:228 +0x57e
github.com/richardlehane/siegfried/internal/containermatcher.(*ContainerMatcher).identify(0x118bf6c0, 0x11870060, 0x2d, 0x7a5780, 0x118efe80, 0x11847300, 0x1191e340, 0x1, 0x1)
        c:/gopath/src/github.com/richardlehane/siegfried/internal/containermatcher/identify.go:145 +0x1e4
created by github.com/richardlehane/siegfried/internal/containermatcher.Matcher.Identify
        c:/gopath/src/github.com/richardlehane/siegfried/internal/containermatcher/identify.go:43 +0x1b5

Besides this file, the folder also contains a .cpgz file, but that file isn't the one being analysed, so it is presumably not the one causing the error. I have attached both files in a .zip: Bijlagen_Onderwerp_2.zip

Thank you for your time.

richardlehane commented 5 years ago

Thanks for reporting this @VAIarchief and welcome to GitHub! Superficially, this looks like the same issue as #126. I will confirm and let you know. Hopefully that's the case, as I have a fix in progress for that.

All the best,
Richard

richardlehane commented 5 years ago

Hi @VAIarchief, just to confirm: this is the same issue as #126. I've developed a fix that's working on the develop branch. I hope to be able to release it either later this week or over the coming weekend.

All the best,
Richard

richardlehane commented 5 years ago

Hi @VAIarchief - this is now fixed in sf v1.7.12. Thanks again for the bug report.

Richard
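For readers wondering why "-coe" did not help with this file: the trace above shows the failure as a Go panic raised inside a goroutine, not as an ordinary returned error. In Go, an unrecovered panic in any goroutine terminates the whole program, so a continue-on-error loop never gets the chance to skip the file. The sketch below illustrates that general behaviour only. It is not siegfried's code and says nothing about how the v1.7.12 fix works; `identify`, `safeIdentify`, the file names, and the `fmt/40` result are all hypothetical placeholders.

```go
package main

import (
	"errors"
	"fmt"
)

// identify is a hypothetical stand-in for a per-file matcher (NOT siegfried's API).
// It returns an ordinary error for one input and indexes past the end of a
// slice for another, which triggers an "index out of range" runtime panic.
func identify(name string) (string, error) {
	hits := []string{"fmt/40"} // illustrative identifier only
	switch name {
	case "missing.doc":
		return "", errors.New("file not found") // ordinary, returnable error
	case "bad.doc":
		idx := len(hits)      // one past the last valid index
		return hits[idx], nil // panics at run time
	}
	return hits[0], nil
}

type result struct {
	id  string
	err error
}

// safeIdentify runs identify in its own goroutine and converts a panic into
// an error. recover only has an effect inside a deferred function running on
// the goroutine that panicked, so the defer has to live in that goroutine.
func safeIdentify(name string) (string, error) {
	ch := make(chan result, 1)
	go func() {
		defer func() {
			if r := recover(); r != nil {
				ch <- result{err: fmt.Errorf("recovered panic on %s: %v", name, r)}
			}
		}()
		id, err := identify(name)
		ch <- result{id, err}
	}()
	res := <-ch
	return res.id, res.err
}

func main() {
	for _, name := range []string{"ok.doc", "missing.doc", "bad.doc"} {
		id, err := safeIdentify(name)
		if err != nil {
			fmt.Println("skipping:", err) // continue on error
			continue
		}
		fmt.Println(name, "->", id)
	}
}
```

Without the deferred `recover`, the call for "bad.doc" would end the program with a goroutine trace much like the one in the report, regardless of how the calling loop handles returned errors.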