richardlehane / siegfried

signature-based file format identification
http://www.itforarchivists.com/siegfried
Apache License 2.0

New version does not recognize a certain pdf file #156

Closed (nvercamm closed this issue 3 years ago)

nvercamm commented 3 years ago

Analysing the attached PDF file results in "unknown format": TF 32 - Tuinschermen.pdf

λ siegfried\sf.exe "c:\Users\xxx\Downloads\TF 32 - Tuinschermen.pdf"

siegfried   : 1.9.1
scandate    : 2021-01-07T15:13:33+01:00
signature   : default.sig
created     : 2020-10-06T19:13:40+02:00
identifiers :
  - name    : 'pronom'
    details : 'DROID_SignatureFile_V97.xml; container-signature-20201001.xml'

filename : 'c:\Users\Nick\Downloads\TF 32 - Tuinschermen.pdf'
filesize : 2736787
modified : 2021-01-07T13:07:41+01:00
errors   :
matches  :
  - ns      : 'pronom'
    id      : 'UNKNOWN'
    format  :
    version :
    mime    :
    basis   :
    warning : 'no match; possibilities based on extension are fmt/14, fmt/15, fmt/16, fmt/17, fmt/18, fmt/19, fmt/20, fmt/95, fmt/144, fmt/145, fmt/146, fmt/147, fmt/148, fmt/157, fmt/158, fmt/276, fmt/354, fmt/476, fmt/477, fmt/478, fmt/479, fmt/480, fmt/481, fmt/488, fmt/489, fmt/490, fmt/491, fmt/492, fmt/493, fmt/558, fmt/559, fmt/560, fmt/561, fmt/562, fmt/563, fmt/564, fmt/565, fmt/1129'

richardlehane commented 3 years ago

Hi Nick, something seems to be off with the file. Opening it in a hex editor, I just see a lot of zero bytes.
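
For anyone who wants to run the same sanity check without a hex editor, here is a minimal sketch in Python. It is not part of siegfried; the function names `zero_byte_ratio` and `looks_like_pdf` are hypothetical, and the 1024-byte search window for the `%PDF-` marker is an assumption based on how lenient PDF readers typically behave, not on the PRONOM signatures siegfried uses.

```python
import sys
from pathlib import Path


def zero_byte_ratio(data: bytes) -> float:
    """Return the fraction of bytes equal to 0x00 (0.0 for empty input)."""
    if not data:
        return 0.0
    return data.count(0) / len(data)


def looks_like_pdf(data: bytes, search_window: int = 1024) -> bool:
    """Check whether the '%PDF-' marker appears near the start of the data.

    The window size is an assumption for illustration only.
    """
    return b"%PDF-" in data[:search_window]


def describe(path: str) -> str:
    """Summarise size, zero-byte share, and header presence for one file."""
    data = Path(path).read_bytes()
    return (
        f"{path}: {len(data)} bytes, "
        f"{zero_byte_ratio(data):.1%} zero bytes, "
        f"PDF header present: {looks_like_pdf(data)}"
    )


if __name__ == "__main__":
    # Usage: python check_pdf.py "TF 32 - Tuinschermen.pdf"
    for arg in sys.argv[1:]:
        print(describe(arg))
```

A file that is mostly zero bytes and has no `%PDF-` marker was most likely damaged in transfer or storage, in which case an "UNKNOWN" result is the expected outcome rather than a regression.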

nvercamm commented 3 years ago

Hey Richard,

You are right, something went wrong with those files.

Nick

richardlehane commented 3 years ago

Closing because the PDFs were malformed!