mstavrev closed 5 years ago
It's failing because the code is looking for FD FF FF FF as the subheader (at offset 0x200), whereas your sample has 0F 00 E8 03. There appear to be a few possible subheaders for PowerPoint files.
One possibility is to change the class to look for all the different possible identifiers and return a match if any of them is found. I'm also looking into a more thorough implementation of the Compound Binary File format so that the CLSID can be retrieved to fix #7, which will hopefully be a better long-term solution.
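A rough sketch of the "match any identifier" idea, written in Python purely for illustration (it is not the library's code, and the names `PPT_SUBHEADERS` and `matches_ppt_subheader` are made up). The list holds only the two subheaders mentioned in this thread, so treat it as incomplete:

```python
# Offset of the subheader discussed above.
SUBHEADER_OFFSET = 0x200

# Only the two byte sequences named in this thread; other PPT
# variants may exist and would need to be added here.
PPT_SUBHEADERS = (
    bytes.fromhex("FDFFFFFF"),
    bytes.fromhex("0F00E803"),
)


def matches_ppt_subheader(data: bytes) -> bool:
    """Return True if any known subheader appears at offset 0x200."""
    candidate = data[SUBHEADER_OFFSET:SUBHEADER_OFFSET + 4]
    # A short or truncated file yields fewer than 4 bytes and never matches.
    return candidate in PPT_SUBHEADERS
```

This keeps the existing approach and only widens the set of accepted values, so it would still miss any variant that is not in the list.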
Thanks for the update.
Looks like you've listed (on #7) all of the PPT byte sequences that I can also find on this list: https://www.garykessler.net/library/file_sigs.html
It would be best to first detect the MS-CFB format (the header is at offset 0, and it is long and distinctive enough that false positives are unlikely), then look for the CLSID of PowerPoint, which, if I understand correctly, should be {64818d10-4f9b-11cf-86ea-00aa00b929e8}. I followed the information available here http://fileformats.archiveteam.org/wiki/Microsoft_Compound_File and I did locate this CLSID, at least in my file.
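To make the two-step idea concrete, here is a minimal Python sketch (not the library's implementation; `root_clsid` and `looks_like_ppt` are hypothetical names). It assumes the field offsets from the MS-CFB layout as I read it: sector shift at header offset 30, first directory sector at offset 48, and the CLSID at offset 80 of the 128-byte root directory entry. It reads only the first directory entry and does not walk the FAT.

```python
import struct
import uuid

CFB_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")
# CLSID quoted in the comment above.
PPT_CLSID = uuid.UUID("64818d10-4f9b-11cf-86ea-00aa00b929e8")


def root_clsid(data: bytes):
    """Return the root storage CLSID of a CFB file, or None if not CFB."""
    # Step 1: check the 8-byte signature at offset 0.
    if len(data) < 512 or data[:8] != CFB_SIGNATURE:
        return None
    # Step 2: locate the first directory sector.
    (sector_shift,) = struct.unpack_from("<H", data, 30)
    (first_dir_sector,) = struct.unpack_from("<I", data, 48)
    # Sector N starts at (N + 1) * sector_size; the header fills slot 0.
    root_offset = (first_dir_sector + 1) << sector_shift
    entry = data[root_offset:root_offset + 128]
    # Object type 5 marks the root storage entry.
    if len(entry) < 128 or entry[66] != 5:
        return None
    # GUIDs are stored with their first three fields little-endian.
    return uuid.UUID(bytes_le=bytes(entry[80:96]))


def looks_like_ppt(data: bytes) -> bool:
    return root_clsid(data) == PPT_CLSID
```

Something like `looks_like_ppt(open("sample.ppt", "rb").read())` would be the intended use. Whether every PowerPoint version writes this exact CLSID is an assumption worth checking against the old sample files.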
If you follow the link for the PPT on that wiki, you can also get a collection of old PPT files that can be used for additional testing: https://web.archive.org/web/20020313074855/http://ftp.sunet.se/pub/Internet-documents/isoc/charts/presentations/
Cheers
Hey, I've implemented the CFB format and rewritten all the legacy Office types to use it instead of the subheader check. It now correctly identifies your attached sample.
I've published a prerelease version to NuGet. If all looks good, I'll push a final release in the next day or so.
Thanks for the update. I've updated to 2.0.0-rc and can now see the library correctly identifying the problematic file. I also did a few quick tests with Excel and Word documents saved in the old format, and they work as expected too.
I am attaching a PowerPoint (PPT) presentation file that for some reason is not identified by the library.
testPPT_mit.zip
It is a simple presentation, created to test the functionality of the library. I noticed that a presentation with just two slides is identified fine.