Closed Michel20367 closed 6 years ago
Hey, thanks for the detailed bug report. I'll do some investigation and implement a fix.
Hi, I have looked into the PowerPoint file a bit further. At offset 0x480 it has the byte sequence "PowerPoint Document" (0x50,0x00,0x6f,0x00,0x77,0x00,0x65,0x00,0x72,0x00,0x50,0x00,0x6f,0x00,0x69,0x00,0x6e,0x00,0x74,0x00,0x20,0x00,0x44,0x00,0x6f,0x00,0x63,0x00,0x75,0x00,0x6d,0x00,0x65,0x00,0x6e,0x00,0x74,0x00). At offset 0x200 the sequence is "ýÿÿÿ§" (0xFD, 0xFF, 0xFF, 0xFF, 0xFF, 0xA7), but I'm not sure whether it is present in all files. There is another variant of the ppt file: if it was created in one of the old Office versions, it has the sequence ".n.ð" (0x00, 0x6E, 0x1E, 0xF0) at offset 0x200. An example can be found in the attachment: pptx-test_comp.zip. Useful link: https://www.garykessler.net/library/file_sigs.html I hope this helps.
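The byte sequence quoted above is the stream name "PowerPoint Document" encoded as UTF-16LE. A minimal Python sketch of searching for it follows; the function name `find_stream_name` is hypothetical, and a plain byte search is only an illustration, not necessarily how the library does it.

```python
def find_stream_name(data: bytes, name: str = "PowerPoint Document") -> int:
    """Return the offset of the UTF-16LE encoded name in data, or -1 if absent."""
    # UTF-16LE stores each ASCII character as its byte followed by 0x00,
    # which is the 0x50,0x00,0x6f,0x00,... pattern quoted in the comment.
    return data.find(name.encode("utf-16-le"))
```

Because the position of this name varies between files, a search like this is better suited to inspecting sample files than to detection at a fixed offset.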
It looks as though there are a few variations for the PowerPoint subheader, which are:
00 6E 1E F0
0F 00 E8 03
A0 46 1D F0
FD FF FF FF 0E 00 00 00
FD FF FF FF 1C 00 00 00
FD FF FF FF 43 00 00 00
I'm catching the last three with FD FF FF FF; I assume the first three are from an older version of PowerPoint. Most likely I'll change the PowerpointLegacy class to catch all of the different signatures if I can't determine the individual formats.
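As a rough illustration of matching the subheader variants listed above, here is a Python sketch. It assumes the subheader sits at offset 0x200, as reported in this thread; the names `matches_ppt_subheader`, `FIXED_SIGNATURES` and `FD_PREFIX` are made up for the example and are not part of the library.

```python
SUBHEADER_OFFSET = 0x200  # offset reported in this thread (assumption)

# The three fixed four-byte variants from the list above.
FIXED_SIGNATURES = (
    bytes.fromhex("006E1EF0"),
    bytes.fromhex("0F00E803"),
    bytes.fromhex("A0461DF0"),
)
# Common prefix of the three FD FF FF FF xx 00 00 00 variants.
FD_PREFIX = bytes.fromhex("FDFFFFFF")


def matches_ppt_subheader(data: bytes) -> bool:
    """Return True if the bytes at 0x200 match one of the listed variants."""
    sub = data[SUBHEADER_OFFSET:SUBHEADER_OFFSET + 4]
    return sub in FIXED_SIGNATURES or sub == FD_PREFIX
```

Note that the FD FF FF FF prefix is not unique to PowerPoint: the .msg file in the original report carries the same bytes at 0x200, so a check like this can only ever be a first filter.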
I've implemented the Compound File Binary (CFB) format (or at least the header part), which should solve this issue. The root entry can be located at different positions in the file, and we can find it by reading the CFB header and checking the first directory sector location. Once we have the root entry, we can read its object type CLSID, which allows us to determine the type of file.
For some reason, when saving a message from Outlook the first directory sector location is set to 1, but when using drag-and-drop it is set to 2. No idea why Outlook saves the files like that, but checking the CFB header allows for both cases to be handled :)
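The lookup described above can be sketched in Python as follows. This is not the library's code: the function name `root_entry_clsid` is hypothetical, and the field offsets (sector shift at 0x1E, first directory sector at 0x30, CLSID at 0x50 within the 128-byte directory entry) are taken from the published CFB layout.

```python
import struct
import uuid
from typing import Optional

CFB_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def root_entry_clsid(data: bytes) -> Optional[uuid.UUID]:
    """Return the CLSID stored in the root directory entry, or None."""
    if len(data) < 512 or data[:8] != CFB_MAGIC:
        return None
    # Sector shift: 16-bit value at header offset 0x1E (9 means 512-byte sectors).
    (sector_shift,) = struct.unpack_from("<H", data, 0x1E)
    # First directory sector location: 32-bit value at header offset 0x30.
    (first_dir_sector,) = struct.unpack_from("<I", data, 0x30)
    sector_size = 1 << sector_shift
    # The header occupies the first sector-sized block, so sector n
    # begins at (n + 1) * sector_size.
    root_offset = (first_dir_sector + 1) * sector_size
    entry = data[root_offset:root_offset + 128]
    if len(entry) < 128:
        return None
    # The root entry is the first directory entry; its CLSID is the
    # 16 bytes at offset 0x50, stored in little-endian GUID layout.
    return uuid.UUID(bytes_le=entry[0x50:0x60])
```

A caller would compare the returned value against the CLSIDs of the formats it knows about. Because the root entry offset is computed from the header, the same code handles files whose directory starts in different sectors.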
I've pushed a prerelease to NuGet, have a look and let me know if it works for you.
Thank you! I will check it this week.
I tested the 2.0 rc with my sample files. Everything is recognized correctly. Good work!
Thanks for checking! I'll publish the release version over the weekend.
If a .msg file is saved from Outlook via drag and drop, the Root Entry ( 0x52, 0x00, 0x6F, 0x00, 0x6F, 0x00, 0x74, 0x00, 0x20, 0x00, 0x45, 0x00, 0x6E, 0x00, 0x74, 0x00, 0x72, 0x00, 0x79, 0x00 ) is at offset 0x400 and not at 0x200 as usual. At offset 0x200 there is the sequence (0xFD, 0xFF, 0xFF, 0xFF), which you use to detect "application/vnd.ms-powerpoint" ("ppt") files. Consequently, such .msg files (see the example in the attachment) are incorrectly detected as "ppt" files.
Best Regards
Michael Ioshchikhes
Externe Telefonate.zip