Word / Excel 97-2003 files may be detected as MSDOC type and not WORD or EXCEL.
I ran into the issue by creating a blank Excel file in Excel 2007 and saving it as XLS (see blank.zip).
From what I could check, Office 97-2003 file signatures are based on "subheaders", and there may be several of them for each application, with no clear documentation. When the subheader is not one the library knows, the file is detected as the MSDOC type.
I would therefore suggest:
- Renaming the MSDOC type to MS_OFFICE, to be more accurate
- Adding the list of known MS Office extensions (at least doc, ppt, xls)
Something like:
// OLECF - Object Linking and Embedding (OLE) Compound File (CF)
// Compound Binary File format by Microsoft, used by Microsoft Office 97-2003 applications (Word, Powerpoint, Excel, Wizard)
public readonly static FileType MS_OFFICE = new FileType(new byte?[] { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 }, "doc,ppt,xls", "application/octet-stream");
Since this type is declared after the WORD and EXCEL types, detection would first try the subheader-based matches and fall back to this one only if no subheader matches.
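To illustrate the ordering argument, here is a minimal sketch of "specific types first, generic fallback last". It does not use the library's actual classes: `Signature`, `Detector` and `Detect` are hypothetical names made up for this example. The WORD and EXCEL subheader bytes and the 512-byte offset are the commonly cited values for these formats, and I have not checked them against what the library uses, so treat them as assumptions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the library's FileType; NOT its real API.
public sealed class Signature
{
    public string Name { get; }
    public byte[] Header { get; }
    public byte[] SubHeader { get; }      // null when the type has no subheader
    public int SubHeaderOffset { get; }

    public Signature(string name, byte[] header, byte[] subHeader = null, int subHeaderOffset = 512)
    {
        Name = name;
        Header = header;
        SubHeader = subHeader;
        SubHeaderOffset = subHeaderOffset;
    }

    // The header must match at offset 0; the subheader, if any, at its own offset.
    public bool Matches(byte[] data)
    {
        if (!MatchesAt(data, Header, 0)) return false;
        return SubHeader == null || MatchesAt(data, SubHeader, SubHeaderOffset);
    }

    private static bool MatchesAt(byte[] data, byte[] pattern, int offset)
    {
        if (data.Length < offset + pattern.Length) return false;
        for (int i = 0; i < pattern.Length; i++)
        {
            if (data[offset + i] != pattern[i]) return false;
        }
        return true;
    }
}

public static class Detector
{
    // OLECF header shared by all Office 97-2003 files (taken from the snippet above).
    private static readonly byte[] Olecf = { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    // Order matters: subheader-specific types first, the generic fallback last.
    // Subheader values below are assumptions (commonly cited, not verified here).
    public static readonly IReadOnlyList<Signature> Types = new[]
    {
        new Signature("WORD", Olecf, new byte[] { 0xEC, 0xA5, 0xC1, 0x00 }),
        new Signature("EXCEL", Olecf, new byte[] { 0x09, 0x08, 0x10, 0x00, 0x00, 0x06, 0x05, 0x00 }),
        new Signature("MS_OFFICE", Olecf),
    };

    // Returns the name of the first matching type, or null if nothing matches.
    public static string Detect(byte[] data)
    {
        foreach (var type in Types)
        {
            if (type.Matches(data)) return type.Name;
        }
        return null;
    }
}
```

With this ordering, a file whose subheader is recognised still comes back as WORD or EXCEL, while an OLECF file with an unknown subheader (such as the blank XLS from Excel 2007) comes back as MS_OFFICE instead of being mislabelled.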