php / php-src

The PHP Interpreter
https://www.php.net

finfo_file is returning the wrong mime type for macro laden Excel Binary files #14594

Closed by ond-danny 3 months ago

ond-danny commented 3 months ago

Description

finfo_file returns an incorrect MIME type for .xlsb files.

The following code:

<?php

    $path = 'some/path/excelbinary.xlsb';
    $finfo = finfo_open(FILEINFO_MIME_TYPE);
    $mimeType = finfo_file($finfo, $path);
    finfo_close($finfo);
    echo $mimeType;

Resulted in this output:

application/vnd.ms-excel.sheet.binary.macroEnabled.12

But I expected this output instead:

application/vnd.openxmlformats-officedocument.spreadsheetml.sheet

I'm not sure if this is the right place to report this. It does seem to be a security concern if a binary, macro-enabled file can be uploaded and interpreted as a plain Excel file...

PHP Version

PHP 8.1.26, PHP 8.1.29

Operating System

Windows 10, Windows 11, Unix (GitHub CI/CD)

damianwadley commented 3 months ago

This sounds like one of those times where MIME detection just isn't good enough: it attempts to make a guess based on the first few bytes of a file, but there are so many file types out there that look the same as each other. Like when MS Office switched to the ZIP+XML-based open format, a lot of people discovered that their new .docx files were being detected as ZIP files - correctly, mind you.

I don't think this is going to be fixable in a general way: MIME magic data just isn't sophisticated enough to read through a compressed ZIP file and discover whether its contents look like a Microsoft-generated file or not.

If you have specific knowledge about the types of files your users will be uploading, like that they will be uploading documents, then I suggest combining the MIME result with the actual extension of the file. Meaning that if you detect "application/vnd.ms-excel.sheet.binary.macroEnabled.12" and the extension is .xlsb (or perhaps regardless of the extension) then your application overrides that with your desired "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet".
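A minimal sketch of that override, assuming the application only needs to remap this one detected type. The function name resolveMimeType and the override table are hypothetical and not part of PHP or fileinfo; only pathinfo() and strtolower() are built-ins.

```php
<?php
declare(strict_types=1);

/**
 * Combine a detected MIME type with the file's extension and return the
 * type the application wants to work with.
 *
 * Hypothetical helper: the override table below is an assumption for
 * illustration and should be adjusted to the file types you accept.
 */
function resolveMimeType(string $detectedMime, string $filename): string
{
    // [detected MIME type][lowercase extension] => MIME type to report instead
    $overrides = [
        'application/vnd.ms-excel.sheet.binary.macroEnabled.12' => [
            'xlsb' => 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
        ],
    ];

    // Compare extensions case-insensitively (e.g. "REPORT.XLSB").
    $extension = strtolower(pathinfo($filename, PATHINFO_EXTENSION));

    // Fall back to the detected type when no override matches.
    return $overrides[$detectedMime][$extension] ?? $detectedMime;
}
```

The string returned by finfo_file($finfo, $path) in the original report would be passed in as $detectedMime. This only relabels the result; it does not change what the file contains.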

There is another option too: get yourself a specific "magic file" that detects files as you want. You may have to shop around to find a file that does so, or possibly even build/edit your own, but then you can provide that magic file for finfo to use when examining the input files and it will report the MIME types as you want.
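As a rough sketch of the magic-file route: finfo_open() accepts an optional second argument naming the magic database to load. The wrapper name detectMimeType and the example path below are assumptions for illustration, and the custom magic file itself still has to be found or written separately.

```php
<?php
declare(strict_types=1);

/**
 * Detect a file's MIME type, optionally using a custom magic database.
 *
 * Hypothetical wrapper around finfo_open()/finfo_file(). Passing null
 * for $magicFile uses the database PHP would load by default.
 */
function detectMimeType(string $path, ?string $magicFile = null): string
{
    // Fail early with a clear message instead of a warning from finfo_open().
    if ($magicFile !== null && !is_readable($magicFile)) {
        throw new InvalidArgumentException("Magic file not readable: {$magicFile}");
    }

    $finfo = finfo_open(FILEINFO_MIME_TYPE, $magicFile);
    if ($finfo === false) {
        throw new RuntimeException('Could not open the magic database.');
    }

    $mimeType = finfo_file($finfo, $path);
    if ($mimeType === false) {
        throw new RuntimeException("Could not inspect file: {$path}");
    }

    // The finfo handle is released when it goes out of scope here.
    return $mimeType;
}
```

A call such as detectMimeType($path, '/etc/myapp/custom.magic') (path hypothetical) would then report whatever type the custom rules assign, while detectMimeType($path) keeps the default behaviour.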

> I'm not sure if this is the right place to report this. It does seem to be a security concern if a binary, macro-enabled file can be uploaded and interpreted as a plain Excel file...

Like I said, MIME detection is unreliable. Anyone using it alone to protect themselves against malicious files will not be safe - other mechanisms, like reprocessing files and disabling execution and potentially even server-side virus scanners, are 100% necessary.

github-actions[bot] commented 3 months ago

No feedback was provided. The issue is being suspended because we assume that you are no longer experiencing the problem. If this is not the case and you are able to provide the information that was requested earlier, please do so. Thank you.