kreeti / kt-paperclip

Easy file attachment management for ActiveRecord

Content Type Spoof: For XML file in v7.1 #85

Open rocket-turtle opened 2 years ago

rocket-turtle commented 2 years ago

Describe the bug

After updating to v7.1 we get this 'error':

Content Type Spoof: Filename 534fa0c71f8a678c08e6fbc40e4e362d20220215-98201-1po9m4z.xml (application/xml from Headers, ["application/xml", "text/xml"] from Extension), content type discovered from file command: text/html. See documentation to allow this combination.

I think the detection for the content_type changed with https://github.com/kreeti/kt-paperclip/pull/75

New behavior:

@filepath = "/var/folders/_6/l152jtnn025b5nfgcvss7v200000gp/T/534fa0c71f8a678c08e6fbc40e4e362d20220215-98201-1po9m4z.xml"
"/var/folders/_6/l152jtnn025b5nfgcvss7v200000gp/T/534fa0c71f8a678c08e6fbc40e4e362d20220215-98201-1po9m4z.xml"

Marcel::MimeType.for Pathname.new(@filepath), name: @filepath
"application/xml"

Old behavior:

Marcel::Magic.by_magic(@filepath)
nil

# fallback for marcel detection
FileCommandContentTypeDetector.new(@filepath).detect
"text/html"

Mime type from file (Mac OS)

$ file --version
file-5.39
magic file from /usr/share/file/magic

$ file -b --mime /var/folders/_6/l152jtnn025b5nfgcvss7v200000gp/T/534fa0c71f8a678c08e6fbc40e4e362d20220215-98201-1po9m4z.xml
text/html; charset=utf-8
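
The output above has the form "type; charset=...". As a minimal sketch, assuming only that format, the two parts can be separated before comparing them with what Marcel reports. The helper name parse_file_mime is hypothetical and not part of Paperclip.

```ruby
# Hypothetical helper (not part of Paperclip): splits the output of
# `file -b --mime` into its media type and its charset parameter.
def parse_file_mime(output)
  # "text/html; charset=utf-8" -> type "text/html", params ["charset=utf-8"]
  type, *params = output.strip.split(";").map(&:strip)
  charset = params.find { |param| param.start_with?("charset=") }
                  &.delete_prefix("charset=")
  { type: type, charset: charset }
end

result = parse_file_mime("text/html; charset=utf-8")
puts result[:type]
puts result[:charset]
```

Only the type part matters for the spoof check; the charset is kept here just for debugging.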

Correct solution?

Paperclip.options[:content_type_mappings] = {
  xml: %w[application/xml text/html]
}

Is this the best way to handle this problem? Are there any other known content_type changes?

Is there a recommended set of values for Paperclip.options[:content_type_mappings]?


Update

I found out that what the file command returns depends on the XML file:

text/xml; charset=us-ascii
text/html; charset=utf-8
text/plain; charset=us-ascii

So my updated solution would be:

Paperclip.options[:content_type_mappings] = {
  xml: %w[application/xml text/xml text/html text/plain]
}
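
To illustrate why the wider mapping silences the message, here is a simplified model of the check. It assumes a file is accepted when the type reported by the file command is listed for its extension. The helper allowed_by_mapping? is hypothetical and is not Paperclip's actual implementation.

```ruby
# Hypothetical helper, not part of kt-paperclip: a simplified model of the
# mapping override. Assumption: a detected type is accepted when it appears
# in the list configured for the file's extension.
def allowed_by_mapping?(filename, detected_type, mappings)
  extension = File.extname(filename).delete_prefix(".").downcase.to_sym
  Array(mappings[extension]).include?(detected_type)
end

mappings = { xml: %w[application/xml text/xml text/html text/plain] }

puts allowed_by_mapping?("example.xml", "text/html", mappings) # prints true
puts allowed_by_mapping?("example.xml", "image/png", mappings) # prints false
puts allowed_by_mapping?("example.pdf", "text/html", mappings) # prints false
```

The trade-off is that every type added to the list weakens the check for that extension, so the list should stay as short as the observed file output allows.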

ssinghi commented 2 years ago

@rocket-turtle Can you please provide a sample XML file for us to test and work on the issue? Thanks!

rocket-turtle commented 2 years ago

@ssinghi I cannot upload an XML file here.

I used this example from https://www.w3schools.com/XML/xml_namespaces.asp

<?xml version="1.0" encoding="UTF-8"?>
<table xmlns="http://www.w3.org/TR/html4/">
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</table>

file -b --mime example.xml
text/xml; charset=us-ascii

Marcel::MimeType.for Pathname.new('example.xml'), name: 'example.xml'
"application/xml"

When I removed the optional XML prolog (https://www.w3schools.com/xml/xml_syntax.asp), I got the same result from Marcel::MimeType.for, but file returned text/html:

file -b --mime example.xml
text/html; charset=us-ascii
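
Since the result changes with the prolog, a quick way to find affected files is to check for it directly. This is a minimal sketch: xml_prolog? is a hypothetical helper, it only looks at the first characters, and it does not reproduce the rules that file itself applies.

```ruby
# Hypothetical helper: true when the content begins with an XML declaration.
# Assumption: no byte order mark or whitespace precedes the declaration.
def xml_prolog?(content)
  content.start_with?("<?xml")
end

with_prolog = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <table xmlns="http://www.w3.org/TR/html4/">
  </table>
XML

without_prolog = <<~XML
  <table xmlns="http://www.w3.org/TR/html4/">
  </table>
XML

puts xml_prolog?(with_prolog)    # prints true
puts xml_prolog?(without_prolog) # prints false
```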

jocel1 commented 2 years ago

Hi!

I'm also getting this kind of warning, e.g.:

INFO: [paperclip] Content Type Spoof: Filename 20222022_Teaser_Silex.pdf (application/pdf from Headers, ["application/pdf"] from Extension), content type discovered from file command: inode/x-empty. See documentation to allow this combination.
Content Type Spoof: Filename Règles_de_gouvernance_.pdf (application/pdf from Headers, ["application/pdf"] from Extension), content type discovered from file command: inode/x-empty. See documentation to allow this combination.
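
The file command reports inode/x-empty for a zero-length file, so one thing worth ruling out is that the temporary file was empty at the moment it was inspected. The sketch below shows such a check; empty_upload? is a hypothetical helper, and whether this is the actual cause here is an assumption.

```ruby
require "tempfile"

# Hypothetical helper: true when the file at `path` exists and has zero bytes.
def empty_upload?(path)
  File.zero?(path)
end

Tempfile.create("upload") do |file|
  puts empty_upload?(file.path) # prints true, nothing written yet
  file.write("%PDF-1.4")
  file.flush                    # make sure the bytes reach the file
  puts empty_upload?(file.path) # prints false
end
```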