thoughtbot / paperclip

Easy file attachment management for ActiveRecord
https://thoughtbot.com

Is there a workaround for Content-Type incorrectly sent by the operating system? #2170

Closed jasonfb closed 8 years ago

jasonfb commented 8 years ago

Here are two different log entries for Chrome submission of a CSV file. One is on Windows, and another on a Mac:

WINDOWS Chrome

2016-04-07T18:19:43.025920+00:00 app[web.1]:   Parameters: {"utf8"=>"✓", "authenticity_token"=>"xidmv33k4SShdW5KuKnSSz5r+k9D+MLUNQFcB2c04qM=", "record"=>{"delete_attachment"=>"true", "attachment"=>#<ActionDispatch::Http::UploadedFile:0x007f9e2293f0b8 @tempfile=#<Tempfile:/tmp/RackMultipart20160407-10-1elodis>, @original_filename="AprilFoolsFollowup_Customers.csv", @content_type="application/octet-stream", @headers="Content-Disposition: form-data; name=\"record[attachment]\"; filename=\"AprilFoolsFollowup_Customers.csv\"\r\nContent-Type: application/octet-stream\r\n">, "type"=>"UserTagImporter", "options"=>{"tag"=>"testtesttest"}}, "commit"=>"Create", "iframe"=>"true"}

MAC OS X Chrome

2016-04-07T18:22:11.403579+00:00 app[web.1]:   Parameters: {"utf8"=>"✓", "authenticity_token"=>"iDa5qy8LlcKv4EqezknZZDzg4Kk3T/4Knc1rbRem2Yg=", "record"=>{"attachment"=>#<ActionDispatch::Http::UploadedFile:0x007f9e183099c8 @tempfile=#<Tempfile:/tmp/RackMultipart20160407-16-1qeh446>, @original_filename="AprilFoolsFollowup_Customers.csv", @content_type="text/csv", @headers="Content-Disposition: form-data; name=\"record[attachment]\"; filename=\"AprilFoolsFollowup_Customers.csv\"\r\nContent-Type: text/csv\r\n">, "type"=>"UserTagImporter", "options"=>{"tag"=>"test1"}}, "commit"=>"Create", "iframe"=>"true"}

As you can see, on Mac, the content_type is set correctly to text/csv. On Windows, I get application/octet-stream.

This is the SAME file uploaded from two different operating systems, both running Chrome.

Upon investigation, I have learned that CSV content-type detection on Windows is FUBAR (seriously FUBAR). I have learned that the suggested fix is for the user to EDIT THEIR WINDOWS REGISTRY (caps added for emphasis). Yes, that is actually what the supposedly correct solution is.

http://stackoverflow.com/questions/1201945/how-is-mime-type-of-an-uploaded-file-determined-by-browser

However, I tried editing my Windows registry and it didn't even work (it still uploaded with Content-Type: application/octet-stream).

Is there a suggested way to work around this problem? Based on this research, validating on the browser-provided content-type seems to be basically useless, as you are never really guaranteed a correct content-type.

tomash commented 8 years ago

You can use the file command. There should be some wrappers in Paperclip for it, like http://www.rubydoc.info/github/thoughtbot/paperclip/Paperclip/ContentTypeDetector

jasonfb commented 8 years ago

@tomash -- the problem is that it doesn't work. The documentation enumerates these rules:

  1. Blank/Empty files: --> does not apply to me
  2. Calculated match: Return the first result that is found by both the file command and MIME::Types. --> My files end in .csv but have a MIME type of application/octet-stream (so I think this doesn't apply to me)
  3. Standard types: Return the first standard (without an x- prefix) entry in MIME::Types --> I think this means that the object will pick up the MIME type as application/octet-stream, which is incorrect and thus incorrectly flags this as an invalid upload

When I use the content_type validator for text/csv, it works from Mac OS computers, but from Windows I get:

[Screenshot: screen shot 2015-11-25 at 9 42 53 am]

This is the bug I'm trying to solve. From what I can tell, the part of the Paperclip code you pointed me to is in fact the problem.

jasonfb commented 8 years ago

Basically, I guess what I'm proposing is that this be refactored so that if 1) the upload is coming from Windows, and 2) the MIME type comes through as application/octet-stream, we actually prefer the file extension. Or perhaps this nuance could be configured within the validates_attachment_content_type settings themselves? Looking for guidance here.
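The fallback proposed here could be sketched roughly as follows. This is only an illustration, not Paperclip's API: the helper name effective_content_type, the EXTENSION_TYPES table, and the GENERIC_TYPE constant are all hypothetical, and the sketch applies the fallback regardless of operating system, since the server cannot reliably tell which OS sent the upload.

```ruby
# Hypothetical helper, not part of Paperclip's API.
GENERIC_TYPE = "application/octet-stream"

# Assumed, deliberately tiny extension table for illustration only.
EXTENSION_TYPES = {
  ".csv" => "text/csv",
  ".txt" => "text/plain"
}.freeze

# Returns the browser-reported type unless it is blank or the generic
# application/octet-stream, in which case the type is looked up from the
# file extension instead (falling back to the generic type if unknown).
def effective_content_type(filename, reported_type)
  reported = reported_type.to_s.strip
  return reported unless reported.empty? || reported == GENERIC_TYPE

  EXTENSION_TYPES.fetch(File.extname(filename.to_s).downcase, GENERIC_TYPE)
end

# The Windows upload from the log above:
puts effective_content_type("AprilFoolsFollowup_Customers.csv", "application/octet-stream")
# prints "text/csv"
```

Note that a specific but unexpected type, such as application/vnd.ms-excel, is passed through unchanged by this sketch, so it would still need to be handled by the validation's allowed list.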

I guess I could simply add application/octet-stream to my allowed list of content_type on the validation, but this seems like a band-aid to something that is actually incorrect inside of Paperclip, no?

Just trying to figure out if my thinking is correct

tute commented 8 years ago

I guess I could simply add application/octet-stream to my allowed list of content_type on the validation, but this seems like a band-aid to something that is actually incorrect inside of Paperclip, no?

MIME types are hard. The problem is not inside any one component; it's in the interconnection of different OSs, browsers, standards, and app configurations.

Have you gone through https://github.com/thoughtbot/paperclip#security-validations?

Because there is no simple answer or solution that would fix this high level description of the problem, I'll close this issue, and defer to the more specific open ones: https://github.com/thoughtbot/paperclip/labels/Spoof%20related%20or%20Mime%20types.

Thanks.

jasonfb commented 8 years ago

As I reported on #1924, please note that CSV files come through as application/octet-stream if you don't have Excel installed on your Windows machine. Once you install Excel, your CSV files come through as application/vnd.ms-excel.

@tute My primary goal is to support users, and they are shocked to hear that on a website -- web technology famous for being cross-platform -- there is something that only works on a Mac but not on Windows.

I guess the thing to do here is to add application/vnd.ms-excel as a content type mapping for the csv suffix, then tell my Windows users to install MS Excel (which sounds slightly nuts but isn't actually the craziest thing).

Does that seem like it makes sense?

ssinghi commented 8 years ago

@tute I don't think you understand the issue, and your link to the security validations in Paperclip is of utterly no help. There is a problem in how Paperclip is doing the media type spoof validation, and what exacerbates the problem is that it can't be disabled.

jasonfb commented 8 years ago

I think the problem here is that trusting content-type is fundamentally flawed. As long as you are looking at or relying on the content-type headers, you have a fundamental problem of not really being able to trust them, even though our natural tendency as developers is to assume consistency and repeatability.
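One way to avoid relying on the header at all is to check the uploaded bytes directly. The sketch below is a rough heuristic, not how Paperclip works: the helper name looks_like_csv? and the min_columns parameter are hypothetical, it assumes the upload is UTF-8 (CSV files exported on Windows often are not), and it rejects single-column files by default.

```ruby
require "csv"

# Hypothetical helper: decides from the bytes themselves whether an upload
# looks like CSV, ignoring the browser's Content-Type header entirely.
def looks_like_csv?(data, min_columns: 2)
  # Assumption: uploads are UTF-8. Anything that is not valid UTF-8
  # (for example arbitrary binary data) is rejected.
  text = data.dup.force_encoding(Encoding::UTF_8)
  return false unless text.valid_encoding?

  rows = CSV.parse(text)
  return false if rows.empty?

  # Every row must have the same number of fields, and at least min_columns.
  widths = rows.map(&:length).uniq
  widths.size == 1 && widths.first >= min_columns
rescue CSV::MalformedCSVError
  false
end

puts looks_like_csv?("name,tag\nalice,test1\n")
# prints "true"
```

A check like this could run alongside, or instead of, a content_type validation, at the cost of reading the file on the server.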

I think that's the problem here. Yes, I agree that there should be a way to turn these checks off globally. I think that used to exist in older versions of Paperclip but has been removed?

BalaSudheer commented 5 years ago

I was able to fix this issue with the piece of code below, so that the Content-Type is set properly.

data.append('file', new Blob([fileData], { type: 'text/csv' }));

sbn111 commented 4 years ago

I was able to fix this issue with the piece of code below, so that the Content-Type is set properly.

data.append('file', new Blob([fileData], { type: 'text/csv' }));

Can you explain more about how to do that in Angular? Every time I upload a CSV file, the MIME type is detected as an Excel type.

My code looks like this: "formData.append('files', file);". How do I make it send the correct content type?

ssinghi commented 4 years ago

@sbn111 Newer versions of Paperclip do content-type detection by reading from the file. If you still encounter any issues, file a bug report at: https://github.com/kreeti/kt-paperclip

zeevbritz commented 3 years ago

I was able to fix this issue with the piece of code below, so that the Content-Type is set properly.

data.append('file', new Blob([fileData], { type: 'text/csv' }));

Can you explain more about how to do that in Angular? Every time I upload a CSV file, the MIME type is detected as an Excel type.

My code looks like this: "formData.append('files', file);". How do I make it send the correct content type?

formData.append('files', new Blob([file], { type: 'text/csv' }));