Closed kenn closed 11 years ago
Actually, yeah, I do have questions. What version of Ruby is this, and do you see this in your app or in the test suite for paperclip?
The error was reported via Airbrake on production. And there, we use ruby-1.9.3-p0 [i686] on Debian 6 32-bit.
Hmm, ok. Can you paste your paperclip attachment definition and the name of the file you're trying to upload?
I can only see "#<ActionDispatch::Http::UploadedFile:0xc562d70>" on Airbrake and can't check what was the file name for that. The paperclip definition is pretty straightforward -
has_attached_file :icon, { :styles => { :s36 => "36x36>", :s72 => "72x72>", :s144 => "144x144>" }, :default_style => :s72 }.merge(IMAGE_OPTIONS)
where IMAGE_OPTIONS is mostly storage options.
BTW the browser was Chrome on Windows 7 - my wild guess is that this guy used Japanese version (or other localized ver) of Windows, where the file name was uploaded as Shift-JIS and you get mojibake on the server side?
At this point I'm not sure if cocaine is the right place to report the problem - feel free to close. :)
I asked for the definition to see what your path
and url
looked like. I should have been more specific, sorry. I assume they're just listed as strings without any specific encoding here. In that case I'll have to try to come up with a failing testcase so it can be fixed.
For production, path
and url
are as follows - :base_class
is a custom interpolation.
:url => ":s3_domain_url",
:path => ":rails_root/public/system/:attachment/:base_class/:id/:style.:extension"
So yes, it's a string literal which encoding should inherit the source encoding, which is by default US-ASCII
.
On a side note, the good news is that starting Ruby 2.0 the default source encoding will be UTF-8
.
Just encountered this issue in production with a file named "Façade.jpg".
Will you accept a PR with only an addition of the #coding: UTF8
comment ?
@kenn, can you try applying @cfabianski's patch and reporting back if it fixes your problem?
I'm trying to fix the build with jruby 1.7.0 as travis is not happy with this change
And I ask this because, sadly, I'm unable to reproduce this using the https://github.com/thoughtbot/paperclip_demo repo. I uploaded a "Façade.jpg" without issue.
@jyurek what's your ENV
? (LC_LANG
and so on)
When you get into those troubles, it becomes hard
BTW, I need more time, master isn't compatible with jRuby 1.7. It has been tested only with jRuby 1.6.8
Ah, I have "LANG=en_US.UTF-8" in there. Maybe that has something to do with it. Thanks, I'll check this out later.
On my side I have LANG=fr_FR.UTF-8
@jyurek if you have an idea about why the build os broken on jRuby 1.7.0, please let me know :-)
I'll check it out tonight and let you know what I find.
I can't reproduce the problem on my Mac, probably because Macs hold filenames as UTF-8-MAC and browsers send them as UTF-8.
My guess is that the client that caused the error was a localized Windows client with ISO-8859-1 (or whatever) encoding, which sends filenames in their system encoding.
If so, I don't think magic comments in the source files will fix the problem.
ActionDispatch::Http::UploadedFile
has this method for original_filename
:
def encode_filename(filename)
# Encode the filename in the utf8 encoding, unless it is nil or we're in 1.8
if "ruby".encoding_aware? && filename
filename.force_encoding("UTF-8").encode!
else
filename
end
end
If my reasoning is correct, force_encoding
in this method is the root cause of the problem.
@kenn I'm able to reproduce the issue with the current master of cocaine. ArgumentError: invalid byte sequence in US-ASCII
If I change for my fork and the dedicated branch it does work
I just uploaded a filename with accented characters like this Image_à-çédric.jpg
Can you try again?
[EDIT] ruby 1.9.3-p0
with rvm
on Mac OS X 10.7.5
BTW, the magic comment should be added by default because UTF-8
is more a standard than US-ASCII
don't you think?
I have a server running with my own fork and I can confirm that my fix fixed all bugs pointing out this issue. Unfortunately, as usual with encoding issue, it's pretty hard to reproduce this in a test because you have to put the magic comment on top of your spec file to accent accented characters :-/
Ok I've pulled in @cfabianski's PR. @robworley, can you try this out and let me know if you have further problems?
@cfabianski's patch is working for me. Thanks all.
Great, thanks. I'll close this for now. Please let us know if this pops up again.
Would it be possible to get a new release of cocaine? I'm facing this bug and I'd rather use an official release than a git revision.
@ujh for now, you can do this gem "cocaine", '0.4.2', :git => "git://github.com/cfabianski/cocaine.git", :branch => '34-utf-8-encoding'
@cfabianski Thanks. I'm doing something along these lines for now.
Because the base string literals are US-ASCII and trying to interpolate UTF-8 strings into it?
Let me know if you need more information.