brianmario / escape_utils

Faster string escaping routines for your ruby apps
MIT License

Encoding.default_internal is not used in some cases #13

Closed jugyo closed 13 years ago

jugyo commented 13 years ago

OK (passes):

it "should use Encoding.default_internal" do
  Encoding.default_internal = Encoding.find('utf-8')
  EscapeUtils.unescape_url("http%3A%2F%2Fwww.homerun.com%2F".force_encoding("US-ASCII")).encoding.should eql(Encoding.default_internal)
end

NG (fails):

it "should use Encoding.default_internal" do
  Encoding.default_internal = Encoding.find('utf-8')
  EscapeUtils.unescape_url("%E2%9C%93".force_encoding("US-ASCII")).encoding.should eql(Encoding.default_internal)
end

brianmario commented 13 years ago

After rethinking the encoding strategy a bit, I came to the conclusion that we probably shouldn't convert the string to Encoding.default_internal, even if it's set. Instead, the return value will have the same encoding as the input string. This is consistent with how nearly all pure-Ruby escaping routines handle this today.
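To illustrate the strategy described above, here is a minimal pure-Ruby sketch. The helper name `unescape_preserving_encoding` is hypothetical and this is not the escape_utils implementation (which is a C extension); it only shows the idea of decoding at the byte level and then tagging the result with the input's encoding.

```ruby
# Hypothetical helper for illustration only; not the escape_utils code.
# Decodes %XX sequences as raw bytes, then re-tags the result with the
# encoding of the input string instead of Encoding.default_internal.
def unescape_preserving_encoding(str)
  # String#b returns a binary (ASCII-8BIT) copy, so high bytes are safe here.
  decoded = str.b.gsub(/%([0-9a-fA-F]{2})/) { [$1.hex].pack("C") }
  decoded.force_encoding(str.encoding)
end

input  = "%E2%9C%93".dup.force_encoding("US-ASCII")
result = unescape_preserving_encoding(input)

result.encoding == input.encoding  # true: still US-ASCII
result.valid_encoding?             # false: the decoded bytes are not ASCII
```

Note the trade-off: the result keeps the caller's encoding, but the caller is responsible for passing a string whose encoding can actually represent the decoded bytes.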

closed by 3f0523a2effbbad0641bff84c5f5f58ed8efe30d

jugyo commented 13 years ago

Basically, I agree with that. But the behavior of 'escape_utils/url/rack' differs from the original. The original Rack::Utils.unescape converts the string to 'ASCII-8BIT' in some cases.

That difference confuses me. Unfortunately, I have no idea what to do about that problem.

Thanks.

brianmario commented 13 years ago

Oh, interesting. I'll investigate how Rack::Utils handles it.

jugyo commented 13 years ago

In Rack::Utils.unescape, String#gsub converts the encoding.

Example:

'abc'.force_encoding('us-ascii').gsub('b', 'ビー').encoding
=> #<Encoding:UTF-8>
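
The example above can be extended into a small sketch of how String#gsub picks the result encoding when the receiver is ASCII-only. The variable names are made up for illustration, and the ASCII-8BIT case is only an assumption about why Rack::Utils.unescape ends up with 'ASCII-8BIT' "in some cases": a replacement built with Array#pack("H*") is binary, so it would follow the same rule.

```ruby
ascii = "abc".dup.force_encoding("US-ASCII")

# Replacement is UTF-8 with non-ASCII characters ("\u30D3\u30FC" is ビー):
# the result takes the replacement's encoding.
utf8_result = ascii.gsub("b", "\u30D3\u30FC")

# Replacement is a binary string containing high bytes, like the output of
# Array#pack("H*"): the result becomes ASCII-8BIT.
binary_result = ascii.gsub("b", ["e29c93"].pack("H*"))

# Replacement is ASCII-only: the receiver's encoding is kept.
plain_result = ascii.gsub("b", "B")

puts utf8_result.encoding
puts binary_result.encoding
puts plain_result.encoding
```

In other words, the result encoding depends on the data: input whose escapes decode to plain ASCII keeps the original encoding, while input that decodes to high bytes switches to the replacement's encoding.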