rails / rails

Ruby on Rails
https://rubyonrails.org
MIT License

i18n fails with multibyte Strings in Ruby 1.9 (similar to #2038) #600

Closed lighthouse-import closed 13 years ago

lighthouse-import commented 13 years ago

Imported from Lighthouse. Original ticket at: http://rails.lighthouseapp.com/projects/8994/tickets/2188 Created by Jonas Nicklas - 2010-11-25 12:27:59 UTC

In Ruby 1.9 translating Strings which have non-ascii characters in them does not work for me.

If I have keys like this in my translation file:

"sv":
  test1: blah
  test2: blåh

Calling this works fine:

I18n.translate(:test1)

However, calling this raises an exception:

I18n.translate(:test2)

Here's the error:

ActionView::TemplateError (incompatible character encodings: ASCII-8BIT and UTF-8)

This is the same error as in #2038. I ran this against edge Rails, and it looks like the patch from #2038 has already been applied, so I am assuming this is a different issue.

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Jonas Nicklas - 2009-03-18 22:19:12 UTC

I have now figured out that I18n isn't the culprit. I18n.t returns a UTF-8 string; the issue seems to be that templates are ASCII-8BIT encoded by default, and they switch over when a UTF-8 string is used.

<%= "å" %><%= "å".encoding %>

Works, and would return 'åASCII-8BIT'

<%= "å".force_encoding('utf-8') %>

Also works. However:

<%= "å" %><%= "å".force_encoding('utf-8') %>

Fails with the above mentioned error.

I have attached a test case that proves the bug.
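
The failure does not need a template to reproduce. Here is a minimal sketch in plain Ruby, assuming the template buffer behaves like a string tagged ASCII-8BIT; the method name `concat_error` is made up for illustration.

```ruby
# Returns the exception class raised when a binary-tagged "å" is joined
# with a UTF-8 "å", or nil if the concatenation succeeds.
def concat_error
  binary = "\u00E5".dup.force_encoding(Encoding::ASCII_8BIT) # bytes of "å", tagged binary
  utf8   = "\u00E5"                                          # "å" as UTF-8
  binary + utf8
  nil
rescue Encoding::CompatibilityError => e
  e.class
end

p concat_error # prints Encoding::CompatibilityError
```

The bytes on both sides are identical; only the encoding tag differs, which is why forcing one side to the other's encoding makes the error go away.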

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Mauricio Eduardo Szabo - 2009-03-26 01:36:44 UTC

I confirm this error on Rails 2.3.2 and Ruby1.9.

If, for example, I add on one controller: @errors = ["Á", 'Bê']

On any view, a simple: <%= @errors.inspect %>

throws the error incompatible character encodings: ASCII-8BIT and UTF-8

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Hector E. Gomez Morales - 2009-03-27 18:58:06 UTC

The problem is in the erb code in the Ruby 1.9 distribution. When it compiles the template code it forces an ASCII-8BIT encoding. If the template contains multibyte characters, the compiled code comes back as an ASCII-8BIT string, and the exception is raised as soon as that string is concatenated with a UTF-8 string that also contains multibyte characters: Ruby treats the two encodings as compatible only when at least one of the strings contains nothing but seven-bit characters.

This patch is the result of my research for my proposal for end-to-end encoding support for Rails. I am working on a patch for erb to resolve this problem. The included patch is a quick hack to force the encoding of the template method code to be UTF-8.

1988 is a duplicate of this bug.
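
The compatibility rule can be checked directly with `Encoding.compatible?`, which returns an encoding when two strings can be joined and nil otherwise. This is a small sketch rather than anything from the patch; `joinable?` is a hypothetical helper. As far as I can tell, it is enough for one side to be pure seven-bit.

```ruby
# True when Ruby can concatenate a and b without raising.
def joinable?(a, b)
  !Encoding.compatible?(a, b).nil?
end

ascii_binary = "abc".dup.force_encoding(Encoding::ASCII_8BIT)    # seven-bit bytes only
multi_binary = "\u00E5".dup.force_encoding(Encoding::ASCII_8BIT) # bytes of "å", tagged binary
multi_utf8   = "\u00E5"                                          # "å" as UTF-8

p joinable?(ascii_binary, multi_utf8) # seven-bit binary + multibyte UTF-8
p joinable?(multi_binary, multi_utf8) # multibyte binary + multibyte UTF-8
```

The first check passes and the second fails, which matches templates working until the first non-ASCII character appears on both sides.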

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by crazy_bug (at terletzki) - 2009-04-06 12:57:49 UTC

Hello! We've got the same problem, except that the error occurs when we fetch data from the database. We're using MySQL and the charset is UTF-8, but Active Record returns ASCII-8BIT. Is it possible to make changes to activerecord similar to the ones you made to actionpack? It seems we're not the only ones with that problem (http://groups.google.com/group/rubyonrails-talk/browse_thread/thread/45cf95921c8fe21f/8864497725a0a4af?lnk=raot). Can somebody help me with this? Thanks!

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Hector E. Gomez Morales - 2009-04-06 14:03:36 UTC

I will take a look and post any findings.

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Hector E. Gomez Morales - 2009-04-10 15:38:13 UTC

Hi, sorry to be so late, but I have some solutions to this problem. Please take a look at #2476.

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Mauricio Eduardo Szabo - 2009-04-13 13:18:52 UTC

Hector, sorry, but this is not my problem. My problem is not when I fetch data from a database, it's in template rendering, as I showed in my previous post. The ERB workaround, by the way, worked for me.

(By the way, I use the postgres-pr adapter to fetch data from my database)

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by hkstar - 2009-04-19 22:12:21 UTC

+1 to Hector's workaround-erb.diff patch, works for me.

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Portfonica - 2009-04-20 00:07:44 UTC

I'm afraid hector's patch doesn't resolve the problem with another template system such as HAML. :-/ So I think this patch isn't useful.

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Hector E. Gomez Morales - 2009-04-21 00:14:20 UTC

This patch is concerned with erb as the default templating engine, that I think a lot of people use. If you have a particular haml template that presents the same problem can you provide it so I can dig out the proper fix.

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by qoobaa - 2009-05-12 22:31:00 UTC

I've created a patch that fixes the problems described by Jonas Nicklas. Now everything in views is encoded using UTF-8. The bad news is that a lot of things are broken now. The problems described with HAML may be caused by the encoding of Rack params (ASCII-8BIT) and of sqlite3-ruby strings (ASCII-8BIT). I've created a ticket in Rack's Lighthouse; we also need to fix the sqlite3-ruby gem. Does anybody use the mysql or pg gems? Are they broken too?

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Mauricio Eduardo Szabo - 2009-05-13 15:39:37 UTC

with templates_using_utf_8_encoding patch, I confirm there are problems with the postgresql gem (even with the postgres-pr gem).

One more thing, now line errors on templates are wrong (when I have an error on line #14, rails says it's on line #15).

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Manfred Stienstra - 2009-05-13 15:46:03 UTC

Also, the utf_8_encoding patch assumes that everyone will want to use UTF-8 in their templates, this might not be the case.

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by qoobaa - 2009-05-13 16:14:40 UTC

The UTF-8 encoding can be changed easily (we can put some variable there), but we have to provide some configuration for that (in environment.rb?). I've fixed the issue in the sqlite3-ruby gem (http://github.com/qoobaa/sqlite3-ruby), however it has no UTF-16 support yet (to be done). I've tried to fix the pg gem, but I need to read the Postgres documentation first to do it (the version in my repository uses UTF-8 as the default encoding). I've also created a ticket on Rack's Lighthouse.

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Portfonica - 2009-05-26 11:41:25 UTC

Hector: I don't have any solution, I have only a hack. You can put those lines into your environment.rb

Encoding.default_internal = 'utf-8'
Encoding.default_external = 'utf-8'

Oh, hack != solution :)

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Anton Ageev - 2009-05-31 11:54:21 UTC

Strings in params[] have ASCII-8BIT encoding too. Is it a Rack issue?

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Anton Ageev - 2009-05-31 12:00:36 UTC

Hector: I don't any solution, I have only a hack. You can put those lines into your environment.rb

Encoding.default_internal = 'utf-8'
Encoding.default_external = 'utf-8'

This doesn't work for me.

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by qoobaa - 2009-05-31 12:42:18 UTC

The params ASCII-8BIT encoding is a Rack issue: http://rack.lighthouseapp.com/projects/22435/tickets/48-rackutilsunescape-problems-in-ruby-191#ticket-48-1

Changing the default internal and external encoding also doesn't work in my app.
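
One possible reason the hack has no effect, offered as a guess rather than a diagnosis: the defaults only apply to strings that Ruby itself decodes from text-mode IO. Bytes that arrive through a binary read stay ASCII-8BIT. The sketch below uses a scratch file as a stand-in for such a source; `encodings_after_read` is a hypothetical helper.

```ruby
require "tempfile"

# The two lines from the hack above.
Encoding.default_external = Encoding::UTF_8
Encoding.default_internal = Encoding::UTF_8

# Writes the UTF-8 bytes of "å" to a scratch file and reports how the same
# bytes are tagged when read back in text mode and in binary mode.
def encodings_after_read
  Tempfile.create("encoding-demo") do |file|
    File.binwrite(file.path, "\u00E5")
    [File.read(file.path).encoding, File.binread(file.path).encoding]
  end
end

text_mode, binary_mode = encodings_after_read
puts "text read:   #{text_mode}"
puts "binary read: #{binary_mode}"
```

If request parameters or database rows are built the way the binary read is, changing the defaults would not retag them, and the strings would still need an explicit force_encoding somewhere.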

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Adam S - 2009-07-22 15:56:17 UTC

I'm also seeing this issue with Ruby 1.9 and HAML templates. It's very annoying and confusing... I think the only solution is for Rails to set a default encoding in environment.rb and then do translation from other encodings... Raising errors for every encoding type is silly. I basically want my whole app to use utf8, others may want another encoding, then fine just put it in environment.rb.

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Jérôme - 2009-08-15 23:44:46 UTC

It would be really great if Rails could spare us from editing every Ruby file that contains Unicode characters. It feels like a regression with Ruby 1.9 when I have to add a # encoding: utf-8 header to hundreds of files...

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Rocco Di Leo - 2009-08-26 02:44:33 UTC

I also would like to see an environment line where one can set the application-wide encoding instead of adding magic comments to all files. Also, I did not manage to add magic comments to the .erb files. How would this work?

At least those two possibilities did not work for me:

<%#= encoding: utf-8 %>
<%# encoding: utf-8 %>

-act

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Jeremy Kemper - 2009-08-26 05:58:41 UTC

You guys aren't saying where you get these Encoding errors. Please include backtraces or, better, failing test cases so we can reproduce.

UTF-8 is already the default external encoding. The magic comments are only if you want to write a template in a different encoding than the default.

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Rocco Di Leo - 2009-08-26 12:32:32 UTC

Reproduce the problem by using this process

rails utf8errors -d mysql
# add credentials if necessary to config/database.yml
cd utf8errors
rake db:create
script/generate controller utf8errors index
script/generate model user

# add "t.string :name" to the migration file before next step
rake db:migrate
touch app/views/utf8errors/_partial.html.erb
echo "Multibyte String öäü works here" >> app/views/utf8errors/_partial.html.erb
echo "Multibyte String öäü works here" >> app/views/utf8errors/_partial.html.erb
echo "Inserting User with multibyte characters" >> app/views/utf8errors/_partial.html.erb
echo "<% User.create(:name => 'Multibyte Username öäü') %>" >> app/views/utf8errors/_partial.html.erb
echo "Multibyte String from database does NOT work now:" >> app/views/utf8errors/_partial.html.erb
echo "<% User.all.each do |u| %>" >> app/views/utf8errors/_partial.html.erb
echo "<%= u.name %>" >> app/views/utf8errors/_partial.html.erb
echo "<% end %>" >> app/views/utf8errors/_partial.html.erb
echo "<%= render :partial => 'partial' %>" >> app/views/utf8errors/index.html.erb 

take care to use ruby 1.9.1 when starting the server

./script/server

=> surf to http://127.0.0.1:3000/utf8errors # should display the error


The error does NOT appear when using the workaround patch by hector which adds the line
source.force_encoding('utf-8') if '1.9'.respond_to?(:force_encoding) to the actionpack/lib/action_view/renderable.rb

Backtrace:

ActionView::TemplateError (incompatible character encodings: ASCII-8BIT and UTF-8) on line #7 of app/views/utf8errors/_partial.html.erb:
4: <% User.create(:name => 'Multibyte Username öäü') %>
5: Multibyte String from database does NOT work now:
6: <% User.all.each do |u| %>
7: <%= u.name %>
8: <% end %>

    app/views/utf8errors/_partial.html.erb:7:in `concat'
    app/views/utf8errors/_partial.html.erb:7:in `block in _run_erb_app47views47utf8errors47_partial46html46erb_locals_object_partial'
    app/views/utf8errors/_partial.html.erb:6:in `each'
    app/views/utf8errors/_partial.html.erb:6
    app/views/utf8errors/index.html.erb:3
    <internal:prelude>:8:in `synchronize'
    /usr/local/lib/ruby19/1.9.1/webrick/httpserver.rb:111:in `service'
    /usr/local/lib/ruby19/1.9.1/webrick/httpserver.rb:70:in `run'
    /usr/local/lib/ruby19/1.9.1/webrick/server.rb:183:in `block in start_thread'

Rendered rescues/_trace (80.9ms)
Rendered rescues/_request_and_response (0.9ms)
Rendering rescues/layout (internal_server_error)

I hope this helps and that the formatting is working...

-act

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Rocco Di Leo - 2009-08-26 12:35:16 UTC

Okay, the formatting is kind of broken, but I think you get the idea... In addition, it should fail with Ruby 1.9 (instead of 1.9.1) as well. The problem also arises with postgresql (I haven't tested sqlite3).

-act

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Rocco Di Leo - 2009-08-26 13:11:35 UTC

One more note. I just rechecked this process with postgresql 8.4 and in this case the workaround-erb patch by hector is NOT working. Sorry for the confusion before. So summarized for my Setup:

Ruby 1.9.x + Rails 2.3.3 + Mysql 5.1 => not working
Ruby 1.9.x + Rails 2.3.3 + with hector patch + Mysql 5.1 => working
Ruby 1.9.x + Rails 2.3.3 (with and without hector patch) + Postgresql 8.4 => not working

Greets -act

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Adam S - 2009-08-26 13:11:54 UTC

Are you sure this isn't the mysql gem?

I used the process here: and I can create multibyte users etc. In fact no real issues with multibyte now...

http://www.taylorluk.com/articles/2009/08/12/ruby-19-and-passenger

Also used this adapter for sqlite3:

http://github.com/qoobaa/sqlite3-ruby/tree/master

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Rocco Di Leo - 2009-08-26 14:25:53 UTC

thank you, i updated from mysql gem 2.7 to the self-built 2.81 .. with the process above i got the error 'uninitialized constant Encoding::UTF' accessing http://localhost:3000/utf8errors using Rails 2.3.3

when i changed in Rack::utils.rb

RUBY_VERSION >= "1.9" ? result.force_encoding(Encoding::UTF-8) : result

for

RUBY_VERSION >= "1.9" ? result.force_encoding('utf-8') : result

the rendering worked indeed.

Here is the output without alteration for the interested:


[2009-08-26 16:19:01] ERROR NameError: uninitialized constant Encoding::UTF
    /usr/local/lib/ruby19/gems/1.9.1/gems/activesupport-2.3.3/lib/active_support/dependencies.rb:105:in `rescue in const_missing'
    /usr/local/lib/ruby19/gems/1.9.1/gems/activesupport-2.3.3/lib/active_support/dependencies.rb:94:in `const_missing'
    /usr/local/lib/ruby19/gems/1.9.1/gems/rack-1.0.0/lib/rack/utils.rb:27:in `unescape'
    /usr/local/lib/ruby19/gems/1.9.1/gems/rails-2.3.3/lib/rails/rack/static.rb:36:in `file_exist?'
    /usr/local/lib/ruby19/gems/1.9.1/gems/rails-2.3.3/lib/rails/rack/static.rb:18:in `call'
    /usr/local/lib/ruby19/gems/1.9.1/gems/rack-1.0.0/lib/rack/urlmap.rb:46:in `block in call'
    /usr/local/lib/ruby19/gems/1.9.1/gems/rack-1.0.0/lib/rack/urlmap.rb:40:in `each'
    /usr/local/lib/ruby19/gems/1.9.1/gems/rack-1.0.0/lib/rack/urlmap.rb:40:in `call'
    /usr/local/lib/ruby19/gems/1.9.1/gems/rails-2.3.3/lib/rails/rack/log_tailer.rb:17:in `call'
    /usr/local/lib/ruby19/gems/1.9.1/gems/rack-1.0.0/lib/rack/content_length.rb:13:in `call'
    /usr/local/lib/ruby19/gems/1.9.1/gems/rack-1.0.0/lib/rack/handler/webrick.rb:46:in `service'
    /usr/local/lib/ruby19/1.9.1/webrick/httpserver.rb:111:in `service'
    /usr/local/lib/ruby19/1.9.1/webrick/httpserver.rb:70:in `run'
    /usr/local/lib/ruby19/1.9.1/webrick/server.rb:183:in `block in start_thread'


However, to recap: Does the Problem lie in the database connectors? This would mean one must wait for updated versions of the pg and probably mysql gem ...

greets
-act

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Rocco Di Leo - 2009-08-26 15:02:44 UTC

Okay, I have now reinstalled actionpack, rack and the pg gem. Postgresql works now without any modification, and the Encoding:: error has disappeared. I don't know why the problem occurred before, but for now it is solved. When using MySQL, version 2.81 of the gem is needed, as discussed. I will recheck on different machines and operating systems later, since there must be some bug or unlucky condition somewhere that caused the earlier problem.

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Manfred Stienstra - 2009-08-26 15:07:53 UTC

Rocco, first off: thanks for all the effort you're putting into this! Do you think you can do all your investigating first and post a short summary with proper formatting afterwards? It's becoming hard to find actual information in your torrent of posts.

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by engineerDave - 2009-09-23 05:53:53 UTC

I get this error just by having quotes in the text being displayed.

ActionView::TemplateError (incompatible character encodings: UTF-8 and ASCII-8BIT) on line #30 ...
app/views/blogs/index.html.erb:30:in `concat'
app/views/blogs/index.html.erb:30:in `block in _run_erb_app47views47blogs47index46html46erb'
app/views/blogs/index.html.erb:27:in `each'
app/views/blogs/index.html.erb:27
app/controllers/blogs_controller.rb:33:in `index'
<internal:prelude>:8:in `synchronize'
thin (1.2.4) lib/thin/connection.rb:76:in `block in pre_process'
thin (1.2.4) lib/thin/connection.rb:74:in `catch'
thin (1.2.4) lib/thin/connection.rb:74:in `pre_process'
thin (1.2.4) lib/thin/connection.rb:57:in `process'
thin (1.2.4) lib/thin/connection.rb:42:in `receive_data'
eventmachine (0.12.8) lib/eventmachine.rb:242:in `run_machine'
eventmachine (0.12.8) lib/eventmachine.rb:242:in `run'
thin (1.2.4) lib/thin/backends/base.rb:57:in `start'
thin (1.2.4) lib/thin/server.rb:156:in `start'
thin (1.2.4) lib/thin/controllers/controller.rb:80:in `start'
thin (1.2.4) lib/thin/runner.rb:174:in `run_command'
thin (1.2.4) lib/thin/runner.rb:140:in `run!'
thin (1.2.4) bin/thin:6:in `<top (required)>'
/usr/local/bin/thin:19:in `load'
/usr/local/bin/thin:19:in `<main>'

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Adam S - 2009-09-23 06:27:20 UTC

I'm not sure why people are using multibyte characters in most HTML... shouldn't you be using HTML entities? [1] Most (all?) HTML can be rendered using just the standard ASCII character set.

I don't have any issues with encoding and the latest rails gems.

Please try checking your app for bad encodings... [2] You may have some invisible encodings in your templates or be using a non-standard version of the quote char...

[1] http://www.w3schools.com/tags/ref_entities.asp [2] http://github.com/adamsalter/bad_encodings-ruby19/tree

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by James Healy - 2009-09-23 06:34:45 UTC

"I'm not sure why people are using multibtye characters in most html... shouldn't you be using html entities?"

There's nothing saying you should use entities is there (other than for reserved chars like &, etc)?

Unicode has a hell of a lot more characters than there are HTML entities. As an example, what about Asian, Indic and Arabic scripts?

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by yury - 2009-10-31 20:40:57 UTC

+1 to Hector's workaround-erb.diff patch, works for me too.

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Adam S - 2009-11-09 01:10:51 UTC

This patch works for me (with Erb templates at least).

Nathan Weizenbaum has just made a commit to fix this issue in HAML.

http://github.com/nex3/haml/commit/76bd406875920079bb26445ddeb0d3842e825f01

After thinking about this and spending quite a lot of time trying to track it down, I think the best fix would be for Ruby 1.9/Rails to include an encoding converter ASCII-8BIT <=> UTF-8. If Rails included this, it would fix all the Rails issues anyway. Clearly UTF-8 to ASCII-8BIT is a no-op, essentially the same as using force_encoding, but ASCII-8BIT to UTF-8 would mean that you could depend on all data being valid UTF-8. It would really make life so much easier. It would also mean that Rails didn't have to 'force_encoding' anything: it would use the natural encoding converter for any string, and people who wanted to run in a different encoding could still specify it on the command line.

For full support it would actually require ASCII-8BIT <=> 'chosen encoding', but UTF-8 would be a great start. I know almost nothing about adding encoding converters to Ruby 1.9, but this seems like the most forward-compatible change. Data would pass through all levels (Rack, DB, Rails) and be compatible, at least for UTF-8 initially.
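
In application code, the ASCII-8BIT to UTF-8 direction could be approximated without a real converter. The sketch below is only an illustration of the idea, not an existing Ruby or Rails API: `binary_to_utf8` is a hypothetical helper that retags binary strings and rejects bytes that are not well-formed UTF-8.

```ruby
# Hypothetical helper: reinterpret a binary-tagged string as UTF-8 and
# refuse bytes that are not well-formed UTF-8. Other encodings pass through.
def binary_to_utf8(str)
  return str unless str.encoding == Encoding::ASCII_8BIT

  candidate = str.dup.force_encoding(Encoding::UTF_8)
  raise ArgumentError, "bytes are not valid UTF-8" unless candidate.valid_encoding?
  candidate
end

# Stand-in for a value returned by a database driver.
from_db = "bl\u00E5h".dup.force_encoding(Encoding::ASCII_8BIT)
page    = "Name: " + binary_to_utf8(from_db)
puts page.encoding
```

The validity check is the part a bare force_encoding lacks: it is what would let the rest of the stack depend on the data being valid UTF-8.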

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by hkstar - 2009-11-18 07:03:18 UTC

Can this be merged into 2.3-stable, please?

It was freaking 6 months ago.

Hector's workaround-erb.diff solved the problem and as far as I'm concerned UTF8 is the standard and everyone should use it. Opinionated software, remember?

@Adam S: "I'm not sure why people are using multibtye characters in most html"

What on earth are you talking about? Almost every language other than english has multibyte characters and they are, of course, going to be placed in HTML files. Where else would they go? What a ridiculous comment.

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Jeremy Kemper - 2009-11-18 07:47:02 UTC

The workaround is just as broken as it was six months ago. Please do investigate.

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Vladimir Penkin - 2009-11-27 12:09:18 UTC

Rails 2.3.5 : Not working, Rails 2.3.5 + Hector patch: Working.

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Jonas Nicklas - 2009-11-27 12:49:22 UTC

So the alternatives are:

1) Pretty much every real-world Rails app anywhere is broken on Ruby 1.9.
2) The patch is applied and we simply assume UTF-8 for templates, which everyone uses anyway.

How is that broken? Since no one has provided a better solution over the last six months, shouldn't we just apply this? If someone needs to change the encoding used in templates, they can then patch it properly so that the encoding can be chosen.

As mentioned above, Rails is opinionated software. Why can't we have an opinion on what encoding people should use?

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Michael Hasenstein - 2009-11-27 14:19:00 UTC

I applied the one-line patch to my just-installed Rails 2.3.5, but it does not help. Well, it does help with one issue: I no longer get an error when a partial is to be rendered. Instead I now get an error later, where the view calls a helper function, <%= some_function() %>, which returns some HTML.

"incompatible character encodings: UTF-8 and ASCII-8BIT" once more.

Given these issues, how can ANYONE be using ruby 1.9.1 at this point? Or are those who are able to use it using ASCII as default encoding for all files? I (most certainly!) use UTF-8, as it should be in this world. The ASCII-60s and 70s and maybe 80s are long over...

I'm not (usually) concerned with the inner workings of ruby and rails, just use it (even though I consider myself "hard-core" in other fields I don't want to become an expert with everything). What I find disturbing is that I find no guidelines on Rails and Ruby 1.9.1. I just assumed it should be working by now, since I read a lot of "fixed ruby 1.9 compatibility issues" in Rails and Passenger.

Does (all of) this discussion mean it isn't so, it's still experimental? I cannot imagine my application is very special.

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Manfred Stienstra - 2009-11-27 14:31:05 UTC

Given these issues, how can ANYONE be using ruby 1.9.1 at this point?

I assume nobody is running applications on 1.9. The encoding changes in Ruby are pretty big and it will take a lot of work to resolve all the encoding issues in all libraries and Rails.

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Mezza - 2009-11-27 17:33:31 UTC

With regards to the postgres pg gem (not the pure ruby version), I originally encountered issues with encoding with the 0.8.0 version of the gem, but the developers of the gem seem to have applied a patch which works fine in the following branch:

http://ruby-pg.rubyforge.org/svn/ruby-pg/branches/i17n-19-patches/

The relevant issue is here:

http://rubyforge.org/tracker/?func=detail&aid=25931&group_id=3214&atid=12398

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Anton Ageev - 2009-11-27 17:37:43 UTC

I assume nobody is running applications on 1.9. The encoding changes are in Ruby are pretty big and it will take a lot of work to resolve all the encoding issues in all libraries and Rails.

I am running rails application on 1.9.1.

I use two monkey patches: config/initializers/fix_renderable.rb (Hector's patch) and config/initializers/fix_params.rb.

And I patched postgres gem (http://github.com/antage/postgres) to force UTF-8 encoding for all strings returning from a database.
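
The contents of fix_params.rb are not shown here, so the following is only a guess at the general shape of such a patch: a hypothetical `force_utf8!` helper that walks a params-like structure and retags every string as UTF-8 in place.

```ruby
# Hypothetical helper, not the contents of fix_params.rb: walk a params-like
# structure and retag every unfrozen string as UTF-8 in place.
def force_utf8!(value)
  case value
  when String then value.force_encoding(Encoding::UTF_8) unless value.frozen?
  when Array  then value.each { |item| force_utf8!(item) }
  when Hash   then value.each_value { |item| force_utf8!(item) }
  end
  value
end

# Stand-in for request parameters that arrived tagged ASCII-8BIT.
params = {
  "user" => { "name" => "\u00E5".dup.force_encoding(Encoding::ASCII_8BIT) },
  "tags" => ["\u00E9".dup.force_encoding(Encoding::ASCII_8BIT)]
}
force_utf8!(params)
puts params["user"]["name"].encoding
```

Like Hector's patch, this assumes the incoming bytes really are UTF-8; it changes the tag without checking or converting anything.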

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Valentin Nemcev - 2009-12-05 02:56:59 UTC

I'm also trying to run applications on 1.9.1. I'm not very familiar with Rails' internal structure, but I'm using it in a few applications and I want to migrate them to Ruby 1.9 to benefit from its speed and memory efficiency (not to mention the new Ruby features I want to use in future Rails projects).

But I can't!

I've tried all the patches and fixes I could find, but they are not working. I'm using Mysql for DB and Haml for templates and I get "incompatible character encodings: UTF-8 and ASCII-8BIT" when I try to render model attribute with Russian letters. Other UTF-8 strings are okay.

What additional information should I provide to help fixing this issue?

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by trevor - 2009-12-11 19:57:20 UTC

+1 Rails 2.3.5 + Hector patch: Working.

solved my problem with render partial and μm.

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Thilo Utke - 2009-12-17 00:16:03 UTC

+1 Rails 2.3.5 + Hector patch is working for me too

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by James Conroy-Finn - 2009-12-17 12:40:38 UTC

@Jakub Instructions on patching pg to return UTF-8 strings are here: http://gist.github.com/215955 (the diff is http://gist.github.com/215956)

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Andrew Grim - 2009-12-21 20:15:13 UTC

Hector's patch works in the case where your default encoding is UTF-8, but doesn't respect the encoding specified by template itself. Using the latest tests in rails I was able to achieve both with this patch. It only affects ERB, but I believe that is where the bug lies anyway. ERB#src will always return strings encoded as either ASCII or ASCII-8BIT, regardless of both your default encoding and the encoding specified by the ERB string.

This doesn't appear to be an issue with rails 3 as Erubis is used by default, and the bug seems to be ERB specific.

Attached is a patch for 2-3-stable and also a little script that demonstrates the issue (for fun, change the script's encoding to ASCII-8BIT).

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Andreas Haller - 2010-01-21 19:04:53 UTC

ERB#src will always return strings encoded as either ASCII or ASCII-8BIT, regardless of both your default encoding and the encoding specified by the ERB string.

Is there a bug about this on ruby-lang.org?

Erb#src seems to behave strangely, but rendering with Erb seems to just work. At least on ruby 1.9.2dev (2010-01-22 trunk 26370) [i386-darwin9.8.0]

# encoding: UTF-8
require 'erb'
template = ERB.new("This is IntéraΫiὉnäl Pöחyß")
puts template.src.encoding       # US-ASCII                     # This is not expected…
puts template.result             # This is IntéraΫiὉnäl Pöחyß   # … but it just works.
puts template.result.encoding    # UTF-8                        # This just works, doesn't it?

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Vladimir Penkin - 2010-02-03 07:26:06 UTC

I'm still having issues with UTF. With these patches:

Having troubles when trying to POST russian characters to controller.

lighthouse-import commented 13 years ago

Imported from Lighthouse. Comment by Marcello Barnaba - 2010-03-21 12:15:46 UTC

Hello,

here is my monkey patch (hack? :-) to fix this issue on current Rails 2.3.5 apps on 1.9.1, that doesn't involve copy-pasting code from ActionView. It is also available as a Gist on GitHub.

# Rails 2.3.5, Ruby 1.9. ERB returns templates with an ASCII-8BIT encoding, unless they contain
# a Unicode character, and when you render a partial with Unicode chars into a layout without,
# the infamous "incompatible character encodings: ASCII-8BIT and UTF-8" error comes out.
#
# This module monkey-patches module_eval into the ActionView::Base::CompiledTemplates module to
# convert the first argument encoding to UTF-8, if needed.
#
# Put it into lib/patches/compiled_templates.rb and require it into the config.after_initialize
# block of your environment.rb.
#
# LH ticket x-reference: https://rails.lighthouseapp.com/projects/8994/tickets/2188
#
# - vjt@openssl.it
#
module Patches
  module CompiledTemplates
    def self.extended(base)
      base.metaclass.alias_method_chain(:module_eval, :utf8)
    end

    def module_eval_with_utf8(*args, &block)
      if args.first.respond_to?(:encoding) && args.first.encoding != Encoding::UTF_8
        args.first.force_encoding(Encoding::UTF_8)
      end
      module_eval_without_utf8(*args, &block)
    end
  end

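  # Extend CompiledTemplates only on Ruby 1.9+, and only once: the method lookup
  # raises NameError while the patch is missing, which triggers the extend below.
  begin
    RUBY_VERSION.to_f >= 1.9 &&
      ActionView::Base::CompiledTemplates.method(:module_eval_with_utf8)
  rescue NameError
    ActionView::Base::CompiledTemplates.extend Patches::CompiledTemplates
  end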
end

Tested on 1.9.1-p378 and a big Rails app with unicode characters in templates :-).
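
For anyone who wants to see the idea in isolation, here is a sketch of the same wrapper in plain Ruby with no Rails loaded. It swaps alias_method_chain for `extend` plus `super`, and both `Utf8ModuleEval` and `CompiledTemplatesDemo` are stand-ins invented for this example, not names from ActionView.

```ruby
# Stand-in for Patches::CompiledTemplates: wraps module_eval and retags a
# string source as UTF-8 before handing it to the original implementation.
module Utf8ModuleEval
  def module_eval(*args, &block)
    source = args.first
    if source.is_a?(String) && source.encoding != Encoding::UTF_8
      args[0] = source.dup.force_encoding(Encoding::UTF_8)
    end
    super(*args, &block)
  end
end

# Stand-in for ActionView::Base::CompiledTemplates.
CompiledTemplatesDemo = Module.new
CompiledTemplatesDemo.extend(Utf8ModuleEval)

# Compiled template source containing "å", tagged ASCII-8BIT.
source = "def self.render; 'bl\u00E5h'; end".dup.force_encoding(Encoding::ASCII_8BIT)
CompiledTemplatesDemo.module_eval(source)
puts CompiledTemplatesDemo.render.encoding
```

Unlike the patch above, this version retags a copy of the source rather than mutating the caller's string; whether that difference matters inside ActionView is not something this sketch can show.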