davidfstr / rdiscount

Discount (For Ruby) Implementation of John Gruber's Markdown
http://dafoster.net/projects/rdiscount/

UTF-8 characters break in Ruby 1.9 #11

Closed SFEley closed 14 years ago

SFEley commented 14 years ago

I'm building a Rails app using Ruby 1.9.1, and using RDiscount to manage a lot of my output. Everything's fast and beautiful, except for an exception that occurs when user-submitted data contains Unicode characters:

Encoding::CompatibilityError: incompatible character encodings: UTF-8 and ASCII-8BIT

FWIW, the specific example that broke it was a string containing the word Yogācāra (a form of Buddhism). It took a while to eliminate other factors and trace this error back to RDiscount. It seems that even if the input goes in as UTF-8, the output comes back tagged as ASCII-8BIT (binary), so the non-ASCII characters show up as \x escapes.

I was able to resolve the issue in my application by forcing the encoding in my formatting helper:

# Renders the string passed as Markdown text, with proper UTF-8 management
def markdown(val)
  RDiscount.new(val, :smart).to_html.force_encoding('UTF-8')
end
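
For anyone who wants to see the failure without installing the gem, here is a minimal sketch. `fake_render` is a hypothetical stand-in, not RDiscount's API; it assumes the extension returns the correct bytes but tags them as ASCII-8BIT, which is what the error message suggests. The `café` prefix is just an arbitrary piece of non-ASCII UTF-8 text.

```ruby
# Hypothetical stand-in for a C extension that returns the right bytes
# but tags them as binary (an assumption about the behavior described above).
def fake_render(text)
  "<p>#{text}</p>".force_encoding(Encoding::ASCII_8BIT)
end

input = "Yog\u0101c\u0101ra"   # UTF-8 string containing non-ASCII characters
html  = fake_render(input)

# Joining the binary-tagged output with other non-ASCII UTF-8 text fails.
raised = begin
  "caf\u00e9 " + html
  false
rescue Encoding::CompatibilityError
  true
end

# Re-tagging the bytes (no transcoding happens) makes the string usable again.
fixed = html.dup.force_encoding(input.encoding)
page  = "caf\u00e9 " + fixed
```

Note that `force_encoding` only changes the label on the string; it does not convert any bytes. That is safe here only because the bytes are assumed to be valid UTF-8 already.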

So it's not a showstopper. But as more people make the slow migration to Ruby 1.9, this may begin to become a higher-visibility issue. Thanks for your time.

rtomayko commented 14 years ago

Fixed: output is now coerced to the same encoding as the input under Ruby 1.9. Closed by de10250a5bc718724db1afc885b4b2e27ecbb52b
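
The idea of matching the output encoding to the input can be sketched in plain Ruby as follows. This is only an illustration, not the code from the commit: `coerce_encoding` is a hypothetical helper, and the `respond_to?` guard is an assumption about how one might stay compatible with Ruby 1.8, where strings carry no encoding.

```ruby
# Hypothetical helper, not RDiscount's actual implementation: re-tag
# `output` with the encoding of `input` when the runtime supports encodings.
def coerce_encoding(output, input)
  # Ruby 1.8 strings have no encoding, so return the output unchanged there.
  return output unless output.respond_to?(:force_encoding)
  output.dup.force_encoding(input.encoding)
end

source   = "Yog\u0101c\u0101ra"
# Simulated renderer output: correct bytes, binary tag.
rendered = "<p>#{source}</p>".force_encoding(Encoding::ASCII_8BIT)
result   = coerce_encoding(rendered, source)
```

With this approach callers would no longer need a `force_encoding` call in their own helpers, since the result carries whatever encoding the input had.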